The IT Law Wiki

Revision as of 09:14, 23 December 2009

Overview

A traditional search engine is a software application that examines as many pages as possible on websites, compiling a list of the location of each word on each page. The search engine then creates a full-text index of the Internet.

How it works

A search engine starts with a list of one or more websites. The engine then requests the home page from each site on its list. When a retrieved home page contains links to other pages, the search engine requests a copy of each page those links point to. If those pages in turn contain links to yet more pages, the search software requests copies of those pages as well. And so on, day after day, ceaselessly.
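The link-following process described above can be sketched as a breadth-first traversal. This is only an illustrative sketch, not how any particular search engine is implemented: the function names `crawl` and `toy_fetch_links`, the `TOY_WEB` dictionary, and the example addresses are all hypothetical, and an in-memory table stands in for real page requests.

```python
from collections import deque

def crawl(seed_urls, fetch_links):
    """Visit each seed page, then every page reachable by following links.

    fetch_links is assumed to return the list of links found on a page.
    """
    queue = deque(dict.fromkeys(seed_urls))  # seeds, de-duplicated, in order
    seen = set(queue)
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:  # request each page only once
                seen.add(link)
                queue.append(link)
    return visited

# Hypothetical toy "Web": each page maps to the links it contains.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["b.example/news"],
    "b.example/news": [],
}

def toy_fetch_links(url):
    """Stand-in for requesting a page and extracting its links."""
    return TOY_WEB.get(url, [])

visited = crawl(["a.example/"], toy_fetch_links)
print(visited)
```

The `seen` set keeps pages that link back to one another from being requested repeatedly. A crawler that runs continuously would, under the same assumptions, simply restart from its seed list to pick up changed pages.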

At its most basic level, a search engine maintains a list, for every word, of all known Web pages containing that word. The collection of lists is known as a "keyword index." Search engines vary according to the size of the index, the frequency of updating the index, the search options, the speed of returning a result, the relevancy of the results, and the overall ease of use. No two search engines work the same way.
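A keyword index of this kind can be illustrated with a small sketch that maps each word to the set of pages containing it. The names `build_index`, `search`, and `PAGES`, the sample page text, and the rule that a multi-word query matches only pages containing every word are assumptions made for illustration. Real engines differ in how they split text into words and how they rank results.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase words (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map every word to the set of page addresses containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())  # keep pages matching all words
    return results

# Hypothetical pages, as a crawler might have retrieved them.
PAGES = {
    "a.example/": "Search engines index the Web",
    "b.example/news": "Web search news",
}

index = build_index(PAGES)
print(search(index, "web search"))
print(search(index, "news"))
```

Looking up a word is a single dictionary access regardless of how many pages were indexed, which is why the index is built ahead of time rather than scanning pages at query time.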

In practice, most search engines do not exhaustively cover all possible websites. In addition, some search engines pass along material for review by human editors, who rate the pages retrieved on a variety of scales — quality, appropriateness for families, and so on. The creation of such an annotated index obviously takes longer than it does to create a comparable unannotated index. Search engines are the primary means by which Internet users can find digital information.

A recent area of development is search engines that are specifically designed to build profiles of individuals based on personal data found on the Internet.

A search engine will find all web pages on the Internet with a particular word or phrase. Given the current state of search engine technology, that search will often produce a list of hundreds of web sites through which the user must sort in order to find what he or she is looking for. As a result, companies strongly prefer that their domain name be comprised of the company or brand trademark and the suffix .com.[1]

References

  1. Sporty's Farm L.L.C. v. Sportsman's Market, Inc., 202 F.3d 489, 493, 53 U.S.P.Q.2d (BNA) 1570 (2d Cir. 2000).