The development of the EDVAC computer system of 1948 is often cited as the beginning of the computer era. Since that time, computer systems have evolved into extremely sophisticated devices, and computer systems may be found in many different settings. Computer systems typically include a combination of hardware, such as semiconductors and circuit boards, and software, also known as computer programs. As advances in semiconductor processing and computer architecture push the performance of the computer hardware higher, more sophisticated and complex computer software has evolved to take advantage of the higher performance of the hardware, resulting in computer systems today that are much more powerful than just a few years ago.
Years ago, computers were isolated devices that did not communicate with each other. Today, however, computers are often connected in networks, such as the Internet or World Wide Web, and a user at one computer, often called a client, may wish to access information at multiple other computers, often called servers, via a network. Searching is the primary mechanism used to retrieve information from the Internet. Users typically search the web pages of the Internet using a search engine, such as AltaVista, Yahoo, or Google. These search engines index hundreds of millions of web pages and respond to tens of millions of queries every day.
To accomplish this formidable task, search engines typically employ three major elements. The first is an agent, often called a spider, robot, or crawler. The crawler visits a web page, reads it, and then follows links to other pages within the site. The crawler typically returns to the site on a regular basis, such as every month or two, to look for changes. The crawler stores the information it finds in the second part of the search engine, which is the index. Sometimes new pages or changes that the crawler finds may take some time to be added to the index. Thus, a web page may have been “crawled” but not yet “indexed.” Until the web page has been added to the index, the web page is not available to those searching with the search engine.
The third part of the search engine is the program that interrogates the millions of pages recorded in the pre-created index to find matches to a search and ranks them in the order that the program believes is most relevant, which is often referred to as web site ranking. Web site ranking is extremely important to the user because a simple search using common terms may match thousands or even tens of thousands of pages, which would be virtually impossible for the user to sort through individually in an attempt to determine relevancy.
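The three elements described above (crawler, index, and query program) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the in-memory PAGES table stands in for a real web site, and the names crawl, build_index, and search are hypothetical rather than taken from any actual search engine. Ranking is deliberately left out; matches are returned in URL order.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "site": URL -> (page text, outgoing links).
PAGES = {
    "/home": ("welcome to the garden shop", ["/tools", "/seeds"]),
    "/tools": ("garden tools and spades", ["/home"]),
    "/seeds": ("flower seeds for the garden", ["/tools"]),
}


def crawl(start):
    """Visit a page, read it, and follow links to other pages in the site."""
    seen, queue, fetched = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        fetched[url] = text
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


def build_index(fetched):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in fetched.items():
        for word in text.split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every query word (unranked)."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


crawled = crawl("/home")
index = build_index(crawled)
print(search(index, "garden tools"))
```

Note that a page returned by crawl is not searchable until build_index has processed it, which mirrors the "crawled" versus "indexed" distinction drawn above.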
To aid the user, search engines typically determine relevancy by following a set of rules, which is commonly known as the web site ranking algorithm. Exactly how a particular search engine's algorithm works is usually a closely guarded trade secret, but all major search engines follow the same generally accepted methods described below. One of the main methods in a web site ranking algorithm involves the location and frequency of keywords (search terms) on a web page, which is known as the location/frequency method. For example, web site ranking algorithms often assume that terms appearing in a title control-tag are more relevant than terms appearing at other locations in the page. Further, many web site ranking algorithms also check whether the search keywords appear near the top of a web page, such as in the headline or in the first few paragraphs of text, on the assumption that a page relevant to the topic will mention those words at the beginning. Frequency of keywords is the other major factor that web site ranking algorithms use to determine relevancy. The web site ranking algorithm analyzes how often keywords appear in relation to other words in a web page and deems pages with a higher keyword frequency more relevant.
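As a rough illustration of the location/frequency method, the following Python sketch scores a page by whether each keyword appears in the title, whether it appears near the top of the body, and how often it appears relative to the other words. The function name and the weights (title_weight, top_weight, top_words) are assumptions chosen for the example; actual ranking algorithms are trade secrets and are not described here.

```python
import re


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def location_frequency_score(title, body, keywords,
                             title_weight=3.0, top_weight=2.0, top_words=50):
    """Score one page for the given keywords (all weights are illustrative)."""
    body_tokens = tokenize(body)
    title_tokens = set(tokenize(title))
    top_tokens = set(body_tokens[:top_words])  # words "near the top" of the page
    score = 0.0
    for keyword in (k.lower() for k in keywords):
        if keyword in title_tokens:
            score += title_weight  # location: keyword appears in the title
        if keyword in top_tokens:
            score += top_weight    # location: keyword appears early in the body
        if body_tokens:
            # frequency: occurrences relative to all words on the page
            score += body_tokens.count(keyword) / len(body_tokens)
    return score


focused = location_frequency_score(
    "Garden tools", "garden tools for every garden", ["garden"])
passing = location_frequency_score(
    "Shop", "we also sell one garden gnome", ["garden"])
print(focused > passing)
```

A page that mentions the keyword in its title and repeats it in the body scores higher than a page that mentions it once in passing, which is the behavior the paragraph above describes.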
In addition to the location/frequency method, which is an on-the-page criterion, search engines also typically make use of off-the-page criteria. Off-the-page criteria are those that use data external to the page itself. Chief among these is link analysis. By analyzing how pages link to each other, the web site ranking algorithm attempts to determine both the subject of a page and the relative importance of the page with respect to other pages.
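The text above does not say which form of link analysis any particular search engine uses. As one plausible illustration of estimating relative importance from links, the sketch below uses a simplified PageRank-style iteration: each page repeatedly passes a share of its score to the pages it links to. The function name, the damping value, and the iteration count are assumptions made for the example, and the sketch does not attempt to determine the subject of a page.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Estimate relative page importance from a link graph.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to about 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of links.
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for p in pages:
                    new[p] += share
        scores = new
    return scores


# Hypothetical three-page graph: "a" and "b" link to "c"; "c" links to "a".
example = link_scores({"a": ["c"], "b": ["c"], "c": ["a"]})
print(sorted(example, key=example.get, reverse=True))
```

In this small graph, the page with the most incoming links ends up with the highest score, and the page that nothing links to ends up with the lowest.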
As described above, the web site ranking algorithm is a very sophisticated technique. Further, the web site ranking algorithm is largely hidden from the user who is requesting the search, and that user often has little or no control over the criteria used in the web site ranking algorithm. To the extent that the user has control over some of the criteria, adjusting a limited set of criteria via a text box is unintuitive, slow, and cumbersome, and is likely to produce unexpected results.
Thus, without a better interface for controlling search criteria, users will continue to experience difficulty in searching.