In the prior art, it has been well known that computer systems can be used to index databases, and to search the index to locate records qualified by queries. In recent years, a unique distributed database has emerged in the form of the World-Wide-Web (Web). The database records of the Web are in the form of pages accessible via the Internet. Here, tens of millions of pages are accessible by anyone having a communications link to the Internet.
The pages are dispersed over millions of different computer systems all over the world. Users of the Internet constantly desire to locate specific pages containing information of interest. The pages can be expressed in any number of different languages and character sets, such as English, French, German, Spanish, Cyrillic, Katakana, and Mandarin. In addition, the pages can include specialized components, such as embedded "forms," executable programs, Java applets, and hypertext.
Moreover, the pages can be constructed using various formatting conventions, for example, ASCII text, PostScript files, HTML files, and Acrobat files. The pages can include links to multimedia information content other than text, such as audio, graphics, and moving pictures.
Search engines have been provided to allow users to locate Web pages of interest. These search engines typically have a query interface where the users specify terms and operators which they want to use to qualify pages.
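By way of illustration only, the qualification of pages by terms and operators can be sketched as a lookup in an inverted index. The following is a minimal sketch and does not describe any particular search engine; the names build_index and search, the whitespace tokenization, and the restriction to the AND and OR operators are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase term to the set of page ids that contain it.

    `pages` is a hypothetical dict of page id -> page text.
    """
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():  # naive whitespace tokenization
            index[term].add(page_id)
    return index


def search(index, terms, operator="AND"):
    """Return the sorted ids of pages qualified by `terms` under AND or OR."""
    # Look up the posting set for each term; unknown terms match no pages.
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return []
    if operator == "AND":
        result = set.intersection(*postings)  # pages containing every term
    elif operator == "OR":
        result = set.union(*postings)  # pages containing any term
    else:
        raise ValueError(f"unsupported operator: {operator}")
    return sorted(result)


# Hypothetical three-page collection used only for demonstration.
pages = {
    1: "web search engine",
    2: "web page index",
    3: "audio and graphics",
}
index = build_index(pages)
print(search(index, ["web", "index"]))            # AND query
print(search(index, ["search", "index"], "OR"))   # OR query
```

Even in this small sketch, a loosely specified query such as a single common term qualifies every page containing that term, which suggests how result sets can grow large as the collection grows.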
There are a number of problems with presenting pages located by searching an index to the Web. First, the number of pages accessible through the Web is very large, so the number of qualifying pages can also be large. In addition, many Web users are unsophisticated, so there is a high likelihood that queries will be loosely specified, thereby yielding many pages which may not be of interest to the users. The qualifying pages for a single query may number in the tens of thousands.
It is desired to present search results in a usable manner so that users are not burdened with perusing all qualifying records.