An efficient and practical means of obtaining relevant information, while screening out unwanted or uninteresting information, has been an ongoing need, particularly since the inception of the internet. This need is especially acute at present owing to the exponential growth in the number of world-wide web sites and the sheer volume of information they contain. In an attempt to index the information available on the internet, a number of software search engines have been created, via which a user enters a search command comprising suitable keywords at the keyboard of a personal computer. The search command is transmitted to a server computer that hosts an associated search engine. The search engine receives the search command and uses its keywords to scan a database of web addresses and of the text stored on the corresponding web sites. The results of the scan are then transmitted from the server computer back to the user's computer and displayed on its screen.
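By way of illustration, the scanning step described above may be sketched as follows. This is a minimal sketch only, assuming the database can be modelled as a simple mapping from web addresses to stored page text; the names PAGE_INDEX and search_database are hypothetical and do not correspond to any actual search engine.

```python
# Minimal sketch of the keyword scan described above. The database is
# modelled as a simple mapping from web addresses to stored page text;
# PAGE_INDEX and search_database are illustrative names only.

PAGE_INDEX = {
    "http://example.com/cars": "A page about vintage cars and their engines.",
    "http://example.com/food": "Recipes, cooking tips and kitchen tools.",
}

def search_database(index, keywords):
    """Return the addresses of pages whose stored text contains any keyword."""
    wanted = [k.lower() for k in keywords]
    results = []
    for address, text in index.items():
        lowered = text.lower()
        if any(k in lowered for k in wanted):
            results.append(address)
    return results

print(search_database(PAGE_INDEX, ["cars"]))  # ['http://example.com/cars']
```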
In order for the search engine to become aware of new web sites and to keep its records of existing sites up to date, either the proprietors of the web sites notify the search engine themselves, or the information is obtained via a 'web crawler' that updates the database at the server computer. A web crawler is an automated program which explores and records the contents of a web site and its links to other sites, thereby spreading from site to site in an attempt to index all current sites.
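The crawling strategy just described amounts, in its simplest form, to a breadth-first traversal of the link graph. The following is a minimal sketch under that assumption; the web is represented by an in-memory stand-in (TOY_WEB, a hypothetical mapping of address to page text and outgoing links), whereas a real crawler would fetch pages over HTTP.

```python
from collections import deque

# Toy illustration of the crawling strategy described above: starting from a
# seed address, record each page's text and follow its links breadth-first.
# TOY_WEB maps each address to (page text, list of linked addresses).

TOY_WEB = {
    "site-a": ("welcome page", ["site-b", "site-c"]),
    "site-b": ("news page", ["site-a"]),
    "site-c": ("shop page", ["site-d"]),
    "site-d": ("contact page", []),
}

def crawl(web, seed):
    """Visit every page reachable from the seed, recording its text once."""
    index = {}
    frontier = deque([seed])
    while frontier:
        address = frontier.popleft()
        if address in index or address not in web:
            continue            # already recorded, or a dead link
        text, links = web[address]
        index[address] = text   # record the page's contents
        frontier.extend(links)  # queue the pages it links to
    return index

print(crawl(TOY_WEB, "site-a"))  # indexes all four reachable sites
```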
This database structure, and this method of searching it, pose some significant difficulties. The internet's growth rate has resulted in a substantial backlog in the scanning of new sites, a problem compounded by the fact that web sites are frequently deleted, re-addressed and updated, leaving the search engine with outdated and/or misleading information. Although web crawlers can be configured to prioritize possible key-words according to their location (title, embedded link, address, etc.), depending on the type of search engine used, substantial portions of the web site text (often the majority, or even all, of the site text) must still be scanned. This results in colossal storage requirements for the search engine. Furthermore, a typical key-word search may return an excessively large volume of material, the majority of which may be of little interest to the user, who typically makes a selection from the returned list on the basis of brief site descriptions and explores the chosen sites until the desired information is located.
These results are presented in the form of a list, ranked according to criteria specific to the search engine. These criteria may range from the number of occurrences of the key-words anywhere within the searched text, to methods that give added weight to key-words appearing in particular positions (as previously mentioned). When multiple key-words have been used, sites are also ranked according to the number of different key-words matched. A fundamental drawback of all these ranking systems is their mechanical objectivity: rankings are determined by the programmed criteria of the search engine, and by the emphasis placed on particular aspects of site design, rather than by any measure of actual users' opinions. Indeed, this can lead to the absurd situation in which, in an attempt to secure a favorable rating from the most commonly used search engines, some designers deliberately configure their sites around the aforementioned criteria, to the detriment of the presentation, readability and content of the site.
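A simplified sketch of such an objective ranking scheme is given below. The positional weights and field names (title, address, body) are assumptions chosen purely for illustration; actual search engines apply their own proprietary criteria.

```python
# Sketch of the objective ranking criteria described above: a page's score
# combines raw keyword occurrences, extra weight for keywords appearing in
# favoured positions (title, address), and the number of distinct keywords
# matched. The weights below are illustrative assumptions only.

POSITION_WEIGHTS = {"title": 5.0, "address": 3.0, "body": 1.0}

def rank_score(page, keywords):
    """page: dict with 'title', 'address' and 'body' text fields."""
    score = 0.0
    matched = set()
    for keyword in keywords:
        k = keyword.lower()
        for field, weight in POSITION_WEIGHTS.items():
            occurrences = page.get(field, "").lower().count(k)
            if occurrences:
                matched.add(k)
                score += weight * occurrences
    return score * (1 + len(matched))  # boost pages matching more keywords

page = {"title": "Vintage cars", "address": "cars.example.com",
        "body": "All about vintage cars and engines."}
print(rank_score(page, ["cars", "engines"]))  # 30.0
```

Because such a score depends only on occurrence counts and positions, a site designer can inflate it, for instance by repeating key-words in the title or address, which is precisely the manipulation noted above.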