The present invention relates to a searchable database and to systems for generating and utilizing same.
With the advent of the World Wide Web, individuals have at their disposal vast amounts of information on a variety of topics. As such, at present, the World Wide Web represents the largest single searchable database.
However, since such data is dispersed among a staggering number of Web sites, searching for such information can be a daunting task. To facilitate Web searching, a number of search tools, termed search engines, have been created, e.g. Google (www.google.com), Lycos (www.lycos.com), Alta Vista (www.altavista.com), etc.
The use of such search engines enables a user to receive information relating to Web accessible files of interest such as Web pages in accordance with a search query.
Most Internet search engines search for Web files, such as Web pages, video files (e.g., QuickTime™ movies), or music files (e.g., MP3). The result returned by the search engine (the result list) is a list of hyperlinks that link to the Web files (e.g., Web pages) most relevant to the user's query or queries.
Search engine queries are typically effected via keywords, optionally separated by Boolean operators (and, or, not); via topics, in which the search is confined to a specific topic; or via an index, which provides access to a specific topic.
For example, in keyword searches a user querying for “sports and (football or basketball) but not soccer” would typically receive a list of links to Web pages which contain the word “sports” and also contain either the word “football” or the word “basketball” (or both) but which do not contain the word “soccer”.
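The Boolean keyword query described above can be sketched as follows. This is a minimal illustration only, assuming whole-word, case-insensitive matching; the page texts, the `a.example` addresses, and the function names are hypothetical and are not taken from any particular search engine.

```python
import re

def words(text):
    """Return the set of lowercase words found in a page's text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(text):
    """Evaluate: sports AND (football OR basketball) AND NOT soccer."""
    w = words(text)
    return (
        "sports" in w
        and ("football" in w or "basketball" in w)
        and "soccer" not in w
    )

# Hypothetical pages, keyed by address.
pages = {
    "a.example/1": "Sports news: football scores",
    "a.example/2": "Sports roundup: soccer and basketball",
    "a.example/3": "Basketball sports camp",
    "a.example/4": "Cooking recipes",
}

# Keep only the pages that satisfy the query.
hits = [url for url, text in pages.items() if matches(text)]
```

Here `hits` holds the first and third addresses; the second page is excluded because it contains the word "soccer", and the fourth because it lacks the word "sports".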
To enable searching, search engines build up databases, which index information on Web files. Such databases are generated by “Web spiders” (also known as “Web robots”, “Web crawlers”, “Web agents”, etc.) which constantly scan the World Wide Web in a random, semi-random, or rule-based manner.
Web spiders are computer programs that autonomously connect to World Wide Web addresses and categorize the information contained therein according to keywords, keyword frequency, font sizes, word placement inside documents, titles, images found, date of last modification, and/or any additional criteria. The categorized information generated is then stored by the search engine database.
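By way of illustration, the categorization step can be sketched as the construction of an inverted index keyed by keyword. The sketch below assumes the page text has already been retrieved and uses keyword frequency as the only criterion; the other criteria listed above (font sizes, word placement, titles, and so on) are omitted, and all names and addresses are hypothetical.

```python
import re
from collections import Counter, defaultdict

def build_index(pages):
    """Map each keyword to {address: frequency} for an address -> text mapping."""
    index = defaultdict(dict)
    for url, text in pages.items():
        # Count how often each word occurs in this page.
        counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
        for word, freq in counts.items():
            index[word][url] = freq
    return index

# Hypothetical, already-retrieved pages.
pages = {
    "a.example": "Football scores and more football news",
    "b.example": "Basketball and football schedules",
}
index = build_index(pages)
```

In this sketch, `index["football"]` records two occurrences for the first address and one for the second, which is the kind of categorized information a search engine database would store.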
Some search engines, referred to as “meta search engines”, collect and display search results provided by one or more other search engines (possibly after sorting the results and removing duplicates). Examples include MetaCrawler (www.metacrawler.com) and the like.
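The merging performed by a meta search engine can be sketched as follows. The ordering rule used here (best rank seen in any source list, ties broken alphabetically) is an assumption chosen for illustration, not the method of any named engine, and the result lists are hypothetical.

```python
def merge_results(*result_lists):
    """Merge ranked address lists, dropping duplicates.

    Each address keeps the best (lowest) rank it received from any list;
    ties are broken alphabetically so the output is deterministic.
    """
    best_rank = {}
    for results in result_lists:
        for rank, url in enumerate(results):
            if url not in best_rank or rank < best_rank[url]:
                best_rank[url] = rank
    return sorted(best_rank, key=lambda url: (best_rank[url], url))

# Hypothetical result lists from two underlying engines.
engine_a = ["x.example", "y.example", "z.example"]
engine_b = ["y.example", "w.example", "x.example"]
merged = merge_results(engine_a, engine_b)
```

Each address appears once in `merged`, even though two of them were returned by both engines.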
Thus, following query entry, a search engine uses the categorized information stored in its database to locate relevant Web files, such as Web pages. Links to the relevant Web pages are then presented to the user as a list (the result list), each entry of which includes a link to the Web page and typically also a short summary describing the Web file; the result list is typically sorted based on match accuracy.
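A result list of this kind can be sketched as follows. The measure of match accuracy used here (the fraction of query terms found in the page) and the summary (the first characters of the page text) are simplifying assumptions for illustration; the names and addresses are hypothetical.

```python
import re

def score(query_terms, text):
    """Fraction of query terms that occur in the text (a toy match accuracy)."""
    w = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(term in w for term in query_terms) / len(query_terms)

def result_list(query, pages, summary_len=40):
    """Return entries with a link, a short summary, and a score, best match first."""
    terms = query.lower().split()
    if not terms:
        return []
    scored = [(score(terms, text), url, text[:summary_len])
              for url, text in pages.items()]
    # Drop pages that match no term, then sort by descending score.
    hits = [h for h in scored if h[0] > 0]
    hits.sort(key=lambda h: (-h[0], h[1]))
    return [{"link": url, "summary": summary, "score": s}
            for s, url, summary in hits]

# Hypothetical pages.
pages = {
    "a.example": "Football scores and football news",
    "b.example": "Football and basketball schedules",
    "c.example": "Gardening tips",
}
results = result_list("football basketball", pages)
```

The page matching both query terms is listed first, the page matching one term second, and the page matching neither is omitted.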
Although such search engines facilitate World Wide Web searching, querying for specific information is oftentimes a trying experience, even when using the most sophisticated search tools available.
Because of the vast amount of information and of the dispersed nature thereof, search results are oftentimes either not specific enough or not accurate.
For example, keyword searches may yield irrelevant or no results if the defined keyword is too specific, or they can yield numerous results if the keywords used are too generic.
In either case, a user must either revise the search or download numerous Web files in order to sort through them and uncover the information sought.
In the latter case, such downloading and sorting can be a frustrating and time-consuming endeavor, especially in cases where the information sought is not uncovered.
Oftentimes, even in searches which seemingly provide good results, download of multiple Web files is required since the information available in the summary of each result is not sufficient for determining the relevance of the Web file to the query made.
In addition, in the case of Web page searches, a user often accesses irrelevant or slightly relevant Web pages resulting from a search query in efforts to possibly uncover more relevant links within these pages, a practice which further prolongs a search and adds to the frustration of the user.
Another common problem encountered by users searching through the Web arises from the existence of several different hypertext links which point to the same Web page or site. Such duplicate links oftentimes contribute to redundancy in search results.
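One simple way to reduce such redundancy is to compare links after normalizing them, as sketched below. The normalization rules (lowercase scheme and host, no fragment, no trailing slash) are assumptions chosen for illustration; they detect only links that differ textually and cannot detect unrelated addresses that serve the same page. The example addresses are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

def normalize(url):
    """Reduce a link to a canonical form for comparison."""
    parts = urlsplit(url)
    # Treat "/news/" and "/news" as the same path; an empty path becomes "/".
    path = parts.path.rstrip("/") or "/"
    # Lowercase the scheme and host, and drop the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )

def dedupe(urls):
    """Keep the first link of each group that normalizes to the same form."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique

links = [
    "http://Example.com/news/",
    "http://example.com/news#top",
    "http://example.com/sports",
]
unique_links = dedupe(links)
```

The first two links normalize to the same form, so only the first of them is kept in `unique_links`.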
Yet another common problem encountered by users searching the Web arises from “broken” hyperlinks which appear in a search result list. Such hyperlinks, which cannot be used to link to the site they represent because that site is down or no longer available, increase the frustration experienced by users.
There is thus a widely recognized need for, and it would be highly advantageous to have, a system and method which would enable a user to rapidly assess the accuracy, relevancy, and content of results obtained from a search query and to easily access related Web files such as Web pages even when contained within a Web page directly uncovered by the search query.
Surfing the World Wide Web is often a tedious task, as connectivity to some addressed servers may prove time inefficient or unavailable, depending on user load, maximal bandwidth, presently available bandwidth, and other factors.
There is thus a widely recognized need for, and it would be highly advantageous to have, a system and method which will allow efficient Web surfing at all times.