The present invention relates generally to information retrieval in a data processing system. More particularly, it relates to an improved interface and method for searching a document database such as the Internet.
It is well known to connect a plurality of computer systems into a network of computer systems. In this way, the collective resources available within the network may be shared among users, thus allowing each connected user to enjoy resources which would not be economically feasible to provide to each user individually. With the growth of the Internet, sharing of computer resources has been brought to a much wider audience; the Internet has become a cultural medium in today's society for both information and entertainment. Government agencies employ Internet sites for a variety of informational purposes. For many companies, Internet sites are an integral part of their business; the sites are frequently mentioned in the companies' television, radio and print advertising.
The World Wide Web, or simply "the Web", is the Internet's multimedia information retrieval system. It is the most commonly used method of transferring data in the Internet environment. Other methods exist, such as the File Transfer Protocol (FTP) and Gopher, but they have not achieved the popularity of the Web. Client machines conduct transactions with Web servers using the Hypertext Transfer Protocol (HTTP), a known application protocol that gives users access to files, e.g., text, graphics, images, sound, and video, using a standard page description language known as the Hypertext Markup Language (HTML). HTML provides basic document formatting and allows the developer to specify "links" to other servers and files. In the Internet paradigm, a network path to a server is identified by a Uniform Resource Locator (URL) having a special syntax for defining a network connection.
Retrieval of information is generally achieved by the use of an HTML-compatible "browser", e.g., Netscape Navigator, at a client machine. When the user of the browser specifies a link via a URL, the client issues a request to a naming service to map a hostname in the URL to a particular network IP address at which the server is located. The naming service returns a list of one or more IP addresses that can respond to the request. Using one of the IP addresses, the browser establishes a connection to a server. If the server is available, it returns a document or other object formatted according to HTML.
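The lookup sequence just described can be sketched roughly as follows. This is an illustration only, written in Python, and is not part of any system described here. The function name `resolve_url` and its injectable `resolver` parameter are assumptions introduced for this sketch, so that the naming-service step can be exercised without a live network.

```python
import socket
from urllib.parse import urlsplit


def resolve_url(url, resolver=socket.getaddrinfo):
    """Map the hostname in a URL to candidate (ip, port) pairs.

    `resolver` defaults to the system naming service; it is injectable
    (an assumption of this sketch) so the step can be tested offline.
    """
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError(f"no hostname in URL: {url!r}")
    # Fall back to the conventional port for the scheme when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    # The naming service may return several addresses for one hostname.
    infos = resolver(parts.hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        candidate = (sockaddr[0], sockaddr[1])
        if candidate not in addresses:  # drop duplicates, keep order
            addresses.append(candidate)
    return addresses
```

A client would then attempt a connection to one of the returned addresses, for example with `socket.create_connection`, and send its HTTP request over that connection.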
One of the most common, and frequently one of the most disappointing, activities performed on the Internet is searching among the plethora of information available at the various Web servers for the particular information in which the user is interested. A variety of search engines are available, including Alta Vista, Lycos, and HotBot, as well as the various search engines attached to the individual Web servers themselves.
One of the difficulties is that forming a search argument is a hard task for many users. Boolean arguments, e.g., OR, AND, NOT, AND NOT, are difficult to master. The quality of the search is thus dependent on the skill and vocabulary of the user. Even with such mastery, in many cases it is difficult to anticipate the exact words which will be used by the writers of a particular document. The addition of a thesaurus would be helpful, but in many cases the vocabulary is technical or specialized and simply would not be found in a general purpose thesaurus. Much of the result of an Internet search could most charitably be called "useless". Other terms, some of which are expletives, are known to the art.
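The vocabulary problem can be made concrete with a small sketch. The code below is a toy Boolean matcher, not the algorithm of any search engine named above. The function names, the three sample documents, and the convention that query terms are given in lower case are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lower-case a text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def boolean_search(docs, all_of=(), any_of=(), none_of=()):
    """Return ids of documents satisfying a simple AND / OR / NOT query.

    Query terms are expected in lower case (an assumption of this sketch).
    """
    hits = []
    for doc_id, text in docs.items():
        words = tokenize(text)
        if not all(term in words for term in all_of):      # AND
            continue
        if any_of and not any(term in words for term in any_of):  # OR
            continue
        if any(term in words for term in none_of):         # NOT
            continue
        hits.append(doc_id)
    return hits


# Hypothetical documents: "a" and "b" cover the same topic in different words.
docs = {
    "a": "Repairing a flat tire on a car",
    "b": "How to fix a punctured automobile tyre",
    "c": "Car rental prices in Austin",
}

# The user anticipates only one author's vocabulary, so "b" is missed.
print(boolean_search(docs, all_of=["car", "tire"]))
```

Here the query `car AND tire` finds document "a" but not "b", even though both address the same subject, because the second author wrote "automobile" and "tyre". A user who guesses both spellings can recover "b" with `any_of=["tire", "tyre"]`, but only by anticipating the other author's wording in advance.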
Yet among the useless information, there are generally some pearls. A user, upon reading such a document, recognizes its worth and wishes, typically futilely, that he could find more documents like it. He could manually examine the document and attempt to formulate a new search using new words found in the desired document. The new search might be better than the first, or just as dismal.
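The manual reformulation step described above amounts to picking out words from the useful document that the original query did not contain. A minimal sketch of that chore follows. It is not the method of the present invention; the function name `suggest_terms`, the small stop-word list, and the frequency-based ranking are assumptions chosen purely for illustration.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list (an assumption of this sketch).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "for", "is", "how"}


def suggest_terms(document, original_query, limit=3):
    """Pick frequent words from a useful document that the query lacked."""
    used = set(re.findall(r"[a-z0-9]+", original_query.lower()))
    words = re.findall(r"[a-z0-9]+", document.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and w not in used)
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _count in ranked[:limit]]
```

Even with such help, the user must still judge which candidate words are worth searching on and retype them into a new query, which is the manual effort the paragraph above describes.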
It would be preferable to provide a user with a convenient means to quickly refine an Internet search with a minimum of manual search formulation and keyboard input. The present invention provides one solution to this problem.