1. Field of the Invention
The present invention relates to an Internet search system. In particular, the invention relates to a search system and method for facilitating an Internet search for web sites having desired information, by providing the image of a web document to a user before the user actually accesses the web document from a corresponding server.
2. Description of the Related Art
Web documents (also known as web pages) are electronic files that contain many forms of information, including text, graphics, video, audio, and links to other Web documents. Presently, Hypertext Mark-up Language (HTML) is the standard format for documents on the World Wide Web. An HTML formatted document has HTML codes, i.e., commands, embedded in the document. A software application known as a Web browser is used to access documents on the Web. By understanding HTML, Web browser software can properly display a Web document on a user's output display device. More importantly, Web browser software interprets the HTML commands in an HTML formatted Web document to navigate a link in that Web document to another Web document. The Web browser thereby provides automatic access to other Web documents on the Web.
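As an illustration of how software can read the HTML commands that define links, the following minimal sketch uses Python's standard `html.parser` module to collect the link targets in a document. The class name `LinkExtractor`, the helper `extract_links`, and the example URLs are hypothetical and chosen only for this illustration; an actual Web browser does considerably more.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the targets of <a href="..."> commands in an HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative links
        self.links = []

    def handle_starttag(self, tag, attrs):
        # An anchor tag with an href attribute is a link to another document.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url):
    """Return the absolute URLs of all links found in html_text."""
    parser = LinkExtractor(base_url)
    parser.feed(html_text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = '<p><a href="/news.html">News</a> <a href="http://other.example/">Other</a></p>'
    for url in extract_links(page, "http://www.example.com/index.html"):
        print(url)
```

Following any one of the returned URLs corresponds to navigating a link from one Web document to another.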
A web site is a collection of web documents on a particular topic that is stored on a server and identified by a URL. The first document transmitted to users who access the web site is called a home page. From the home page, users can get to all the linked documents on the web site by “clicking” links in the documents. Users can locate web sites of interest by using a search engine offered by Internet companies at their web sites.
Typically, a search system run by those “portal” Internet sites comprises a search engine and a classified directory table. Well-known examples are Yahoo!, Lycos, Infoseek, etc. A typical search engine includes a robot agent, an index program and a search program. The robot agent, also known as a software-implemented web crawler, automatically visits web sites and traces the hypertext links therein, seriatim, and extracts, abstracts and indexes each document encountered, through so-called key words, into a large database for subsequent access. The index program extracts and registers indexes for the collected web documents. The search program provides a list of web sites that it determines relate to a search query sent from the user, based on predetermined criteria. The directory table classifies the collected web sites by subject in many depth levels. One can narrow down from a broad subject/category to its successive sub-categories. The indexes for the classified directory are predetermined and registered in the search system.
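The three components of a conventional search engine described above can be sketched as follows. This is a simplified illustration only: the in-memory `PAGES` table, its example URLs, and the function names `crawl`, `build_index` and `search` are assumptions made for this sketch, and the "predetermined criteria" are reduced here to requiring that every query word appear in the document. The classified directory table is not modeled.

```python
import re
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (title, text, outgoing links).
PAGES = {
    "http://a.example/": ("Garden tips", "Growing roses and tulips",
                          ["http://b.example/"]),
    "http://b.example/": ("Rose care", "Pruning roses in spring",
                          ["http://a.example/", "http://c.example/"]),
    "http://c.example/": ("Car repair", "Fixing engines and brakes", []),
}


def tokenize(text):
    """Split text into lowercase key words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crawl(start_url, pages):
    """Robot agent: visit documents one after another by tracing links."""
    seen, order, queue = {start_url}, [], deque([start_url])
    while queue:
        url = queue.popleft()
        if url not in pages:  # unreachable document; skip it
            continue
        order.append(url)
        for link in pages[url][2]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


def build_index(urls, pages):
    """Index program: map each key word to the documents containing it."""
    index = defaultdict(set)
    for url in urls:
        title, text, _links = pages[url]
        for word in tokenize(title + " " + text):
            index[word].add(url)
    return index


def search(index, query):
    """Search program: list documents that contain every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


if __name__ == "__main__":
    index = build_index(crawl("http://a.example/", PAGES), PAGES)
    print(search(index, "roses"))
```

Note that the result of `search` is only a list of addresses; it carries nothing about how each document actually looks.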
However, conventional search systems record only limited information on the web sites they visit, e.g., web addresses (i.e., URLs through which the corresponding document can be accessed by a web browser), content words, titles and short summaries of the contents, and possibly the description of the document as provided in its HTML description field. The brief description of a web site is commonly written by the operator of the web site as an introduction to the web site. In most cases, however, it is very difficult for a user to know whether a web site includes the information he or she wants simply by reading its brief description.
Hence, in the conventional search system, the user has to access a searched web site to see the contents of the web documents of the site. If the contents do not have the sought-for information, the user would return to the search site to view the search result and select another searched web site. This search process is repeated until the user lands on a useful web site or decides to quit. As a result, the user is likely to spend much time and effort in navigating a number of web sites before he or she locates a desired web site. And, even when the user reaches the proper web site, the user would not recognize it as such if the web site were currently out of service.
Therefore, what is needed is a way to find out what the actual contents of web documents would be before visiting each of the web sites in a list provided by a search engine.