1. Incorporation by Reference
This patent application discloses an invention which may optionally form a portion of a larger system. Other portions of the larger system are disclosed and described in the following co-pending patent applications, all of which are subject to an obligation of assignment to the same person. The disclosures of these applications are herein incorporated by reference in their entireties.
METHOD AND SYSTEM FOR AUTOMATIC HARVESTING AND QUALIFICATION OF DYNAMIC DATABASE CONTENT, William J. Bushee, Thomas W. Tiahrt, and Michael K. Bergman, Filed Jul. 24, 2001, application Ser. No. 09/911,522, pending.
METHOD FOR AUTOMATIC SELECTION OF DATABASES FOR SEARCHING, William J. Bushee, Filed Jul. 24, 2001, application Ser. No. 09/911,452, now U.S. Pat. No. 6,711,569, issued Mar. 23, 2004.
AUTOMATIC SYSTEM FOR CONFIGURING TO DYNAMIC DATABASE SEARCH FORMS, William J. Bushee, Filed Jul. 24, 2001, application Ser. No. 09/911,435, pending.
2. Field of the Invention
The present invention relates to search engines and, more particularly, pertains to a new system and method for efficient control and capture of dynamic database content for rapidly providing a user with a highly relevant collection of documents related to a query.
3. Description of the Prior Art
The Internet is a worldwide system of computer networks in which users at any one computer may, with appropriate authorization, get information located on virtually any other computer. The Internet uses a set of protocols called Transmission Control Protocol/Internet Protocol, or TCP/IP. The World Wide Web (often abbreviated as WWW) is a portion of the Internet that uses hypertext as a method for rapid cross-referencing, linking one document or site to another.
A database is a collection of data organized in a manner that allows its contents to be easily accessed, managed, and updated. Given this definition, an Internet site can be viewed as a database whose data takes the form of pages or accessible documents. Similarly, any network for accessing documents can be considered a database, including intranets and extranets. These network databases can be either static or dynamic. A static network database provides the same set of documents or pages to every user. A dynamic network database presents unique documents or pages to different users, typically as a response to the users' queries.
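The distinction between static and dynamic network databases drawn above can be illustrated with a minimal sketch. The record contents and the function names `static_database` and `dynamic_database` are hypothetical and are introduced here only for illustration; they are not taken from the disclosed system.

```python
# Hypothetical records held by a network database (illustrative data only).
RECORDS = [
    "Annual rainfall totals for Sioux Falls",
    "Patent filings by year",
    "Rainfall and crop yield survey",
]


def static_database(user):
    """Static case: every user receives the same set of documents."""
    return list(RECORDS)


def dynamic_database(query):
    """Dynamic case: the documents returned depend on the user's query.

    A document is returned only if it contains every query term
    (case-insensitive substring match, an assumption of this sketch).
    """
    terms = query.lower().split()
    return [doc for doc in RECORDS if all(t in doc.lower() for t in terms)]


# Two different users see identical content from the static database.
same_for_everyone = static_database("alice") == static_database("bob")

# Two different queries see different content from the dynamic database.
rainfall_docs = dynamic_database("rainfall")
patent_docs = dynamic_database("patent")
```

In this sketch the static database ignores who is asking, while the dynamic database produces its result set only in response to a query, which is why its content is not reachable by simply retrieving a fixed page.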
The use of search engines is known in the prior art. The Internet, like its predecessor, ARPANET, has since its inception held the promise of real-time access to an almost inexhaustible supply of information stored on computers throughout the world. Sorting through the available information to find documents relevant to a given question or query can be laborious, and a method to speed this process is needed. Search engines allow a user to search for sites that have one or more keywords corresponding to the user's query. This development has sped up the process of finding sites, but has not necessarily improved the quality of the results. While it is true that millions of documents are readily available as static pages to users through search engines, much more of the total content of the Internet has remained in the shadows. This remaining content, while available, often requires independent knowledge of the exact location of the document, sophisticated search techniques, or, in many cases, the use of professional researchers to attempt to “mine” the needed information.
Search engines have been improved through the use of link-followers, also known as “crawlers,” which allow a search engine to follow links on a known web page to discover other web pages as new sources of information and to build an index. Crawlers are an improvement over conventional search engines in that they can provide more sites that are relevant to a given question or query. But again, as was the case with conventional search engines, only static pages have been available as results to the user. Some of the static pages may be entry points for databases, which can provide very relevant and detailed information by continued searching. However, the use of these entry points conventionally requires the laborious task of manually entering the user's question in the specific data-entry windows for each database, capturing the results, and then analyzing the results from each database for relevancy.
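The link-following behavior of a conventional crawler, as described above, can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the present invention: the `PAGES` dictionary is a hypothetical in-memory stand-in for pages that would otherwise be fetched over a network, and the names `LinkAndTextParser` and `crawl` are introduced here for illustration only.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> page content (stands in for HTTP fetches).
PAGES = {
    "http://example.test/": '<a href="http://example.test/a">A</a> '
                            '<a href="http://example.test/b">B</a> welcome',
    "http://example.test/a": '<a href="http://example.test/b">B</a> '
                             'patents and databases',
    "http://example.test/b": '<a href="http://example.test/">home</a> '
                             'search form entry point',
    "http://example.test/orphan": "unlinked content",
}


class LinkAndTextParser(HTMLParser):
    """Collect the href targets and the visible words of one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def crawl(seed, pages):
    """Follow links breadth-first from a known page and build a keyword index.

    Returns a dict mapping each word to the set of URLs on which it appears.
    """
    index = {}
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # link points to a page that cannot be retrieved
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        for word in parser.words:
            index.setdefault(word, set()).add(url)
        for link in parser.links:
            if link not in seen:  # visit each discovered page only once
                seen.add(link)
                queue.append(link)
    return index


index = crawl("http://example.test/", PAGES)
```

In this sketch the page at `http://example.test/orphan` is never indexed because no known page links to it. Content that is reachable only by submitting a query through a data-entry form is missed in the same way, which is the limitation of link-following noted above.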