This patent application discloses an invention which may optionally form a portion of a larger system. Other portions of the larger system are disclosed and described in the following patent applications, all of which are subject to an obligation of assignment to the same person. The disclosures of these applications are herein incorporated by reference in their entireties.
METHOD AND SYSTEM FOR AUTOMATIC HARVESTING AND QUALIFICATION OF DYNAMIC DATABASE CONTENT, William J. Bushee, Thomas W. Tiahrt, and Michael K. Bergman, Filed Jul. 24, 2001, application Ser. No. 09/911,522 now pending.
AUTOMATIC SYSTEM FOR CONFIGURING TO DYNAMIC DATABASE SEARCH FORMS, William J. Bushee, Filed Jul. 24, 2001, application Ser. No. 09/911,435 now pending.
SYSTEM AND METHOD FOR EFFICIENT CONTROL AND CAPTURE OF DYNAMIC DATABASE CONTENT, William J. Bushee and Thomas W. Tiahrt, Filed Jul. 24, 2001, application Ser. No. 09/911,434 now pending.
SYSTEM FOR AUTOMATICALLY CATEGORIZING CONTENT IN HIERARCHICAL SUBJECT STRUCTURES, Thomas W. Tiahrt, Michael K. Bergman, and William J. Bushee, Filed Jul. 24, 2001, application Ser. No. 09/911,433 now pending.
SYSTEM AND METHOD FOR FLEXIBLE INDEXING OF DOCUMENT CONTENT, Thomas W. Tiahrt, Filed Jul. 24, 2001, application Ser. No. 09/911,432 now pending.
1. Field of the Invention
The present invention relates to search engines and more particularly pertains to a new method for automatic selection of databases for improving the efficiency of data capture and management systems.
2. Description of the Prior Art
The Internet is a worldwide system of computer networks in which users at any one computer may get information located on virtually any other computer with appropriate authorization. The Internet uses a set of protocols called Transmission Control Protocol/Internet Protocol or TCP/IP. The World Wide Web (often abbreviated as WWW) is a portion of the Internet using hypertext as a method for rapid cross-referencing that links one document or site to another.
A database is a collection of data that is organized in a manner that allows its contents to be easily accessed, managed, and updated. Given this definition, an Internet site can be viewed as a database with a collection of data that can be viewed as pages, or accessible documents. Similarly, any network for accessing documents can be considered a database, including intranets and extranets. These network databases can be either static or dynamic. A static network database provides the same set of documents or pages to every user. A dynamic network database presents unique documents or pages to different users, typically as a response to the users' queries. Because of the similarity between web sites specifically and databases in general, the terms document and web page are used synonymously throughout this document unless otherwise distinguished by context. Similarly, the terms search engine and database are also used synonymously throughout this document unless otherwise distinguished by context.
Many enterprises, whether business, governmental, or other coordinated undertakings, require large amounts of "current" information to be analyzed and available for use in the daily execution of their activities. The Internet has made the availability of information in near real time a reality. However, this very current information is distributed across thousands, if not millions, of computer systems linked to the Internet. Additionally, this information may be stored in various different formats, such as documents, web pages, and other machine readable formats. Locating information relevant to a specific query posed by a user often requires specific knowledge of the information's location, sophisticated search strategies, and even professional researchers. The use of search engines to locate information related to a user's query is well known and has to some extent sped up the process of locating related information.
A significant portion of related information returned by search engines may not be considered truly relevant to a user's query. The resources required to evaluate all of the information identified by a search engine in order to filter out non-relevant information can be more than substantial. The resources used may include, by way of example and not limitation, transmission bandwidth, data storage, and time (both of system usage and of personnel) required to filter out related but not relevant information. The need to capture and organize relevant information can be overwhelming, and an automated system is required to effectively solve this problem.
In these respects, the method for automatic selection of databases according to the present invention substantially departs from the conventional concepts and designs of the prior art, and in so doing provides a system primarily developed for the purpose of improving the efficiency of data capture and management systems.
In view of the foregoing disadvantages inherent in the known types of search engines now present in the prior art, the present invention provides a new method for automatic selection of databases wherein the same can be utilized for improving the efficiency of data capture and management systems.
The invention contemplates a method of selection and characterization of search engines and databases which includes obtaining a candidate database listing providing a uniform resource locator (URL) for each one of a plurality of candidate databases to be considered during selection, obtaining a query from a user, matching a subset of candidate databases to said query, and storing a listing of selected databases to be used for retrieving information relative to said query.
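By way of illustration and not limitation, the four steps recited above (obtaining a candidate listing with URLs, obtaining a query, matching a subset of candidates, and storing the selected listing) may be sketched as follows. The sketch is not the disclosed implementation. The names CandidateDatabase, match_databases, and store_selection, the free-text description attached to each candidate, the term-overlap matching rule, and the JSON storage format are all assumptions introduced for this example only; the disclosure does not specify them at this point.

```python
import json
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateDatabase:
    """One entry of the candidate database listing (hypothetical structure)."""
    url: str          # URL of the candidate database, as recited in the method
    description: str  # assumed free-text summary used only for matching here


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match_databases(candidates, query, min_overlap=1):
    """Return candidates sharing at least min_overlap terms with the query.

    Term overlap is an assumed stand-in for whatever matching the
    invention actually uses. Results are ordered by descending overlap;
    ties keep their order from the candidate listing.
    """
    query_terms = tokenize(query)
    scored = []
    for candidate in candidates:
        overlap = len(query_terms & tokenize(candidate.description))
        if overlap >= min_overlap:
            scored.append((overlap, candidate))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored]


def store_selection(selected, path):
    """Store the listing of selected database URLs as a JSON array."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump([candidate.url for candidate in selected], handle, indent=2)


# Hypothetical candidate listing and user query.
candidates = [
    CandidateDatabase("http://example.com/patents", "patent and trademark law records"),
    CandidateDatabase("http://example.org/weather", "weather forecasts and climate data"),
    CandidateDatabase("http://example.net/law", "case law search"),
]
selected = match_databases(candidates, "patent law search")
```

In this sketch the stored listing contains only URLs, so that a later retrieval step could query each selected database without repeating the matching step.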
There has thus been outlined, rather broadly, the more important features of the invention in order that the detailed description thereof that follows may be better understood, and in order that the present contribution to the art may be better appreciated. There are additional features of the invention that will be described hereinafter and which will form the subject matter of the claims appended hereto.
In this respect, before explaining at least one embodiment of the invention in detail, it is to be understood that the invention is not limited in its application to the details of construction and to the arrangements of the components set forth in the following description or illustrated in the drawings. The invention is capable of other embodiments and of being practiced and carried out in various ways. Also, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting.
As such, those skilled in the art will appreciate that the conception, upon which this disclosure is based, may readily be utilized as a basis for the designing of other structures, methods and systems for carrying out the several purposes of the present invention. It is important, therefore, that the claims be regarded as including such equivalent constructions insofar as they do not depart from the spirit and scope of the present invention.