The present invention relates to methods and systems for processing information in a data processing system. In particular, the invention relates to methods and apparatuses for searching for information stored in information storage devices coupled to at least one data processing system.
The process of searching through a large volume of documents which contain text in order to find a particular document or documents is often a very useful technique for obtaining information. Typically, the text of these documents is stored in electronic media in an information storage device (for example, magnetic media in a device such as a hard disk or an optical medium) which is coupled to a data processing system, such as a digital computer. It is often the case that an enormous volume of text is stored in electronic form in such a storage device. For example, a large number of U.S. patents are maintained in electronic form by various entities. Similarly, the full text of numerous periodicals, including newspapers, is often stored in information storage devices in the form of a database or other file, and users often want to search these databases or files to find articles, documents, etc. that are of interest to the user.
At times, the information being searched may reside locally on the computer system which is being used by the user; for example, text in electronic form from numerous sources such as articles from newspapers may be stored on a hard disk of the user's computer system and may be searched by commercially available full text searching software such as Gofer (TM), Sonar (TM), and ZYINDEX (TM). Unfortunately, the source of information may be so large that it cannot fit within a typical hard disk or other storage device of a typical personal computer. In this case, it is often necessary, due to the economics of computing resources, to spread the cost of large information storage devices among numerous users who are linked together by a computer network, such as a local area network. A well-known example of a computerized network which includes information storage devices capable of storing large quantities of information is the Lexis/Nexis (TM) system run by Mead Data. In this case, it will be appreciated that this "network" is considerably larger than a normal local area network.
In prior art systems for searching for text information in a data processing system, the user may enter a single search request and then request either the local processor (e.g. the client workstation) or a remote processor (e.g. a server workstation) to execute the search request by performing a search through the information stored in an information storage device for documents which match the search request. While the search is being executed, it is not possible for the user to enter a further search request or to cause that further search request to be executed concurrently with the first search. Consequently, after requesting execution of the first search request, the user must wait before entering a further search request and causing that further search request to be executed. While this is often acceptable in environments where all of the processing occurs on a local workstation (e.g. a personal computer), this situation is particularly inefficient in a network environment. In this environment, servers may be called upon by a number of different users at different client workstations to execute different tasks, such as the preparation of new documents (e.g. indexing existing documents), and thus a server may not be available to process a search request. Consequently, a client user would be prohibited from even entering a second search request until the server has had an opportunity to execute the first search request after handling other prior tasks from other users in the network. Wide area networks (with interconnected local area networks) pose an even greater problem in the sense that the gateways and routers interconnecting local area networks may be busy with other transactions, and thus a user and his/her machine may be prevented from any other searching activities while a first search request is being processed.
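By way of illustration only, the following minimal Python sketch contrasts with the blocking behavior described above by submitting a second search request before the first has completed. It is not taken from any prior art system or from the invention; the `DOCUMENTS` collection and the `run_search` and `submit_searches` names are hypothetical, and a simple substring match stands in for a full text search.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory collection standing in for an information source.
DOCUMENTS = {
    1: "patent for a text searching apparatus",
    2: "newspaper article about local area networks",
    3: "patent for a network server workstation",
}


def run_search(term):
    """Return the sorted IDs of documents whose text contains the term."""
    return sorted(doc_id for doc_id, text in DOCUMENTS.items() if term in text)


def submit_searches(terms):
    """Submit every search request up front, then collect the results."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Each submit() call returns immediately, so a later request can be
        # entered before an earlier one has finished executing.
        futures = {term: pool.submit(run_search, term) for term in terms}
        return {term: future.result() for term, future in futures.items()}


results = submit_searches(["patent", "network"])
print(results)
```

In a blocking arrangement, by contrast, the call for the second term could not even be issued until the first call had returned.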
In many information sources, such as databases, there is often a need to add new documents which have come into existence after the creation of the database, or to add documents which have been modified since the creation of the database or information source. For example, a textual database containing articles from newspapers will need to be periodically updated with subsequently released newspapers in order to keep the database current with the contents of the newspaper. Similarly, if the information source is a collection of U.S. patents, then the information source will need to be updated with U.S. patents which issued subsequent to the last date on which the database was modified to include newly issued U.S. patents. In prior art systems, a user would normally define a search request at one point in time and then have to repeat that search request at a later time by manually entering the search again in order to see if any new documents have been added to the database since the last search. In such prior art systems the manual entry of a subsequent search request (or retrieval of a saved search request to be executed again) will result in the generation of a report which is a listing of documents found in the search, where the format of the report is identical to the format used in responding to a normal search request. Even systems which execute automatic future searches (e.g. the "Eclipse" feature in Lexis) do not generate specially formatted reports. That is, the response of the data processing system to a subsequent search request will be identical in format to the response from the search request when previously executed. No special effort is taken to display the information to the user in a manner which is helpful in evaluating updated information available from the information source since a prior search.
Indeed, in many systems, the report of a subsequent search will include the results of a prior search, and thus there can be considerable duplication between an original report from a first search and a subsequent report from a subsequent search.
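As a hedged illustration of the duplication problem just described, the short Python sketch below separates the documents that are new in a subsequent search from those already reported by a prior search. The `new_documents` function and the document identifiers are hypothetical and are offered only to make the distinction concrete; they do not describe the report format of any particular system.

```python
def new_documents(previous_hits, current_hits):
    """Return hits from the later search that were absent from the earlier one.

    The order of the later search's results is preserved.
    """
    seen = set(previous_hits)
    return [doc_id for doc_id in current_hits if doc_id not in seen]


# Hypothetical identifiers for documents found by two runs of the same search.
first_report = ["US-001", "US-002"]
second_report = ["US-001", "US-002", "US-003"]

added = new_documents(first_report, second_report)
print(added)
```

A report limited to `added` would spare the user from re-reading the entries that the two reports have in common.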
In these prior art systems which utilize information sources which change over time, it is often necessary to perform "maintenance" on the information source. This maintenance typically includes adding additional documents or removing documents as well as indexing new documents or compressing/compacting indexes which have been changed due to the removal of documents from the database. This "maintenance" is typically performed in a network of computer systems where one computer system, referred to as a server workstation or computer system, typically controls access to information sources by other computer systems in the network, such as those systems referred to as "client" computer systems. In these environments, a user at a client workstation is often presented with a list of available information sources even though a particular information source is unavailable or is undergoing maintenance by the operator of the server workstation. In this circumstance, the results of any search performed may be erroneous; for example, the user of the client system may believe that he/she is searching an information source when in fact it is not available.
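The availability problem described above may be pictured with the following minimal Python sketch, in which a server-side list of information sources carries an explicit availability flag and only available sources are offered to a client. The `InformationSource` class, its `available` field, the `searchable_sources` function, and the source names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class InformationSource:
    name: str
    available: bool  # False while the source is offline or under maintenance


def searchable_sources(sources):
    """Return the names of the sources a client may actually search."""
    return [source.name for source in sources if source.available]


# Hypothetical catalog maintained by the operator of the server.
catalog = [
    InformationSource("newspapers", available=True),
    InformationSource("patents", available=False),  # being re-indexed
]

offered = searchable_sources(catalog)
print(offered)
```

Presenting `catalog` unfiltered, as in the prior art situation described, would list "patents" even though a search of it could not return reliable results.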
From the foregoing discussion, it can be seen that it is desirable to allow a user of a computer system to execute further searches after requesting a first search, particularly in a network environment. It is also desirable to present to the user, in a summary format, a report of a scheduled search, particularly one which has been scheduled by the user to occur automatically, in order to improve the user's efficiency in evaluating the results of the scheduled search. It is also desirable to allow a user on a client computer system in a network to obtain accurate information about the availability of information sources while also allowing the operator of a server computer system to maintain the information sources and to provide accurate information to users of client systems about the availability of those information sources.