1. Field of the Invention
This invention relates to information storage and retrieval and more particularly to methods of automatically abstracting, storing and retrieving documents using free form inquiry.
2. Description of the Prior Art
In implementing a document storage and retrieval system, the practicality and utility of such a facility are governed by the ease with which respective documents are cataloged into the system and the efficiency with which a user's request can be associated with the related document catalog representation (description). State of the art document storage and retrieval is based on manually selecting keywords to represent a document in the system's catalog or index and then effecting retrieval by recalling from memory appropriate keyword terms and either automatically or manually searching the index for an "appropriate" level of match against the prestored keywords. Procedures have been developed in the prior art for abstracting documents and retrieving them based on keyword matching. One of the procedures requires the requestor to supply in a fixed format certain details about the subject document such as: author, addressee, date and keywords or phrases. For retrieval, a summary sorted listing is prepared under each of the above headings. The requestor must discern the appropriate document by examining the entries under the retrieval information headings. No latitude is allowed in the search clues. The search may be done by manual perusal or by using data processing global find commands.
A second procedure stores all non-trivial words in a document (i.e., it ignores articles, pronouns, etc.) as a totally inverted file. The document/line/word position of origin of each word is maintained in the catalog. Search of the database for retrieval is effected by the user supplying keywords based on the user's memory. The catalog is automatically searched with the added facility that the user can specify relations that must exist between the keywords as they exist in the original text (e.g., keyword 1 is before keyword 2). An example of such a system is the IBM Data Processing Division product Storage and Information Retrieval System, commonly called STAIRS.
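The inverted-file approach described above can be illustrated with a minimal sketch. It is not the STAIRS implementation; the stop list, the function names `build_inverted_file` and `docs_where_before`, and the sample memos are all hypothetical and are assumed here only to show how a document/line/word position catalog supports a "keyword 1 is before keyword 2" relation.

```python
import re
from collections import defaultdict

# Hypothetical stop list: the text says only that articles, pronouns, etc. are ignored.
STOP_WORDS = {"a", "an", "the", "he", "she", "it", "they", "we", "you", "i"}


def build_inverted_file(documents):
    """Map each non-trivial word to its (doc_id, line_no, word_pos) origins."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            words = re.findall(r"[a-z0-9]+", line.lower())
            for word_pos, word in enumerate(words, start=1):
                if word not in STOP_WORDS:
                    index[word].append((doc_id, line_no, word_pos))
    return index


def docs_where_before(index, first, second):
    """Return ids of documents in which some occurrence of `first`
    precedes some occurrence of `second` in the original text."""
    hits = set()
    for doc_a, line_a, pos_a in index.get(first.lower(), []):
        for doc_b, line_b, pos_b in index.get(second.lower(), []):
            # Compare (line, word position) pairs within the same document.
            if doc_a == doc_b and (line_a, pos_a) < (line_b, pos_b):
                hits.add(doc_a)
    return sorted(hits)


# Hypothetical sample documents, assumed for illustration only.
DOCUMENTS = {
    "memo1": "The budget report was sent to the auditor.\nThe auditor approved it.",
    "memo2": "The auditor requested the budget report.",
}
INDEX = build_inverted_file(DOCUMENTS)
```

With this sketch, `docs_where_before(INDEX, "budget", "auditor")` selects only the memo in which "budget" appears ahead of "auditor". Note that the user must still recall both keywords exactly as they occur in the text, and every non-trivial word occurrence costs one catalog entry.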
A third method for document storage and retrieval is simply storing the document in machine readable form and searching all documents using a "global find" logic for each user supplied keyword. In theory and in practice for small data bases, the "global find" can be replaced by the user reviewing the documents verbatim as they are displayed on a CRT type device.
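As a rough illustration of the third method, the sketch below scans every stored document in full for a user-supplied keyword. The function name `global_find`, the case-insensitive substring match, and the sample documents are assumptions made for the example, not details taken from any particular prior art system.

```python
def global_find(documents, keyword):
    """Scan all stored text and return (doc_id, line_no) for each line
    that contains the keyword (case-insensitive substring match)."""
    needle = keyword.lower()
    matches = []
    for doc_id, text in documents.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                matches.append((doc_id, line_no))
    return matches


# Hypothetical stored documents, assumed for illustration only.
STORED = {
    "memo1": "The budget report was sent to the auditor.\nThe auditor approved it.",
    "memo2": "The auditor requested the budget report.",
}
```

No catalog is built in this scheme, so each request rereads the full text of every document; the cost of a search grows with the total size of the data base rather than with the size of an index.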
However, in all the above procedures for document storage and retrieval, the major intellectual burden for abstraction and retrieval association matching is put on the user. Where the system aids in abstraction or matching, it does so at the cost of voluminous cataloging procedures, a massive data processing burden, and a structured format that the user must follow in order to communicate retrieval requests to the system.