This invention is directed to information storage and retrieval, and more particularly to a system for storage and retrieval of large quantities of documents which may include text, illustrations or combinations thereof. The invention is particularly useful in the archival storage of historical documents wherein it is desirable to maintain the integrity of the historical document, including its original appearance. As used herein and in the appended claims, the term "archival document" shall be used to refer to a document containing textual information but wherein the appearance of the original document, and not merely the content of the text, is significant, and an "archival document storage system" or "archival document image storage and retrieval system" shall be used to refer to a system for storing and retrieving images of archival documents where the appearance of the original document is of interest to the user of the system.
In a conventional archival storage system, documents would typically be separated into files. In some cases, large quantities of documents would be stored with no practical indexing, so that reasonable access to the documents would not be available to anyone but a few scholars who knew where to find them. An indexing system could be manually generated and researchers could access the stored documents through an index card file, but manual indexing systems have not proven entirely satisfactory.
When using an index card file system, it is still necessary to retrieve a document from the storage files in order to determine if it is relevant. If relevant, it is then necessary to obtain a photocopy or other reproduction of the document. These processes can take considerable time where a large number of documents are involved.
The handling of the documents also contributes to their deterioration, which can be a long-term problem in an archival storage system.
Still further, the complexity of manually generating an index card file system can itself be a disincentive for maintaining such a system when extremely large numbers of documents are involved.
It is desirable to provide some type of automated search capability, and it is known in some systems to index documents, e.g., by key words, and to permit automated searching. However, this facilitates only the searching aspect of the conventional system described above. It remains necessary to manually retrieve documents, to take each document to a photocopy station to obtain a copy, and to manually generate the key words and phrases which will be used in the indexing system.
In at least one publicly available storage and retrieval system, i.e., the automated search system currently maintained by the U.S. Patent and Trademark Office for searching U.S. patents related to data processing, pertinent portions of the documents covered by the data base are stored on microfiche. Each document must be read by Patent Office personnel who will then assign that document to one or more descriptive headings. A system user can then key in a particular heading or a plurality of headings combined with logical operators, and the system will display from microfiche the stored portions of every document satisfying the search request. While such a system represents a substantial improvement over entirely manual systems, it is still not entirely satisfactory in a document storage and retrieval system employing very large numbers of documents, e.g., many millions of pages of text and drawings. The microfiche storage capacity is insufficient for such large numbers of documents, and the speed of retrieving the appropriate microfiche for display would also be unsatisfactory in a system of great size. Further, the pages of documentation are recorded on microfiche by a conventional photographic process, and there is no opportunity for the system to recognize the content of the documents being photographed. All key words and descriptive headings must therefore be manually entered. Still further, it is necessary for each viewing station to have its own set of microfiche, or at least for all viewing stations to be located immediately adjacent the microfiche file.