The present invention describes a method of storage management in document databases.
Banks and insurance companies handle thousands of documents every day. In banks, these are mostly transfer and account withdrawal documents; in insurance companies, they are principally policies and claims. So that these documents can be accessed and searched, they are usually filed in databases. The period for which the documents have to remain accessible in the databases is determined chiefly by the type of document. In addition to purely operational requirements, statutory regulations also govern the period for which documents are kept. Since the quantity of data (that is, the number of documents) held by banks and insurance companies, for example, increases constantly over time, and since existing databases cannot manage these increasing quantities, organisation into logical storage areas has become established as a leading concept for document management. Under this concept, the entire available logical storage area (which may also extend over several systems) is divided into storage segments, each of which can hold only one user-defined type of document. When a storage segment is full, it is closed for the receipt of further documents and a new segment is created for that type of document (see FIG. 1). This concept is described in greater detail in the applicant's published PCT application WO 97/16794.
In theory, an unlimited number of documents can be filed by means of this concept.
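The segment concept described above can be illustrated by a minimal sketch. It is not taken from WO 97/16794; the class names `Segment` and `SegmentedStore`, the method `file`, and the assumption that segment capacity is counted in documents are all hypothetical and serve illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One storage segment holding a single document type (hypothetical model)."""
    doc_type: str
    capacity: int                      # assumption: capacity counted in documents
    documents: list = field(default_factory=list)
    closed: bool = False

    def is_full(self):
        return len(self.documents) >= self.capacity


class SegmentedStore:
    """Logical storage area divided into segments, one chain per document type."""

    def __init__(self, capacity_per_segment):
        self.capacity = capacity_per_segment
        self.segments = {}             # doc_type -> list of Segment, oldest first

    def file(self, doc_type, doc_id):
        """File a document in the open segment for its type and return that segment."""
        chain = self.segments.setdefault(doc_type, [])
        if not chain or chain[-1].closed:
            # No open segment for this type: create a new one.
            chain.append(Segment(doc_type, self.capacity))
        current = chain[-1]
        current.documents.append(doc_id)
        if current.is_full():
            # A full segment is closed for the receipt of further documents.
            current.closed = True
        return current
```

Because a new segment is simply appended whenever the current one is closed, the number of documents per type is bounded in this sketch only by the storage available for further segments.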
Normally, the documents placed in the segments are filed first on fast hard disks, which form part of the logical storage area. Owing to their speed, hard disks of this kind are relatively expensive. Modern computer systems therefore feature storage management systems (SMS), such as DFSMS by IBM, which relocate files that have not been used for a fairly long time to slower and less expensive storage media, e.g. data tapes. If one of the relocated documents is to be accessed again, the SMS detects this and ensures that the data are transferred from the data tapes back to the fast hard disks (recall). A recall of this kind takes a relatively long time, from a few seconds to several minutes.
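The relocation and recall behaviour of such a storage management system can be sketched as follows. This is a toy model and not the interface of DFSMS or any other product: the class `TieredStorage`, its methods, and the idle-time threshold `max_idle_seconds` are assumptions, and both storage tiers are represented by plain dictionaries. Timestamps are passed in explicitly so that the example is deterministic.

```python
class TieredStorage:
    """Toy two-tier store: a fast 'disk' and a slow 'tape' (hypothetical model)."""

    def __init__(self, max_idle_seconds):
        self.max_idle = max_idle_seconds   # assumed relocation criterion
        self.disk = {}                     # file name -> data (fast, expensive)
        self.tape = {}                     # file name -> data (slow, inexpensive)
        self.last_access = {}              # file name -> time of last access

    def write(self, name, data, now):
        self.disk[name] = data
        self.last_access[name] = now

    def migrate(self, now):
        """Relocate files that have been idle for longer than max_idle to tape."""
        idle = [n for n in self.disk if now - self.last_access[n] > self.max_idle]
        for name in idle:
            self.tape[name] = self.disk.pop(name)
        return idle

    def read(self, name, now):
        """Return the data, recalling the file from tape first if necessary."""
        if name not in self.disk:
            # Recall: in a real system this step takes seconds to minutes.
            self.disk[name] = self.tape.pop(name)
        self.last_access[name] = now
        return self.disk[name]
```

Note that the only relocation criterion in this sketch is the time since the last access; the type of document plays no part.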
The document management systems currently available on the market, e.g. VisualInfo or OnDemand (R/DARS), manage a restricted storage space, in a manner similar to the management of main storage in a computer system. All documents are first collected in a database (document pool), which forms a segment. Once this segment is full, either completely or to a certain degree, it is relocated to another storage medium, e.g. magnetic tapes or CDs.
Information on which elements have been relocated and which segments they are in is managed by the document management system (DMS) concerned. When a document is sought, the DMS first establishes whether it is in the segment that can be accessed directly. If not, the DMS consults its management information to determine which relocated, packed segment contains the document. This packed segment is then read into a temporary storage space and the document sought is retrieved from it. If another document is then sought, the temporary storage space is released and occupied by the newly recalled documents. A temporary storage space of this kind may be a database, as in the case of OnDemand, or a CD drive, as in the case of VisualInfo.
Documents that have already been retrieved are therefore always deleted from the temporary storage space as soon as other relocated documents are sought.
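The lookup procedure of such known systems can be sketched as follows. The sketch does not reproduce the actual behaviour or interfaces of VisualInfo or OnDemand; the class `PriorArtDMS`, its attributes, and the counter `recalls` are hypothetical. It additionally assumes that the temporary storage space holds exactly one packed segment at a time and is only replaced when a different segment is needed.

```python
class PriorArtDMS:
    """Toy model of a DMS with a single temporary storage space (hypothetical)."""

    def __init__(self):
        self.pool = {}               # directly accessible segment: doc_id -> content
        self.packed = {}             # relocated segments: segment_id -> {doc_id: content}
        self.index = {}              # management information: doc_id -> segment_id
        self.temp = {}               # temporary storage space
        self.temp_segment_id = None  # which packed segment is currently loaded
        self.recalls = 0             # number of slow segment loads performed

    def relocate_pool(self, segment_id):
        """Pack the current pool and relocate it as one segment."""
        self.packed[segment_id] = self.pool
        for doc_id in self.pool:
            self.index[doc_id] = segment_id
        self.pool = {}

    def fetch(self, doc_id):
        if doc_id in self.pool:
            return self.pool[doc_id]           # fast path, no recall needed
        segment_id = self.index[doc_id]        # consult management information
        if segment_id != self.temp_segment_id:
            # Release the temporary space and load the required packed segment.
            self.temp = dict(self.packed[segment_id])
            self.temp_segment_id = segment_id
            self.recalls += 1
        return self.temp[doc_id]
```

In this model, alternating requests for documents from two different relocated segments trigger a recall on every request, because nothing retrieved earlier is retained.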
The disadvantage of these systems is that no account is taken of the specific access characteristics of the documents. Documents are normally stored so that they can be found again and reproduced if the need arises. However, different documents have different access periods and access probabilities. Bank transfers, for example, are normally searched for during a maximum of 6 weeks, as banks grant customers a corresponding appeal period. Other documents, such as loan extensions, belong to processes which sometimes run for months and are constantly being added to, and searches for these documents are conducted throughout these time spans. Since the recall of relocated documents takes a relatively long time compared with accessing documents which are still in a segment, it is a disadvantage that all documents are treated in the same way on relocation and that the criterion for relocation is the size and availability of an existing, equally large space.
The same applies to documents which have been recalled following relocation. Some documents need to remain in on-line storage only for a short period after recall, whereas others should remain available on-line for longer, because they will in all probability be accessed again within a certain period.
In the bank example above, transfers are usually recalled only briefly in order to reproduce them so that they can be used as evidence in relation to customers. Processes involving loan applications, in contrast, normally take longer. Loan extensions, for example, normally extend over several weeks when interest rates are negotiated and the customer obtains counter-offers from competitors.
The object of the present invention is to provide a method of document management which avoids the aforementioned disadvantages.
This object is achieved by the features of claim 1. Further advantageous embodiments of the invention are set out in the sub-claims.
The essential advantages of the invention are that documents are found faster, which saves time and money. Because documents are relocated, lower-cost storage means can be used and the total quantity of expensive storage means is reduced. The residence period of a document in a storage means can be determined individually for each document by the user. Alternatively, the residence period can be determined by a program which calculates the access frequency for documents or document types and establishes the residence period accordingly.
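How a program might derive a residence period from observed accesses can be sketched as follows. The text does not specify a formula, so the rule used here is purely an assumption: for each document type, the residence period is taken to be the smallest document age, in days, by which a chosen fraction of all observed accesses had occurred. The function name `residence_periods` and the parameter `coverage` are hypothetical.

```python
import math
from collections import defaultdict


def residence_periods(access_log, coverage=0.95):
    """Estimate a residence period in days for each document type.

    access_log: iterable of (doc_type, age_in_days_at_access) pairs.
    coverage:   assumed fraction of accesses that should fall within
                the residence period.
    """
    ages = defaultdict(list)
    for doc_type, age in access_log:
        ages[doc_type].append(age)

    periods = {}
    for doc_type, values in ages.items():
        values.sort()
        # Index of the smallest age that covers at least `coverage` of accesses.
        k = max(1, math.ceil(coverage * len(values)))
        periods[doc_type] = values[k - 1]
    return periods
```

With an access log in which transfers are requested only during the first weeks after filing while loan documents are still requested after several months, this rule yields a short residence period for transfers and a long one for loan documents.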