1. Field of the Invention
The present invention relates to a process for the storage and retrieval of data on a tape system.
2. Description of the Prior Art
Such tape systems generally comprise a tape drive with on-line storage and retrieval and off-line retrieval. Such a system is described in our U.K. Patent No. 2,285,525, which describes the process being carried out by a database controller, a data processor and a tape system controller. Essentially, in this system there is set up a database structure including a single volume-type dataset for a plurality of data objects containing one or more variable-length logical records. When a data storage request is received from a data processor, the following steps are carried out: assigning a single tape volume to the relevant dataset; directing creation of a new record header within a data block of the tape volume, the header including a primary key and a storage date; and writing the data to a plurality of sub-records in sequence after the record header. Finally, an index file is set up for the database comprising a single index control record with tape system initialisation data.
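By way of illustration only, the storage steps summarised above (a record header carrying a primary key and a storage date, followed by the data written as a sequence of sub-records) might be sketched as follows. The class names, the function name and the sub-record size are hypothetical and are not taken from U.K. Patent No. 2,285,525; the sketch models the layout in memory and does not drive a tape device.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

# Hypothetical sub-record size; the prior specification is not quoted here.
SUB_RECORD_SIZE = 4096


@dataclass
class RecordHeader:
    """Header created within a data block of the tape volume."""
    primary_key: str
    storage_date: date


@dataclass
class DataBlock:
    """A record header followed by its sub-records, in write order."""
    header: RecordHeader
    sub_records: List[bytes] = field(default_factory=list)


def write_object(primary_key: str, storage_date: date, data: bytes,
                 sub_record_size: int = SUB_RECORD_SIZE) -> DataBlock:
    """Create the header, then append the data as sequential sub-records."""
    block = DataBlock(RecordHeader(primary_key, storage_date))
    for offset in range(0, len(data), sub_record_size):
        block.sub_records.append(data[offset:offset + sub_record_size])
    return block


block = write_object("ACC-0001", date(1996, 1, 31), b"x" * 10000)
print(len(block.sub_records))  # 3 sub-records: 4096 + 4096 + 1808 bytes
```

Concatenating the sub-records in order reproduces the original data, which is the property the sequential layout relies upon.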
This system has been extremely successful in that it allows data to be retrieved from tape in a manner which appears, at least to the user, to be "on-line". The invention of U.K. Patent Specification No. 2,285,525 enabled the use of tape storage systems for what are effectively on-line processing situations. However, since that invention was first introduced into the market place there has been an enormous, indeed exponential, increase in the amount of data being stored in database systems. The need to facilitate the archival of inactive data to tape storage, and to enable the retrieval of this archived data in a batch or on-line processing environment, has grown accordingly. At the same time, the need for faster and more efficient retrieval of that data is becoming more urgent. The problem is that archived database material needs to be processed so that it appears to be fully integrated into existing or planned batch or on-line applications.
With the increased volume of data being stored, it is becoming more difficult than heretofore to achieve an apparent on-line processing environment.
As has already been described in U.K. Patent Specification No. 2,285,525, data is held in accordance with the present invention as a series of objects, each of which may consist of one or more variable-length logical records, each record containing up to 32,760 bytes in one particular system. There is no limit to the number of logical records in each object. During the data archival procedure, each logical record within an object is passed sequentially for insertion into the database, and during object retrieval each of its component logical records is separately identified to the retrieving batch or on-line application. This has been achieved by ensuring that an object is identified by a unique combination of primary key and archive date. Multiple objects with the same primary key may exist in the database, but the archive date for each of these objects will be different. Further, in accordance with the invention as previously described, an object may be indexed by one or more secondary keys using a secondary indexing facility. The index entries are created when an object is written to the database. The primary index entry is always created, and secondary index entries are generated as required, either automatically or by explicit request on archiving.
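The identification scheme described above can be illustrated by a minimal sketch, in which an object is keyed by the combination of primary key and archive date and may optionally be indexed by secondary keys. The class and method names are hypothetical, and an in-memory dictionary stands in for the primary index dataset; only the 32,760-byte record limit is taken from the description.

```python
from collections import defaultdict
from datetime import date

MAX_RECORD_BYTES = 32760  # per-record limit in one particular system


class ArchiveIndex:
    """Hypothetical in-memory stand-in for the primary and secondary indexes."""

    def __init__(self):
        # (primary_key, archive_date) -> list of logical records
        self._primary = {}
        # secondary_key -> set of (primary_key, archive_date)
        self._secondary = defaultdict(set)

    def archive(self, primary_key, archive_date, records, secondary_keys=()):
        object_id = (primary_key, archive_date)
        if object_id in self._primary:
            raise KeyError("primary key and archive date must be unique")
        records = list(records)
        for record in records:
            if len(record) > MAX_RECORD_BYTES:
                raise ValueError("logical record exceeds 32,760 bytes")
        # The primary index entry is always created.
        self._primary[object_id] = records
        # Secondary index entries are created only as required.
        for key in secondary_keys:
            self._secondary[key].add(object_id)

    def retrieve(self, primary_key, archive_date):
        """Yield each component logical record separately, in archive order."""
        yield from self._primary[(primary_key, archive_date)]

    def archive_dates(self, primary_key):
        """All archive dates held for one primary key, oldest first."""
        return sorted(d for (k, d) in self._primary if k == primary_key)

    def find_by_secondary(self, secondary_key):
        return sorted(self._secondary.get(secondary_key, ()))


index = ArchiveIndex()
index.archive("CUST42", date(1995, 6, 1), [b"rec1", b"rec2"], ["SMITH"])
index.archive("CUST42", date(1996, 6, 1), [b"rec3"])
print(index.archive_dates("CUST42"))
```

Two objects share the primary key "CUST42" in this example and are distinguished solely by their archive dates; a second attempt to archive under an existing key and date is rejected.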
In essence, the database consists of various components, namely: one or more single-volume tape datasets containing archived objects; a primary index dataset; a space management dataset; a journal dataset, if an audit trail is required; and various back-up or secondary datasets.
Because of the enormous increase in the amount of information being stored, and also because of the greatly increased volume of data that can now be stored on a single tape contained in a tape cartridge, there is a need for improved management of the data on the tape to optimise access to the data. The problem is that as the number of silos for storing tapes off-line increases, the recycling time is taking even longer than heretofore. This is in spite of increased and more efficient tape handling facilities.
Recently there have been developments in the storage and retrieval of data on a tape system. The first development has been that, more and more, the service is being provided by a host database system for a number of clients. Alternatively, if the host database system is not being operated for a number of clients and is instead being operated by one company for one organisation, there are often what are in effect a number of clients within the organisation. For example, queries from those portions of the organisation answering customer queries should be dealt with more quickly than requests for information for internal purposes. Thus, in effect, each organisation consists of a number of clients with different needs and, from the point of view of those operating the tape system, with different priorities allocated to those needs. Further, when clients of a host tape system are themselves making requests to the host tape system, it is necessary for them to schedule and prioritise their own internal requests. This problem has been exacerbated by what can only be described as the explosion of Windows NT or OS/2 workstations within organisations. There is thus a need for the tape system to be able to service requests from other applications running on workstations for access to and retrieval of data archived in the tape system. For example, it is possible that multiple clients may be concurrently connected to the same tape system and that they in turn may execute requests on different client workstations, so that at the host end it is important to be able to handle the multiple requests from each connected client.
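The need to service concurrent requests from several clients according to their differing priorities can be illustrated by a simple priority queue. This is a sketch only: the class name, the client names and the numeric priority values are hypothetical, and nothing here is put forward as the scheduling method of any particular tape system.

```python
import heapq
import itertools


class RequestQueue:
    """Hypothetical host-side queue; a lower number means a higher priority."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps first-come, first-served order
        # among requests that share the same priority.
        self._sequence = itertools.count()

    def submit(self, client, priority, request):
        heapq.heappush(
            self._heap, (priority, next(self._sequence), client, request)
        )

    def next_request(self):
        """Remove and return the most urgent (client, request) pair."""
        _priority, _seq, client, request = heapq.heappop(self._heap)
        return client, request

    def __len__(self):
        return len(self._heap)


queue = RequestQueue()
queue.submit("internal-reporting", 5, "retrieve object A")
queue.submit("customer-service", 1, "retrieve object B")
queue.submit("customer-service", 1, "retrieve object C")

while queue:
    print(queue.next_request())
```

In this example the two customer-service requests are serviced first, in the order in which they were submitted, and the internal request is serviced last.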
The term "storage level" is used to refer to the priority of storage for a particular data object. For example, the highest storage level will be allocated to those data objects that are likely to require retrieval on a regular basis, that is to say, those having the highest level of activity. The storage level of a data object will very often be determined by, for example, the particular customer, the nature of the data or, commonly, the time for which the information has been stored.