1. Field of the Invention
The present invention relates generally to data storage systems, and more particularly to storage of data for transaction processing applications where one host processor has read-only access to a dataset concurrent with read-write access by another host processor.
2. Background Art
Transaction processing uses computer programming techniques that maintain database consistency under various conditions such as recovery from system failure or concurrent database access by multiple host processors. Database consistency is typically maintained by subdividing the application program of a host processor into a series of transactions. Each transaction includes a set of read-write instructions that change the database from one consistent state to another. The set of read-write instructions for each transaction is terminated by an instruction that specifies a transaction commit operation. During the execution of the transaction, the database may become inconsistent. For example, in an accounting application, a transaction may have the effect of transferring funds from a first account to a second account. The application program has a first read-write instruction that debits the first account by a certain amount, and a second read-write instruction that credits the second account by the same amount. Before and after the transaction, the database has consistent states, in which the total of the funds in the two accounts is constant. In other words, the total of the funds in the two accounts at the beginning of the transaction is the same as the total at the end of the transaction. During the transaction, the database will have an inconsistent state, in which the total of the funds in the two accounts will not be the same as at the beginning or at the end of the transaction.
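The funds-transfer example can be illustrated with a minimal sketch. The account names, the balances, and the `observe` callback are hypothetical and serve only to expose the intermediate state between the debit and the credit.

```python
# Hypothetical two-account database; names and amounts are illustrative only.
accounts = {"first": 100, "second": 50}


def total(db):
    """Total of the funds in the two accounts."""
    return db["first"] + db["second"]


def transfer(db, amount, observe):
    """Debit the first account, then credit the second.

    observe(db) is called between the two writes, where the
    database is in an inconsistent state.
    """
    db["first"] -= amount   # first read-write instruction (debit)
    observe(db)             # mid-transaction: total is temporarily wrong
    db["second"] += amount  # second read-write instruction (credit)
    # The transaction commit would follow here; the state is consistent again.


totals_seen = []
before = total(accounts)
transfer(accounts, 30, lambda db: totals_seen.append(total(db)))
after = total(accounts)
print(before, totals_seen, after)  # prints: 150 [120] 150
```

A reader that sampled the database at the `observe` point would compute a total of 120 rather than 150, which is the kind of error that read-only access to consistent states is meant to prevent.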
For concurrent database access by multiple host processors, it is conventional to permit only one host processor to have read-write access to the database, and to permit the other host processors to have read-only access to the database. For many applications, read-only access to the database must be restricted to consistent states of the database. For example, decision support systems analyze and evaluate data that is accumulated in the course of business transactions. In such a decision support system, one host processor may perform read-write transactions upon a database to produce a record of the business transactions, and any number of other host processors may perform read-only access to the database to analyze and evaluate the data in the database. The analysis and evaluation must be performed upon consistent states of the database. For an accounting application, for example, an evaluation of the total amount of funds in a set of accounts would be erroneous if the totals were not computed from consistent states of the database.
There are various ways of restricting read-only access to consistent states of a database in a multi-processor environment. A typical way is to restrict read-only access to a snapshot copy of the database. The snapshot copy of the database is updated at the end of each transaction, at the conclusion of the transaction commit operation. The snapshot copy can be maintained by a data storage system that provides read-write access to the database and concurrent read-only access to the snapshot copy of the database. The snapshot copy of the database can be updated very quickly at the conclusion of the transaction commit operation, so as to provide uninterrupted read-only access to consistent states of the database.
For some situations, there are difficulties in using a conventional snapshot copy facility for providing uninterrupted read-only access to consistent states of the database. For example, Raz et al., U.S. Pat. No. 5,852,715, incorporated herein by reference, discloses a multi-processor system in which a local data storage system provides read-write access to a database, and a remote data storage system provides read-only access to the database. A data communications link connects the remote data storage system to the local data storage system. The local data storage system stores a local copy of the database. The local copy of the database is mirrored over the data communications link to a remote copy of the database in the remote data storage system. At the remote data storage system, a support copy is derived from the remote database. The remote data storage system provides read-only access to the support copy to implement decision support functions. Raz et al., U.S. Pat. No. 5,852,715, discloses two ways of updating the remote copy. Changes made to the local database could be recorded in the remote database on an ongoing basis, in which case the support copy would be a snapshot of the remote database. Alternatively, the remote copy could be a snapshot of the local copy. In either case, one could use a conventional remote mirroring facility and a conventional snapshot facility. However, it would appear that the use of such conventional facilities would require at least three versions of the database (local copy, snapshot copy, and mirrored copy), together with overhead for maintaining both the snapshot copy and the mirrored copy on an ongoing basis for concurrent and uninterrupted read-only access.
In accordance with a basic aspect of the invention, there is provided a method of operating a data storage system to provide uninterrupted read access to a consistent dataset concurrent with performing a series of revisions upon the dataset. The series of revisions includes a first set of revisions followed by a second set of revisions. The dataset is in a consistent state after performing each set of revisions upon the dataset. The method includes: (a) processing the first set of revisions to form a directory of the first set of revisions; and then (b) processing the second set of revisions to form a directory of the second set of revisions, and concurrently performing the first set of revisions upon the dataset, and concurrently performing read access to specified data in the dataset by accessing the directory of the first set of revisions to determine whether the specified data are in the first set of revisions, and upon finding that the specified data are in the first set of revisions, obtaining the specified data from the first set of revisions, and upon finding that the specified data are not in the first set of revisions, obtaining the specified data from the dataset.
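Steps (a) and (b) of this method can be sketched as follows. This is an illustrative model only: the `RevisionSet` class, the `read` function, the use of a Python dictionary as the directory, and the block addresses and values are all assumptions introduced here, and concurrency is simulated by interleaving operations in a single loop.

```python
class RevisionSet:
    """A set of revisions with its directory, modeled as a dict keyed by address."""

    def __init__(self):
        self.directory = {}  # address -> revised data

    def record(self, address, data):
        self.directory[address] = data


def read(address, dataset, first_set):
    """Return revised data if the directory lists the address, else dataset data."""
    if address in first_set.directory:
        return first_set.directory[address]
    return dataset[address]


# Step (a): the first set of revisions is processed to form its directory.
dataset = {0: "a0", 1: "b0", 2: "c0"}
first = RevisionSet()
first.record(0, "a1")
first.record(2, "c1")

# Step (b): the second set is collected while the first set is performed
# upon the dataset one revision at a time, with reads interleaved.
second = RevisionSet()
views = []
for address, data in first.directory.items():
    second.record(1, "b2")  # later revision, not yet visible to readers
    views.append([read(a, dataset, first) for a in sorted(dataset)])
    dataset[address] = data  # perform one revision of the first set
    views.append([read(a, dataset, first) for a in sorted(dataset)])

print(views)
```

In this sketch every view taken during step (b) is `['a1', 'b0', 'c1']`, regardless of how far the first set has been applied, and the pending revision in the second set is never visible to a reader.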
In accordance with another aspect, the invention provides a method of read-write access by a first host processor to a dataset in a first data storage system concurrent with uninterrupted read-only access by a second host processor to a consistent state of a copy of the dataset in a second data storage system. Revisions to the dataset in the first data storage system from the read-write access are also made to the copy of the dataset in the second data storage system. The revisions include a first set of revisions followed by a second set of revisions. The dataset is in a consistent state after performing each set of revisions upon the dataset. The method includes: (a) processing the first set of revisions to form a first directory of dataset revisions in the first set of revisions; and then (b) processing the second set of revisions to form a second directory of dataset revisions in the second set of revisions, and concurrently performing the first set of revisions upon the copy of the dataset, and concurrently performing read access on a priority basis to specified data in the dataset by accessing the first directory of dataset revisions to determine whether the specified data are in the first set of revisions, and upon finding that the specified data are in the first set of revisions, obtaining the specified data from the first set of revisions, and upon finding that the specified data are not in the first set of revisions, obtaining the specified data from the copy of the dataset.
In accordance with yet another aspect, the invention provides a data storage system including data storage, and a storage controller responsive to read and write commands for accessing specified data of a dataset in the data storage. Each set of write commands modifies the dataset from one consistent state to another. The storage controller is programmed to respond to each set of write commands by first operating upon revisions of each set of write commands in a write-selected phase and then operating upon the revisions of each set of write commands in a read-selected phase. The storage controller forms a directory of the revisions of each set of write commands in the write-selected phase. The storage controller accesses the directory of the revisions of each set of write commands in the read-selected phase. The storage controller performs the revisions of each set of write commands in the read-selected phase upon the dataset, and concurrently responds to the read commands on a priority basis by accessing the directory of the revisions of each set of write commands in the read-selected phase to obtain specified data from the revisions of each set of write commands in the read-selected phase when the specified data are in the revisions of each set of write commands in the read-selected phase, and when the specified data are not in the revisions of each set of write commands in the read-selected phase, obtaining the specified data from the dataset.
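The progression of each set of write commands from the write-selected phase to the read-selected phase can be sketched as below. The `Controller` class and its `write`, `read`, `integrate`, and `switch` methods are hypothetical names chosen for illustration. The sketch assumes that a switch first finishes performing the current read-selected revisions upon the dataset, and it models the directory together with the revision data as a single dictionary.

```python
class Controller:
    """Toy storage controller with write-selected and read-selected revision sets."""

    def __init__(self, dataset):
        self.dataset = dict(dataset)
        self.write_selected = {}  # directory being formed from incoming writes
        self.read_selected = {}   # directory consulted by reads

    def write(self, address, data):
        # Write-selected phase: record the revision and index it in the directory.
        self.write_selected[address] = data

    def read(self, address):
        # Read-selected phase: the directory lookup takes precedence over the dataset.
        if address in self.read_selected:
            return self.read_selected[address]
        return self.dataset[address]

    def integrate(self):
        # Perform the read-selected revisions upon the dataset.
        for address, data in self.read_selected.items():
            self.dataset[address] = data
        self.read_selected = {}

    def switch(self):
        # Assumed to be invoked when a set of write commands is complete:
        # finish the old read-selected set, then promote the write-selected set.
        self.integrate()
        self.read_selected, self.write_selected = self.write_selected, {}


c = Controller({"x": 1, "y": 2})
c.write("x", 10)
r1 = c.read("x")  # revision is still write-selected, so the old value is read
c.switch()
r2 = c.read("x")  # revision is now read-selected and visible through the directory
c.write("y", 20)
r3 = c.read("y")  # the next set is not yet visible
print(r1, r2, r3)  # prints: 1 10 2
```

Readers in this model see only states that existed at a switch point, so a set of write commands becomes visible all at once rather than one write at a time.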
In accordance with still another aspect, the invention provides a data storage system including data storage, and a storage controller responsive to read and write commands for accessing specified data of a dataset in the data storage. The storage controller is programmed to respond to transaction commit commands by alternately writing to a first volume of the data storage and to a second volume of the data storage sets of revisions made to the dataset by the write commands. Each set of revisions to the dataset includes revisions from a set of transactions defined by the transaction commit commands so that each set of revisions changes the dataset from one consistent state to another. In addition, the storage controller is programmed with a remote mirroring facility for mirroring the first and second volumes to corresponding volumes in a remote data storage system.
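The alternation between the two volumes can be sketched as follows. The `AlternatingVolumes` class is a hypothetical model, not the controller's actual programming. It assumes, for simplicity, that every commit closes the current set of revisions (a set could also span several transactions), that the volume being reused has already been integrated into the dataset, and that remote mirroring is a synchronous copy of each write.

```python
class AlternatingVolumes:
    """Toy model of alternately writing revision sets to two mirrored volumes."""

    def __init__(self):
        self.volumes = [[], []]  # first and second local volumes
        self.remote = [[], []]   # stand-ins for the corresponding remote volumes
        self.active = 0          # index of the volume receiving the current set

    def write(self, address, data):
        self.volumes[self.active].append((address, data))
        self.remote[self.active].append((address, data))  # mirror the write

    def commit(self):
        # Close the current set; the next set goes to the other volume.
        # Assumption: the set previously held there is no longer needed.
        self.active ^= 1
        self.volumes[self.active].clear()
        self.remote[self.active].clear()


log = AlternatingVolumes()
log.write(0, "a")
log.write(1, "b")
log.commit()
log.write(0, "c")
print(log.volumes)  # prints: [[(0, 'a'), (1, 'b')], [(0, 'c')]]
```

Because each volume holds one complete set of revisions, the remote system in this model always has at least one volume whose contents move the dataset from one consistent state to another.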
In accordance with a final aspect, the invention provides a program storage device containing a program for a storage controller of a data storage system. The program is executable by the storage controller for responding to read and write commands for accessing specified data of a dataset in data storage of the data storage system. Each set of write commands modifies the dataset from one consistent state to another. The program is executable by the storage controller for responding to each set of write commands by first operating upon revisions of each set of write commands in a write-selected phase and then operating upon the revisions of each set of write commands in a read-selected phase. The storage controller forms a directory of the revisions of each set of write commands in the write-selected phase, and the storage controller accesses the directory of the revisions of each set of write commands in the read-selected phase. The storage controller performs the revisions of each set of write commands in the read-selected phase upon the dataset, and concurrently responds to the read commands on a priority basis by accessing the directory of the revisions of each set of write commands in the read-selected phase to obtain specified data from the revisions of each set of write commands in the read-selected phase when the specified data are in the revisions of each set of write commands in the read-selected phase, and when the specified data are not in the revisions of each set of write commands in the read-selected phase, obtaining the specified data from the dataset.