The present invention relates to a data management system and method in a storage subsystem and a computer system. In particular, the invention relates to a snapshot management method of an external storage unit necessary for the backup of the data of the external storage unit for which high availability is required. It also relates to a computer system having a storage subsystem controlled by the above management method.
Generally, in computer systems, backup copies of data are made periodically on another recording medium, such as magnetic tape, to guard against loss of the data recorded in storage units caused by equipment failure, natural disaster, a software error, incorrect operation, etc. If the data in the storage unit is lost, the original data can be recovered using the backup data. If the data is updated while it is being copied to acquire a backup, the copied data becomes inconsistent. Therefore, it must be ensured that the data is not updated during the copying operation.
To avoid corrupting the data being copied for the backup operation, it suffices to suspend all programs that access the data, other than the backup program. In a system for which high availability is required, however, programs cannot be suspended for a long time. Therefore, a mechanism is necessary that allows the data to be updated during backup operations while preserving the state of the data as of the start of the backup.
The fixed image of data at a certain point in time is known as a snapshot. A mechanism which allows the data to be updated while maintaining the snapshot is called a snapshot management method. Taking a snapshot using the snapshot management method is called acquisition of a snapshot. The data from which the snapshot is taken is called the original data. Conventional snapshot management methods are realized by storing the pre-updated data, by duplicating data using the computer, or by duplicating data using an external storage unit. These approaches are as follows.
(1) Method of Storing the Pre-updated Data
U.S. Pat. No. 5,649,152 describes one method of snapshot management. In this technique, if the original data is updated after the snapshot is acquired, the memory contents before the update are stored in a different memory area. The snapshot is then logically accessed as different data independent of the original data. After the snapshot is acquired, the snapshot shares the memory area with the original data for the part of the original data that has not been updated. For the part that has been updated, the contents of the memory before the update are stored in another area.
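The behavior described in (1) can be illustrated with a minimal sketch. It is not taken from the cited patent; the class name `CopyOnWriteVolume`, its methods, and the block-list representation are hypothetical and assume a volume made of fixed-size blocks addressed by index.

```python
class CopyOnWriteVolume:
    """Original data plus one snapshot that shares unmodified blocks.

    Hypothetical illustration; names and layout are assumptions.
    """

    def __init__(self, blocks):
        self.blocks = list(blocks)  # memory area holding the original data
        self.saved = None           # separate area for pre-update contents

    def take_snapshot(self):
        # Nothing is copied at acquisition time: the snapshot starts out
        # sharing every block with the original data.
        self.saved = {}

    def write(self, index, data):
        # On the first update of a block after the snapshot, keep the old
        # contents in the separate area before overwriting.
        if self.saved is not None and index not in self.saved:
            self.saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, index):
        if self.saved is None:
            raise RuntimeError("no snapshot has been acquired")
        # Updated blocks are read from the saved area; all other blocks
        # are read from the area shared with the original data.
        return self.saved.get(index, self.blocks[index])


vol = CopyOnWriteVolume(["a", "b", "c"])
vol.take_snapshot()
vol.write(1, "B")
vol.write(1, "B2")  # second update: the saved pre-update copy is kept as-is
snapshot_view = [vol.read_snapshot(i) for i in range(3)]
```

In this sketch, reads of non-updated snapshot blocks land on the same area as reads of the original data, which is the sharing the method relies on.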
(2) Method of Duplicating Data by a Computer
In this method, a program on the computer stores all the data in two areas, thereby duplicating (mirroring) the data. When acquiring the snapshot, the program stops the duplicating operation, separates the two memory areas into independent regions, and provides one area as the original data and the other as the snapshot. Such a method of snapshot management is disclosed, for example, by U.S. Pat. No. 5,051,887.
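As a rough illustration of (2), the following sketch mirrors every write to two areas and splits them when a snapshot is acquired. It is not the implementation of the cited patent; `MirroredVolume`, its fields, and the `writes_issued` counter are hypothetical names introduced here only to make the write traffic visible.

```python
class MirroredVolume:
    """Host-side mirroring with a split on snapshot acquisition.

    Hypothetical illustration; names are assumptions.
    """

    def __init__(self, blocks):
        self.primary = list(blocks)    # serves as the original data
        self.secondary = list(blocks)  # becomes the snapshot after the split
        self.mirroring = True
        self.writes_issued = 0         # write operations sent by the host

    def write(self, index, data):
        self.primary[index] = data
        self.writes_issued += 1
        if self.mirroring:
            # While duplicated, every update is written a second time.
            self.secondary[index] = data
            self.writes_issued += 1

    def take_snapshot(self):
        # Stop duplicating; the two areas become independent regions.
        self.mirroring = False
        return self.secondary


mv = MirroredVolume(["a", "b"])
mv.write(0, "A")            # mirrored: two write operations
snap = mv.take_snapshot()
mv.write(1, "B")            # after the split: one write operation
```

The counter shows that each update costs two write operations while the areas are duplicated and one after they are separated.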
(3) Method by Means of Duplicating Data by an External Storage Unit
As disclosed in U.S. Pat. No. 5,845,295, for example, snapshot management is executed using the procedure of (2) above inside the external storage unit instead of using the program in the computer. The entire function for managing the snapshot is provided by the snapshot management program inside the external storage unit.
In the prior art that stores pre-updated data, accesses to the snapshot and accesses to the original data both refer to the same memory area for the part of the original data that has not been updated. Therefore, the accesses are concentrated on specific recording media, and the input/output performance of the disk drive is diminished. In this technology, accesses to both the original data and the snapshot are made via the snapshot management program. Therefore, during the backup, the load on the computer executing the snapshot management program increases. This may reduce the execution speed of other programs, such as database programs. When a large amount of data is backed up, the performance remains degraded for a long time, until the backup operation is completed.
In the prior art in which the computer duplicates the data, the snapshot management program writes data to two logical storage units. Thus, twice as many write operations are required as in a system without snapshot management. As a result, the load on the CPU of the computer executing the data writing, the amount of data communicated through the communication paths connecting the computer and the external storage units, and the load on the disk controller of the external storage units all increase. For this reason, each application program executes more slowly than in a system without snapshot management. The deterioration of performance is conspicuous in processes that update a large amount of data, for example, replication of a database.
In the prior art in which the external storage unit duplicates the data, all the processing necessary for snapshot management is implemented in the external storage unit, so the control program of the external storage unit becomes complicated. Consequently, the control program takes longer to develop and is harder to debug, and its cost rises.
The present invention solves such problems of the prior art and provides a snapshot management method which prevents concentration of the load on specific memory media of the external storage unit and prevents an increase of the load on the CPU in the mode in which the snapshot of the external storage has been acquired. Furthermore, it alleviates the load on the CPU of the computer, the amount of communication through the communication paths, and the load on the external storage unit in the mode in which the snapshot of the external storage has not been acquired.
Another benefit of the present invention is to provide a computer system which is able to execute snapshot management of the external storage device using an inexpensive external storage unit having comparatively simple functions.
The present invention provides a method for managing snapshot data in a computer system having a computer and a storage unit system connected to it, and a computer system to which the method is applied. The storage unit system is equipped with a first and a second storage unit which are to be duplicated. In the state in which they are duplicated, when a data update request is issued to the first storage unit, the storage unit system also writes the same updated data to the second storage unit. When acquisition of a snapshot is requested by the computer, the storage unit system stops writing to the second storage unit the data that is written to the first storage unit. While the snapshot is held, whenever data is written to the first storage unit, the computer records the write location as differential information. When the contents of the first storage unit and the second storage unit are to be made the same again, the data written to the first storage unit after the acquisition of the snapshot was requested is written to the second storage unit according to the differential information held in the computer. In one form of the present invention, the processing to make the contents of the first storage unit and those of the second storage unit the same is executed in such a way that the computer reads the data from the first storage unit according to the differential information, and then instructs the storage unit system to write the data to the second storage unit. In another form of the present invention, this processing is executed in such a way that the computer obtains, from the differential information, the storage locations of the data to be made the same, and then instructs the storage unit system to copy the data, specifying those storage locations.
In response to the copying instruction, the storage unit system reads out the data from the specified storage locations in the first storage unit and copies the data to the second storage unit.
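The division of work summarized above can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the classes `StorageSystem` and `Host`, all method names, and the use of a set of block indices as the differential information are hypothetical, and writes arriving while the contents are being made the same are not handled.

```python
class StorageSystem:
    """Hypothetical storage unit system holding two storage units."""

    def __init__(self, n_blocks):
        self.first = [b""] * n_blocks
        self.second = [b""] * n_blocks
        self.duplicating = True

    def write_first(self, index, data):
        self.first[index] = data
        if self.duplicating:
            # Duplicated state: the storage system itself mirrors the write.
            self.second[index] = data

    def read_first(self, index):
        return self.first[index]

    def write_second(self, index, data):
        self.second[index] = data

    def split(self):
        self.duplicating = False

    def resume(self):
        self.duplicating = True

    def copy_blocks(self, indices):
        # Copy instruction: data moves inside the storage system only.
        for i in indices:
            self.second[i] = self.first[i]


class Host:
    """Hypothetical computer that keeps the differential information."""

    def __init__(self, storage):
        self.storage = storage
        self.snapshot_active = False
        self.dirty = set()  # write locations recorded after the snapshot

    def write(self, index, data):
        if self.snapshot_active:
            self.dirty.add(index)
        self.storage.write_first(index, data)

    def acquire_snapshot(self):
        self.storage.split()
        self.snapshot_active = True
        self.dirty.clear()

    def resync_via_host(self):
        # First form: the computer reads each changed block and writes it.
        for i in sorted(self.dirty):
            self.storage.write_second(i, self.storage.read_first(i))
        self._finish()

    def resync_via_copy_command(self):
        # Second form: the computer sends only the storage locations.
        self.storage.copy_blocks(sorted(self.dirty))
        self._finish()

    def _finish(self):
        self.dirty.clear()
        self.snapshot_active = False
        self.storage.resume()


storage = StorageSystem(4)
host = Host(storage)
host.write(0, b"x")                   # mirrored by the storage system
host.acquire_snapshot()
host.write(2, b"y")                   # recorded as differential information
snapshot_view = list(storage.second)  # still the image at acquisition time
dirty_before = set(host.dirty)
host.resync_via_copy_command()
```

In the sketch, the computer issues a single write per update in both modes, and only the blocks named in the differential information are transferred when the two storage units are made the same.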