1. Field of the Invention
The present invention relates to duplication of data in a storage system such as a disk array subsystem. In particular, it relates to a data duplication system that minimizes the degradation of access performance to a master volume caused by copy-on-write operations when the snapshot technique is used, and that minimizes the influence that accesses to one volume of a pair in a snapshot relation have on the access performance of the other.
2. Description of the Related Art
Recently, as IT (Information Technology) has been increasingly introduced into the social infrastructure, the quantity of data held by companies and individuals has increased dramatically. Further, the quality and value of the data itself have increased owing to the spread of electronic commerce, legislation concerning the keeping of data for authentication, and the like. Under such conditions, the impact of data loss has become widely recognized, and backup techniques for preventing data loss in advance have attracted great attention.
A backup procedure using the mirroring method in a conventional disk array subsystem will be described. First, an application such as a database that is accessing a master volume is stopped to secure a quiescent point of the master volume that is the backup target. Then, a backup volume with the same capacity as the master volume is generated, and the entire data of the master volume is copied to the backup volume. Upon completion of the copy, the stopped application is restarted. At the same time, the data is read out from the backup volume and saved to a backup device such as a tape.
In this procedure, the application is stopped first, and then the backup volume is generated and the data is copied from the master volume to the backup volume. There is also another backup method in which the backup volume is generated and the data is copied from the master volume to the backup volume while the application is running, in order to shorten the time between securing the quiescent point and synchronizing the master volume with the backup volume.
However, both methods require, from the point of securing the quiescent point until the data of the master volume has been completely copied to the backup volume, a backup time that depends on the quantity of data to be backed up. Further, it is often the case that a plurality of backup volumes are generated for the same master volume for purposes other than backup, such as data mining. In such a case, storage capacity amounting to several times that of the master volume is consumed.
In order to avoid these issues with the mirroring method, backup employing the snapshot method has recently come into frequent use (see, for example, Japanese Unexamined Patent Publication 2004-192133). FIG. 17 is a conceptual diagram illustrating the operating principle of a typical snapshot. A description is provided below with reference to this drawing.
A snapshot is a technique for preserving the state (i.e., image) of a snapshot-target volume as of a designated point in time. Specifically, assume that a snapshot is taken of a snapshot-target volume that stores the data shown in FIG. 17A. First, a snapshot duplication volume having a memory capacity equivalent to that of the snapshot-target volume is generated within the storage system, as shown in FIG. 17A.
When the data “EE” is to be written to the storage area holding the data “BB” in the snapshot-target volume, the pre-update data “BB” is stored at the same address in the snapshot duplication volume, as shown in FIG. 17B. The snapshot duplication volume functions as a first-generation snapshot.
If a snapshot of the snapshot-target volume is taken again here, a snapshot duplication volume with the memory capacity equivalent to that of the snapshot-target volume is generated anew within the storage system as shown in FIG. 17B, and the snapshot duplication volume functions as a second-generation snapshot.
When the data “FF” is then to be written to the storage area holding the data “CC” in the snapshot-target volume, the pre-update data “CC” is stored at the same address in the snapshot duplication volume that functions as the second-generation snapshot, as shown in FIG. 17C, while the contents of the snapshot duplication volume that functions as the first-generation snapshot are kept as in FIG. 17B or FIG. 17C.
The above is a brief description of the series-type snapshot, given by referring to FIG. 17. In the case where the parallel-type snapshot is applied, however, the data “CC” is stored in the first-generation snapshot as well as in the second-generation snapshot when the data “FF” is written to the storage area holding the data “CC” in the snapshot-target volume, as shown in FIG. 17C.
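The difference between the series-type and the parallel-type snapshot described above can be illustrated with the following minimal Python sketch. It is illustrative only and is not taken from the cited publication: the class name SnapshotTarget, the function run, the page addresses, and the data "AA" and "DD" are assumptions introduced here.

```python
class SnapshotTarget:
    """Toy snapshot-target volume with copy-on-write generations (hypothetical).

    mode="series":   pre-update data goes only to the newest generation.
    mode="parallel": pre-update data goes to every generation that has not
                     yet saved that address.
    """

    def __init__(self, blocks, mode="series"):
        self.blocks = list(blocks)
        self.mode = mode
        self.generations = []  # one dict per generation: {address: pre-update data}

    def take_snapshot(self):
        # A new generation starts empty; it holds no data until a write occurs.
        self.generations.append({})

    def write(self, address, data):
        if self.generations:
            old = self.blocks[address]
            targets = (self.generations if self.mode == "parallel"
                       else self.generations[-1:])
            for gen in targets:
                # Keep only the first pre-update image saved for an address.
                gen.setdefault(address, old)
        self.blocks[address] = data


def run(mode):
    vol = SnapshotTarget(["AA", "BB", "CC", "DD"], mode)
    vol.take_snapshot()   # first-generation snapshot
    vol.write(1, "EE")    # "BB" is saved before being overwritten
    vol.take_snapshot()   # second-generation snapshot
    vol.write(2, "FF")    # "CC" is saved before being overwritten
    return vol


series = run("series")
parallel = run("parallel")
```

In this sketch, the series-type run leaves "CC" only in the second generation, whereas the parallel-type run stores "CC" in both generations.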
In the volume duplication method utilizing the snapshot technique, it is only necessary to define, immediately after the disk array subsystem receives a snapshot command, a volume for holding only the update data of the snapshot-target volume. Thus, the backup-target volume appears to be duplicated instantly.
Next, by referring to FIG. 11-FIG. 13, there will be described an example of the specific constitution of a conventional data duplication apparatus which employs the snapshot method.
As shown in FIG. 11, a conventional disk array subsystem 100 comprises: a master volume 101; a virtual volume (referred to as “snapshot volume” hereinafter) 102 that has the same capacity as the master volume 101 but actually has no physical capacity; and a volume (referred to as a “common volume” hereinafter) 103, secured in a storage area, for storing the data of the snapshot volume 102. Further, it is provided with a data duplication control device 104 for managing data access to the snapshot volume 102, and an address conversion device 105 for managing the actual storage target of the duplication data.
As shown in FIG. 12, the data duplication control device 104 comprises: a property managing table 121 for managing volume properties of the master volume, the snapshot volume, the common volume, etc.; a volume correspondence managing table 122 that holds the snapshot relations between the volumes; and a difference managing table 123 for managing the difference between the master volume 101 and the snapshot volume 102.
As shown in FIG. 13, the address conversion device 105 comprises a directory 131 that holds actual storage address of the snapshot volume 102, and an allotment managing table 132 for managing the availability of the common volume 103.
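The tables held by the two devices can be pictured, as a rough sketch only, with the following Python data classes. The class and field names, and the choice of dictionaries and lists as containers, are assumptions made for illustration; FIG. 12 and FIG. 13 do not prescribe any particular layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataDuplicationControl:
    """Hypothetical stand-in for the data duplication control device 104."""
    # Property managing table 121: volume -> "master", "snapshot" or "common"
    properties: Dict[str, str] = field(default_factory=dict)
    # Volume correspondence managing table 122: volume -> its snapshot partner
    partners: Dict[str, str] = field(default_factory=dict)
    # Difference managing table 123: snapshot volume -> per-page flag (1 = holds data)
    difference: Dict[str, List[int]] = field(default_factory=dict)


@dataclass
class AddressConversion:
    """Hypothetical stand-in for the address conversion device 105."""
    # Directory 131: snapshot volume -> per-page address in the common volume
    directory: Dict[str, List[Optional[int]]] = field(default_factory=dict)
    # Allotment managing table 132: common volume -> per-area flag (1 = in use)
    allotment: Dict[str, List[int]] = field(default_factory=dict)


control = DataDuplicationControl()
conversion = AddressConversion()
```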
Next, action of the conventional disk array subsystem shown in FIG. 11 will be described by referring to FIG. 14-FIG. 16.
As the procedure for duplicating the volume, in step 140 of FIG. 14, first, the disk array subsystem 100 shown in FIG. 11 generates the common volume 103 within the storage area and, in accordance with this, the data duplication control device 104 initializes the property managing table 121 and the allotment managing table 132. That is, the data duplication control device 104 shown in FIG. 12 sets the common property in the property that corresponds to “LV2” of “LV No.” in the table 121. Further, the address conversion device 105 shown in FIG. 13 sets the value “0” in LV2 of the allotment managing table 132 for indicating that the common volume 103 is not used.
Subsequently, in step 141 of FIG. 14, the disk array subsystem 100 generates the master volume 101 and the snapshot volume 102 with the same memory capacity as that of the master volume 101 within the storage area. FIG. 11 shows the state within the storage area of the disk array subsystem 100 at the point where the processing of the step 140 and the step 141 shown in FIG. 14 is completed.
When the processing of the disk array subsystem 100 has proceeded to the step 141 shown in FIG. 14 and the disk array subsystem 100 receives a snapshot command in step 142 of FIG. 14, the data duplication control device 104 initializes the property managing table 121, the volume correspondence managing table 122 and the difference managing table 123 according to the command (step 143). That is, as shown in FIG. 12, the data duplication control device 104 sets the master property in “LV0” and the snapshot property in “LV1” of the property managing table 121. To record in the volume correspondence managing table 122 that “LV0” and “LV1” are in the snapshot relation, the data duplication control device 104 sets “LV1” as the snapshot of “LV0” and “LV0” as the snapshot of “LV1”, as shown in FIG. 12. Further, as shown in FIG. 12, the data duplication control device 104 sets the value “0” in “LV1” of the difference managing table 123, which corresponds to the snapshot volume 102 in FIG. 11, to indicate that the snapshot volume 102 does not hold data.
Further, in step 143 of FIG. 14, the address conversion device 105 sets a “null” value in the directory 131, as shown in FIG. 13, to indicate that no storage space of the common volume 103 shown in FIG. 11 is allotted to the snapshot volume 102.
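The initialization in steps 140 to 143 can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function name initialize, the dictionary keys, the page count of 4, and the use of None for the “null” value are all hypothetical, while the volume names “LV0”, “LV1” and “LV2” follow the description above.

```python
PAGES = 4  # assumed number of pages per volume (not specified in the description)


def initialize(pages=PAGES):
    state = {
        "property": {},        # property managing table 121
        "correspondence": {},  # volume correspondence managing table 122
        "difference": {},      # difference managing table 123
        "directory": {},       # directory 131
        "allotment": {},       # allotment managing table 132
    }

    # Step 140: generate the common volume LV2 and mark every area as unused.
    state["property"]["LV2"] = "common"
    state["allotment"]["LV2"] = [0] * pages

    # Step 141: generate the master volume LV0 and the snapshot volume LV1.
    # (No table update is described for this step.)

    # Steps 142-143: on the snapshot command, set the volume properties ...
    state["property"]["LV0"] = "master"
    state["property"]["LV1"] = "snapshot"
    # ... record the snapshot relation in both directions ...
    state["correspondence"]["LV0"] = "LV1"
    state["correspondence"]["LV1"] = "LV0"
    # ... mark the snapshot volume as holding no data ...
    state["difference"]["LV1"] = [0] * pages
    # ... and leave every directory entry unallotted ("null").
    state["directory"]["LV1"] = [None] * pages
    return state


state = initialize()
```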
Now, by referring to FIG. 15, there will be described the procedure of the processing when the disk array subsystem 100 receives a write command under the state where the processing has proceeded to the step 143 of FIG. 14.
When the disk array subsystem 100 receives the write command in FIG. 15, the data duplication control device 104 shown in FIG. 12 refers to the property managing table 121 in step 150 of FIG. 15 according to the write command. Then, in step 151 of FIG. 15, the data duplication control device 104 judges whether the received command is a command for the master volume 101 or a command for the snapshot volume 102.
When the command is judged to be a write command for the snapshot volume 102, the data duplication control device 104 ends the processing without writing the data. This processing corresponds to an operation form, such as backup, in which the snapshot volume 102 keeps the duplicate of the master volume as of the point at which the snapshot command was received. In other operation forms, data may be written to the snapshot volume 102 as requested by the write command.
When it is judged that the command is a write command for the master volume 101, the data duplication control device 104, in step 152 of FIG. 15, identifies the snapshot volume 102 paired with the master volume 101 by referring to the volume correspondence managing table 122.
After specifying the snapshot volume 102, the data duplication control device 104 judges in step 153 of FIG. 15 whether or not there is data in the writing request address of the specified snapshot volume 102 based on the difference managing table 123 shown in FIG. 12.
When it is judged in step 154 that there is data in the snapshot volume 102, the data duplication control device 104 advances the processing to step 159 of FIG. 15 for writing the data to the master volume 101, and ends the processing.
When it is judged in step 154 that there is no data in the snapshot volume 102, the data duplication control device 104 outputs a signal of the judgment result to the address conversion device 105. Upon receiving the signal from the data duplication control device 104, the address conversion device 105 shown in FIG. 13 searches the allotment managing table 132 shown in FIG. 13B, and determines the area to be used this time from among the unused areas of the common volume 103 (step 155 of FIG. 15).
In step 156 of FIG. 15, the address conversion device 105 copies the existing data currently present in the writing request address of the master volume 101 to the unused area of the common volume 103 shown in FIG. 11. Then, in step 157 of FIG. 15, the address conversion device 105 sets “1”, the value indicating that it is being used, in a corresponding column of the allotment managing table 132 as shown in FIG. 13, and sets the address of the unused area in the corresponding column of the directory 131. Subsequently, upon receiving the information from the address conversion device 105, the data duplication control device 104 sets “1”, the value indicating that there is data, in the corresponding column of the difference managing table 123 shown in FIG. 12 in step 158 of FIG. 15, writes the data to the master volume 101 (step 159), and ends the processing.
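The write processing of steps 150 to 159 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the actual device logic: the functions new_state and handle_write, the dictionary layout, the return values, and the sample data are hypothetical, and the search for an unused area simply takes the first free entry.

```python
MASTER, SNAPSHOT, COMMON = "LV0", "LV1", "LV2"  # volume names used in the description


def new_state(master_data):
    """Tables as they stand after step 143 (hypothetical layout)."""
    pages = len(master_data)
    return {
        "property": {MASTER: "master", SNAPSHOT: "snapshot", COMMON: "common"},
        "correspondence": {MASTER: SNAPSHOT, SNAPSHOT: MASTER},
        "difference": {SNAPSHOT: [0] * pages},
        "directory": {SNAPSHOT: [None] * pages},
        "allotment": {COMMON: [0] * pages},
        "data": {MASTER: list(master_data), COMMON: [None] * pages},
    }


def handle_write(state, volume, page, data):
    # Steps 150-151: consult the property table to classify the command.
    if state["property"][volume] == "snapshot":
        return "ignored"  # backup-style operation: the snapshot is not overwritten
    # Step 152: identify the snapshot volume paired with the master volume.
    snap = state["correspondence"][volume]
    # Steps 153-154: does the snapshot already hold data for this page?
    if state["difference"][snap][page] == 0:
        # Step 155: choose an unused area (raises ValueError if none is left).
        area = state["allotment"][COMMON].index(0)
        # Step 156: copy the pre-update data of the master volume to that area.
        state["data"][COMMON][area] = state["data"][volume][page]
        # Step 157: mark the area as used and record its address in the directory.
        state["allotment"][COMMON][area] = 1
        state["directory"][snap][page] = area
        # Step 158: record that the snapshot now holds data for this page.
        state["difference"][snap][page] = 1
    # Step 159: write the new data to the master volume.
    state["data"][volume][page] = data
    return "written"


state = new_state(["AA", "BB", "CC", "DD"])
first = handle_write(state, MASTER, 1, "EE")    # first write: the copy occurs
second = handle_write(state, MASTER, 1, "GG")   # second write: no copy is needed
blocked = handle_write(state, SNAPSHOT, 0, "ZZ")
```

In this sketch, only the first write to a page incurs the extra copy to the common volume; later writes to the same page go straight to step 159.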
Next, by referring to FIG. 16, there will be described the processing procedure in the case where the disk array subsystem 100 receives a read command.
Referring to FIG. 16, when the disk array subsystem 100 receives a read command, the data duplication control device 104 shown in FIG. 12 refers to the property managing table 121 shown in FIG. 12 in step 160 of FIG. 16 according to the read command.
In step 161 of FIG. 16, the data duplication control device 104 judges whether the received command is a command for the master volume 101 or a command for the snapshot volume 102.
When it is judged that the command is for the master volume 101, the data duplication control device 104 shifts the processing to step 166 for reading out the data from the master volume 101, and ends the processing.
When the data duplication control device 104 judges in the step 161 of FIG. 16 that the received command is for the snapshot volume 102, the data duplication control device 104 refers to the difference managing table 123 shown in FIG. 12 in step 162 of FIG. 16, and judges whether or not there is data in the reading-out requested address of the snapshot volume 102 in step 163 of FIG. 16.
When the data duplication control device 104 judges that there is data in the snapshot volume 102, the address conversion device 105, upon receiving the judgment result from the data duplication control device 104, refers to the directory 131 of FIG. 13 in step 165 of FIG. 16 to obtain the address on the common volume 103 at which the data is stored. Upon receiving the address information from the address conversion device 105, the data duplication control device 104 reads out the data from the common volume 103, and ends the processing.
When the data duplication control device 104 judges that there is no data in the snapshot volume 102, the data duplication control device 104 refers to the volume correspondence managing table 122 (step 164 of FIG. 16), reads out the data from the master volume 101 paired with the snapshot volume 102 (step 166 of FIG. 16), and ends the processing.
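The read processing of steps 160 to 166 can likewise be sketched in Python. The function handle_read, the dictionary layout, and the sample table contents (a state in which only page 1 of the master volume has been updated from "BB" to "EE") are assumptions made for illustration.

```python
MASTER, SNAPSHOT, COMMON = "LV0", "LV1", "LV2"  # volume names used in the description

# Hypothetical table contents after page 1 of the master volume has been
# updated from "BB" to "EE" and the old data has been saved in the common volume.
state = {
    "property": {MASTER: "master", SNAPSHOT: "snapshot", COMMON: "common"},
    "correspondence": {MASTER: SNAPSHOT, SNAPSHOT: MASTER},
    "difference": {SNAPSHOT: [0, 1, 0, 0]},
    "directory": {SNAPSHOT: [None, 0, None, None]},
    "data": {MASTER: ["AA", "EE", "CC", "DD"], COMMON: ["BB", None, None, None]},
}


def handle_read(state, volume, page):
    # Steps 160-161: consult the property table to classify the command.
    if state["property"][volume] == "master":
        # Step 166: read directly from the master volume.
        return state["data"][volume][page]
    # Steps 162-163: does the snapshot hold data for the requested page?
    if state["difference"][volume][page] == 1:
        # Step 165: look up the address in the common volume and read from it.
        area = state["directory"][volume][page]
        return state["data"][COMMON][area]
    # Step 164: find the paired master volume.
    master = state["correspondence"][volume]
    # Step 166: the page was never updated, so read it from the master volume.
    return state["data"][master][page]


from_master = handle_read(state, MASTER, 1)
from_common = handle_read(state, SNAPSHOT, 1)
via_master = handle_read(state, SNAPSHOT, 2)
```

The last call shows that a read of a page that has not been updated is served by the master volume even though the command was issued to the snapshot volume.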
As described above, in the snapshot method, when a write command is issued to the master volume, it is judged, prior to the write processing on the master volume, whether or not the snapshot volume holds data for the page designated by the write command. When the data is not stored, the data of that page must first be copied to the snapshot volume, which degrades the access performance to the master volume.
Furthermore, systems have recently been introduced in which the snapshot volume is built using a magnetic disk device of the ATA (AT attachment) standard in order to perform snapshots at low cost. In such a system, it is anticipated that the access performance to the master volume will be severely degraded.
Furthermore, as described above, the snapshot volume holds only the pages of the master volume that have been updated after the snapshot command was received. Thus, when a read command that includes pages other than the updated pages is issued to the snapshot volume, those pages are read out from the master volume, which degrades the access performance to the master volume.