With the development and spread of computers in recent years, various kinds of information have been digitized. Devices for storing such digital data include storage devices such as magnetic tapes and magnetic disks. Because the data to be stored grows day by day and reaches a huge volume, a mass storage system is required. Moreover, reliability must be maintained while the cost spent on storage devices is reduced. In addition, it must be easy to retrieve the data later. Thus, a storage system is expected to increase its storage capacity and performance automatically, to eliminate duplicate storage so as to reduce storage cost, and to operate with high redundancy.
Under such circumstances, content-addressable storage systems have been developed in recent years, as shown in Patent Document 1. In such a content-addressable storage system, data is distributed and stored across a plurality of storage devices, and the location where the data is stored is identified by a unique content address determined by the content of the data. Some content-addressable storage systems divide predetermined data into a plurality of fragments and store the fragments, together with fragments that serve as redundant data, in a plurality of storage devices, respectively.
By designating a content address, the content-addressable storage system described above can retrieve the data, namely the fragments stored in the storage location identified by that content address, and later restore the predetermined data as it was before division by using those fragments.
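The division into fragments with redundant data, and the later restoration, can be illustrated with a minimal sketch. The redundancy scheme is not specified above, so a single XOR parity fragment is assumed here purely for illustration; the function names split_with_parity and restore and the fragment count k are hypothetical.

```python
def split_with_parity(data: bytes, k: int):
    """Split data into k equal-size fragments plus one XOR parity fragment."""
    size = -(-len(data) // k)                  # ceiling division
    padded = data.ljust(size * k, b"\0")       # pad so all fragments match in size
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(size)                       # starts as all zero bytes
    for frag in fragments:
        parity = bytes(x ^ y for x, y in zip(parity, frag))
    return fragments + [parity], len(data)


def restore(fragments, original_length):
    """Rebuild the data when at most one fragment is missing (given as None)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can recover only one lost fragment")
    if missing:
        size = len(next(f for f in fragments if f is not None))
        rebuilt = bytes(size)
        for f in fragments:
            if f is not None:
                rebuilt = bytes(x ^ y for x, y in zip(rebuilt, f))
        fragments = list(fragments)
        fragments[missing[0]] = rebuilt
    # Drop the parity fragment and the padding added at split time.
    return b"".join(fragments[:-1])[:original_length]


data = b"content-addressable storage"
fragments, length = split_with_parity(data, k=4)
fragments[2] = None          # simulate the loss of one storage device
recovered = restore(fragments, length)
```

With this assumed scheme, the data can be restored when any one of the five fragments is lost; a system requiring higher redundancy would use a stronger code.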
The content address is generated from a value that is unique to the content of the data, for example, the hash value of the data. Thus, when duplicate data exists, data of the same content can be acquired by referring to the data in the same storage location. It is therefore unnecessary to store the duplicate data separately, so duplicate recording can be eliminated and the volume of data reduced.
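As a minimal sketch of this idea, the following toy store derives the content address from a hash of the data. The class name ContentStore and the choice of SHA-256 are assumptions made for illustration and are not taken from Patent Document 1.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store (illustrative only)."""

    def __init__(self):
        self.blocks = {}   # content address -> stored bytes
        self.writes = 0    # number of put() calls, duplicates included

    def put(self, block: bytes) -> str:
        self.writes += 1
        # Assumption: SHA-256 of the content serves as the content address.
        address = hashlib.sha256(block).hexdigest()
        # A duplicate block maps to the same address and is not stored again.
        self.blocks.setdefault(address, block)
        return address

    def get(self, address: str) -> bytes:
        return self.blocks[address]


store = ContentStore()
a1 = store.put(b"hello world")
a2 = store.put(b"hello world")   # duplicate content
a3 = store.put(b"other data")
```

Here the two writes of identical content return the same address, and only two blocks are held by the store although three writes were requested.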
In particular, a storage system having the function of eliminating duplicate storage as described above compresses data to be written, such as a file, by dividing it into a plurality of block data of a predetermined volume, and then writes the block data into storage devices. By thus eliminating duplicate storage in units of the block data obtained by dividing a file, the duplication rate is increased and the volume of data is reduced.
For the best deduplication in such a storage system, it would be ideal to compare all block data of all files and find the most duplicated block data. However, such a process requires an enormous amount of calculation. Therefore, a variable data division method using a fingerprint is employed, for example, to divide the data to be stored. When data of similar content is stored, this method calculates fingerprint values from the beginning of the data and divides the data at each place having a specified fingerprint value, so that the same place in the data becomes a division point. Dividing data by fingerprint value has the advantage of requiring little calculation, but has the problem that, if a division point is improper, the deduplication rate at and after that point becomes low.
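The variable division method can be sketched as follows. This is only an illustration under stated assumptions: a CRC32 computed over a fixed trailing window stands in for the fingerprint function (practical systems typically use a rolling hash such as a Rabin fingerprint), and the parameters WINDOW, DIVISOR and TARGET are hypothetical.

```python
import random
import zlib

WINDOW = 16    # bytes covered by each fingerprint (assumed)
DIVISOR = 64   # a division point is expected about every 64 bytes (assumed)
TARGET = 0     # the "specified fingerprint value" (assumed)


def split_points(data: bytes) -> list:
    """Return the offsets at which the data is divided."""
    points = []
    for end in range(WINDOW, len(data)):
        fingerprint = zlib.crc32(data[end - WINDOW:end])
        if fingerprint % DIVISOR == TARGET:
            points.append(end)
    return points


def chunk(data: bytes) -> list:
    """Divide data into block data at the fingerprint-selected offsets."""
    bounds = [0] + split_points(data) + [len(data)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


# Similar content: the second input is the first with three bytes prepended.
rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(4096))
shifted = b"XYZ" + original

# Blocks found in both inputs; these are the candidates for deduplication.
common = set(chunk(original)) & set(chunk(shifted))
```

Because each division point depends only on the bytes in its own window, every division point of the original data reappears three bytes later in the shifted data, so the blocks after the first division point are identical in both inputs.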
With regard to the timing of data division, there are two methods. One is the post-process method, which first writes all data onto a disk and then executes data division from the beginning of the file. The other is the inline method, which executes data division in real time while writing data onto a disk. In the post-process method, data division is executed after all data has been written, so data division can be executed stably. However, the post-process method has the problem that the load on the disk is high, because the number of disk I/O operations is three times that of the inline method. Therefore, many products employ the inline method.
[Patent Document 1] Japanese Unexamined Patent Application Publication No. JP-A 2005-235171
However, a storage system that divides data by the inline method described above has the problem that the deduplication rate becomes low, because the division points of data at the time of writing onto a disk change when the sequence of data writing from a client changes. The data writing sequence changes, for example, when the order in which a client sends the data of a file to the storage changes, or when a commit occurs.
In particular, when a commit occurs during data backup, the data transmitted from the client to the storage system must be written onto physical disks at that moment. The beginning and the end of the data existing in the data buffer when the commit occurs are therefore set as data division points, and the data is divided into block data and written. Thus, the data is divided regardless of fingerprint values. Consequently, data division points can differ even when data of the same content is backed up, so it becomes more likely that the contents of the divided block data are not judged to be identical, and the deduplication rate falls.
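The effect of forced division at a commit can be illustrated with a small simulation. It is a sketch under stated assumptions, not a model of any particular product: a windowed CRC32 stands in for the fingerprint, the parameters and the commit offsets are hypothetical, and each commit is modeled simply as dividing the buffered data on its own.

```python
import random
import zlib

WINDOW, DIVISOR = 16, 64   # illustrative fingerprint parameters (assumed)


def chunk(data):
    """Divide data at offsets whose trailing-window fingerprint matches."""
    bounds = [0]
    for end in range(WINDOW, len(data)):
        if zlib.crc32(data[end - WINDOW:end]) % DIVISOR == 0:
            bounds.append(end)
    bounds.append(len(data))
    return [data[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def inline_write(data, commit_offsets):
    """Divide each buffered run separately, as if a commit flushed the
    data buffer at every offset in commit_offsets."""
    blocks, start = [], 0
    for offset in sorted(commit_offsets) + [len(data)]:
        blocks.extend(chunk(data[start:offset]))   # forced division at offset
        start = offset
    return blocks


def duplicate_ratio(stored, incoming):
    """Fraction of incoming blocks whose content is already stored."""
    known = set(stored)
    return sum(1 for block in incoming if block in known) / len(incoming)


rng = random.Random(1)
file_data = bytes(rng.randrange(256) for _ in range(4096))

# The same file is backed up twice, with commits occurring at different times.
first_backup = inline_write(file_data, [1000, 2000, 3000])
second_backup = inline_write(file_data, [700, 1900, 3300])

ratio_same_commits = duplicate_ratio(first_backup, first_backup)
ratio_moved_commits = duplicate_ratio(first_backup, second_backup)
```

When the commits fall at the same offsets, every block of the second write is a duplicate; when they fall at different offsets, the blocks adjoining each forced division point no longer match, and the ratio drops below one even though the file content is identical.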
Referring to FIGS. 1A to 2B, a specific example of the abovementioned problem will be described below. First, the example in FIGS. 1A and 1B shows the operation in a case where, while a client 500 is backing up a file F into a storage 400 (see arrow B), a commit occurs while data F100 exists in a data buffer 410, as shown in FIG. 1A. When the commit occurs (see arrow C), the data F100 existing in the data buffer 410 at that moment is divided into block data F101 and block data F102 by an inline data division process 420 (see arrow D), and the block data F101 and the block data F102 are then written onto physical disks 430 (see arrow R).
After that, as shown in FIG. 1B, when a commit occurs again while data F200 following the data F100 exists in the data buffer 410 (see arrow C), the data F200 is divided into block data F201 and block data F202 by the inline data division process 420 (see arrow D), and the block data F201 and the block data F202 are then written onto the physical disks 430 (see arrow R).
In the above case, the division point P of the block data F102 is not a division point properly set by a fingerprint value but a point where the data was forcibly divided. Therefore, the block data F102, F201 and F202 are probably not data divided at optimum fingerprint values.
Further, the example in FIGS. 2A and 2B shows the operation in a case where, while the client 500 is backing up the file F into the storage 400 (see arrow B), a commit occurs while data F100 and data F200, which does not follow the data F100 but is separated from it, exist in the data buffer 410, as shown in FIG. 2A. Such a case may occur when data is transmitted and received between a client and the storage by using a protocol (e.g., NFS) that does not guarantee that file data is written in offset order at the time of file backup.
When a commit occurs in the abovementioned state (see arrow C), the data F100 and the data F200 existing in the data buffer 410 at that moment are divided into block data F101 and F102 and block data F201 to F204, respectively, by the inline data division process 420 (see arrow D), and the block data F101 and F102 and the block data F201 to F204 are then written onto the physical disks 430, respectively (see arrow R).
After that, as shown in FIG. 2B, when a commit occurs again while data F300 located between the data F100 and the data F200 exists in the data buffer 410 (see arrow C), the data F300 is divided into block data F301 to F303 by the inline data division process 420 (see arrow D), and the block data F301 to F303 are then written onto the physical disks 430 (see arrow R).
In the above case, the division point P of the block data F102 is not a division point properly set by a fingerprint value or the like but a point where the data was forcibly divided. Therefore, the block data F102, F301 to F303, and F201 to F204 are probably not data divided at optimum fingerprint values.
As described above, a storage system having the function of eliminating duplicate storage has the problem that the deduplication rate decreases because the division points of data change even when data of the same content is backed up.