In recent years, deduplication technology has been widely used. In general, deduplication technology reduces capacity consumption by retaining only one of a plurality of duplicated data items and deleting the others. Comparing every data item against every other data item to check for duplicates requires a large amount of calculation. Thus, it is a common practice to calculate a representative value, such as a hash value, for each data item by using a hash function, and to perform the comparison process only between data items whose representative values match. The method of calculating the representative value is not limited to a method using a hash function, and any calculation method can be employed as long as the values calculated from duplicated data are always identical. The representative value, such as a hash value, used for the deduplication technology is hereinafter referred to as a “fingerprint”. Fingerprints are registered in management information such as a table.
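The fingerprint-then-compare flow described above can be illustrated with a minimal sketch. It is not taken from any particular storage system: the class name `DedupStore`, the in-memory dictionary standing in for the fingerprint management information, and the choice of a SHA-256 digest truncated to 128 bits as the fingerprint are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory store that deduplicates data items by fingerprint."""

    def __init__(self):
        self.blocks = []             # unique data items actually stored
        self.fingerprint_table = {}  # fingerprint -> indices into self.blocks

    @staticmethod
    def fingerprint(data: bytes) -> bytes:
        # Assumption: a 128-bit fingerprint taken from the first 16 bytes of SHA-256.
        return hashlib.sha256(data).digest()[:16]

    def write(self, data: bytes) -> int:
        """Store data unless an identical item exists; return the index of the stored item."""
        fp = self.fingerprint(data)
        # Full comparison is performed only against items whose fingerprints match.
        for idx in self.fingerprint_table.get(fp, []):
            if self.blocks[idx] == data:
                return idx  # duplicate found: no additional capacity is consumed
        # No duplicate: store the data and register its fingerprint.
        self.blocks.append(data)
        idx = len(self.blocks) - 1
        self.fingerprint_table.setdefault(fp, []).append(idx)
        return idx


store = DedupStore()
first = store.write(b"x" * 4096)
other = store.write(b"y" * 4096)
again = store.write(b"x" * 4096)  # same content as the first write
```

In this sketch, the third write returns the index of the first item and only two items are held in `store.blocks`. The byte-for-byte comparison after a fingerprint match is kept because identical fingerprints do not by themselves guarantee identical data.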
In general, the number of fingerprints held by a storage system increases along with an increase in storage capacity. When the storage system holds many fingerprints, the performance of the storage system decreases. This is because the fingerprint management information in which the fingerprints are registered becomes large, the range over which fingerprints must be searched becomes large, and the number of updates of the fingerprint management information becomes large. For example, when a 128-bit fingerprint is calculated for every 4 KB of data, 4 TB of fingerprints need to be registered in the fingerprint management information for 1 PB of data. Offloading the calculation of fingerprints and the update of the fingerprint management information to hardware can require expensive hardware capable of high-speed processing, with the result that the cost of the storage system can increase.
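The 4 TB figure in the example above can be checked with a short calculation. The sketch below assumes binary units (1 KB = 2^10 bytes, 1 TB = 2^40 bytes, 1 PB = 2^50 bytes) and assumes that one fingerprint is registered per 4 KB of data; the function name `fingerprint_bytes` is hypothetical.

```python
KB = 2**10  # assumption: binary units throughout
TB = 2**40
PB = 2**50


def fingerprint_bytes(capacity_bytes: int,
                      chunk_bytes: int = 4 * KB,
                      fingerprint_bits: int = 128) -> int:
    """Total size of fingerprints when one fingerprint is kept per chunk of data."""
    chunks = capacity_bytes // chunk_bytes   # number of 4 KB data items
    return chunks * fingerprint_bits // 8    # bits converted to bytes


total = fingerprint_bytes(1 * PB)
print(total // TB)  # 4
```

With 1 PB of data there are 2^38 data items of 4 KB each, and at 16 bytes per fingerprint this gives 2^42 bytes, that is, 4 TB. With decimal units the result would be slightly under 4 TB.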
In PTL 1, an anchor exists for a part of a data set, and the anchor is identified from the data set. When the identified anchor does not exist in an anchor database, the identified anchor is stored in the anchor database.