A server apparatus such as a business server handles a large volume of data, which is managed by a storage system including, for example, a storage apparatus having a large memory capacity. In the storage system, input/output (I/O) of data is processed by using a logical memory area (logical area) and a memory area (physical area) of a memory device mounted in the storage apparatus.
The memory device is, for example, a recording medium such as a hard disk drive (HDD) or a solid state drive (SSD), or a device based on redundant arrays of inexpensive disks (RAID) that uses a plurality of recording media in combination. A technology for integrating the physical areas of a plurality of memory devices for use as a single virtual physical area (storage pool/disk pool) has also been proposed. In a storage apparatus equipped with a cache, the cache may be temporarily used as the physical area.
The server apparatus accesses data via the logical area. When data is written to a certain logical area, the storage apparatus writes the data at the address (physical address) of the physical area that corresponds to the address (logical address) of the logical area at which the data is to be written. At this time, even when other data having the same contents as the newly written data already exists in the physical area, if the logical address of the write destination differs from the logical address of the other data, the data is written at a physical address different from that of the other data.
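The write behavior described above can be illustrated with a minimal sketch. The class name `PlainStore`, its fields, and the use of a list index as the physical address are assumptions made for illustration only and do not come from any particular storage apparatus.

```python
class PlainStore:
    """Hypothetical store: one physical block per logical address, no duplicate check."""

    def __init__(self):
        self.mapping = {}   # logical address -> physical address
        self.physical = []  # physical area; the list index serves as the physical address

    def write(self, logical_addr, data):
        if logical_addr in self.mapping:
            # Rewriting a known logical address reuses its physical address.
            self.physical[self.mapping[logical_addr]] = data
        else:
            # A new logical address always consumes a new physical address,
            # even if identical contents are already stored elsewhere.
            self.physical.append(data)
            self.mapping[logical_addr] = len(self.physical) - 1
        return self.mapping[logical_addr]

    def read(self, logical_addr):
        return self.physical[self.mapping[logical_addr]]


store = PlainStore()
pa0 = store.write(0x10, b"same contents")
pa1 = store.write(0x20, b"same contents")
```

In this sketch, `pa0` and `pa1` differ and the physical area holds two copies of the same contents.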
Since the capacity of the physical area provided by hardware resources such as HDDs or SSDs is limited, a method has been proposed that uses the physical area efficiently by associating a plurality of logical addresses with the same physical address when data having the same contents is written at the plurality of logical addresses. This method may be referred to as duplication exclusion (deduplication).
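As a rough sketch of duplication exclusion, the following associates several logical addresses with one physical address when their contents match. The class `DedupStore` and its lookup table keyed directly by contents are hypothetical simplifications; the sketch also assumes that unreferenced physical blocks are never reclaimed.

```python
class DedupStore:
    """Hypothetical store that shares one physical block among identical writes."""

    def __init__(self):
        self.mapping = {}     # logical address -> physical address
        self.physical = []    # physical area; the list index serves as the physical address
        self.by_content = {}  # contents -> physical address (assumed lookup table)

    def write(self, logical_addr, data):
        phys = self.by_content.get(data)
        if phys is None:
            # First occurrence of these contents: allocate a new physical block.
            self.physical.append(data)
            phys = len(self.physical) - 1
            self.by_content[data] = phys
        # Shared blocks are never overwritten in place; only the mapping changes.
        self.mapping[logical_addr] = phys
        return phys

    def read(self, logical_addr):
        return self.physical[self.mapping[logical_addr]]


dedup = DedupStore()
da0 = dedup.write(0x10, b"same contents")
da1 = dedup.write(0x20, b"same contents")
da2 = dedup.write(0x30, b"other contents")
```

Here the two logical addresses `0x10` and `0x20` resolve to the same physical address, so only two physical blocks are consumed for three writes.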
For a storage system that creates a snapshot of the memory area at a certain time point, a method has been proposed that determines whether the same data is present by using hash values of the data at the time of snapshot creation. According to this method, when the same data exists, the data already present at the time of snapshot creation is used. In addition, a method has been proposed in which, when the hash values are the same, the data items themselves are compared with each other and the duplicate data is excluded in accordance with the comparison result.
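The combination of a hash lookup followed by a comparison of the data itself might look like the sketch below. It omits snapshot handling entirely, and the class `HashDedupStore`, the default choice of SHA-256, and the pluggable `hash_fn` parameter are assumptions introduced for illustration, not details taken from the cited publications.

```python
import hashlib


class HashDedupStore:
    """Hypothetical store: the hash value selects candidates, a full comparison confirms."""

    def __init__(self, hash_fn=lambda d: hashlib.sha256(d).digest()):
        self.hash_fn = hash_fn
        self.mapping = {}   # logical address -> physical address
        self.physical = []  # physical area; the list index serves as the physical address
        self.index = {}     # hash value -> list of physical addresses with that hash

    def write(self, logical_addr, data):
        candidates = self.index.setdefault(self.hash_fn(data), [])
        for phys in candidates:
            # Equal hash values alone are not trusted: compare the data itself.
            if self.physical[phys] == data:
                break
        else:
            # No candidate holds identical data, so store a new block.
            self.physical.append(data)
            phys = len(self.physical) - 1
            candidates.append(phys)
        self.mapping[logical_addr] = phys
        return phys


# A deliberately weak hash (the data length) forces collisions for the demonstration.
weak = HashDedupStore(hash_fn=len)
ha = weak.write(1, b"abcd")
hb = weak.write(2, b"wxyz")  # same hash value as b"abcd", different contents
hc = weak.write(3, b"abcd")  # same hash value and same contents
```

With the weak hash, `b"abcd"` and `b"wxyz"` collide, yet the comparison step keeps them in separate physical blocks, while the second `b"abcd"` is mapped to the existing block.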
Japanese Laid-open Patent Publication No. 2010-72746 and Japanese Laid-open Patent Publication No. 2009-251725 discuss related art technologies.