The present invention generally relates to a storage apparatus and a method of controlling the same, and, for instance, can be suitably applied to a storage apparatus equipped with a data compression/deduplication function.
Conventionally, storage apparatuses capable of saving large volumes of data at low cost have been in demand. In order to meet this demand, technology which performs lossless compression on data (hereinafter simply called compression) and records the data is known. By recording to a storage device after reducing the data size using compression, larger volumes of data can be stored in a storage apparatus than when the data is recorded to a storage device without being compressed. As a result, the costs of holding data such as the bit costs of a storage device and the power consumption costs of a storage apparatus can be reduced.
The size of compressed data depends on the data content: even when two pieces of pre-compression data have the same size, their post-compression sizes are not necessarily the same. For this reason, when data which has been compressed and recorded to a volume is updated, the post-update compressed data may be larger than the pre-update compressed data, in which case the pre-update data cannot be overwritten in place with the post-update data.
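The dependence of compressed size on content can be illustrated with a minimal sketch. It assumes Python's standard `zlib` module as a stand-in for whatever lossless compressor a storage apparatus uses; `BLOCK_SIZE` and `fits_in_place` are hypothetical names introduced only for this illustration.

```python
import os
import zlib

BLOCK_SIZE = 8192  # assumed logical block size, for illustration only

# Two pre-compression blocks of identical size but different content.
repetitive_block = b"A" * BLOCK_SIZE
random_block = os.urandom(BLOCK_SIZE)

size_repetitive = len(zlib.compress(repetitive_block))
size_random = len(zlib.compress(random_block))


def fits_in_place(old_compressed_len: int, new_compressed_len: int) -> bool:
    """Return True if the updated compressed data fits in the old slot."""
    return new_compressed_len <= old_compressed_len


print(size_repetitive, size_random)
# Updating the repetitive block with the random one cannot reuse the old slot.
print(fits_in_place(size_repetitive, size_random))
```

Highly repetitive data shrinks to a small fraction of the block size, whereas random data barely shrinks at all, so an update that replaces the former with the latter does not fit in the area occupied by the pre-update compressed data.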
Hence, PTL 1 discloses, when data which has been compressed and recorded to a volume is updated, writing the post-update compressed data to a volume separately from the pre-update data.
Note that, when post-update data is written to a volume separately from the pre-update data, the pre-update data remains in the volume even though it is no longer needed. For this reason, in a storage apparatus which is equipped with a compression function, processing known as garbage collection, which discards this unnecessary data (hereinafter called garbage), is executed periodically.
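The relationship between separate writing of post-update data and garbage collection can be sketched as follows. This is a simplified illustration and not the method of PTL 1: the class `AppendOnlyVolume`, its in-memory log, and the use of `zlib` are all assumptions made for the example.

```python
import zlib


class AppendOnlyVolume:
    """Toy volume in which every update is appended rather than overwritten."""

    def __init__(self):
        self.log = []      # (lba, compressed data) records in write order
        self.mapping = {}  # lba -> index in self.log of the latest record

    def write(self, lba: int, data: bytes) -> None:
        # Post-update data is written separately; the old record is left behind.
        self.log.append((lba, zlib.compress(data)))
        self.mapping[lba] = len(self.log) - 1

    def read(self, lba: int) -> bytes:
        _, blob = self.log[self.mapping[lba]]
        return zlib.decompress(blob)

    def garbage_bytes(self) -> int:
        # A record is garbage when the mapping no longer points at it.
        return sum(len(blob) for i, (lba, blob) in enumerate(self.log)
                   if self.mapping.get(lba) != i)

    def collect_garbage(self) -> None:
        # Keep only live records and rebuild the mapping.
        live = [(lba, blob) for i, (lba, blob) in enumerate(self.log)
                if self.mapping.get(lba) == i]
        self.log = live
        self.mapping = {lba: i for i, (lba, _) in enumerate(live)}


volume = AppendOnlyVolume()
volume.write(0, b"a" * 100)
volume.write(0, b"b" * 200)   # update: the first record becomes garbage
garbage_before = volume.garbage_bytes()
volume.collect_garbage()
garbage_after = volume.garbage_bytes()
```

In this sketch the pre-update record keeps occupying space until `collect_garbage` runs, which is why such processing has to be executed repeatedly while the apparatus is in operation.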
Meanwhile, deduplication technology exists as another technology for reducing the volume of data to be stored in a storage area of a storage apparatus. In deduplication, when a plurality of data of the same content exists in a storage apparatus, only one of this plurality of data is kept in a storage device in the storage apparatus, and the remaining data is not stored in the storage device.
Deduplication technology can also be used in conjunction with compression technology. For example, PTL 2 discloses a storage apparatus which performs deduplication processing on duplicate data among data that has been transferred from a host device and compresses data that has not been deduplicated.
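The combined use of deduplication and compression can be sketched as follows. This is an illustrative assumption and not the implementation of PTL 2: duplicates are detected here by a SHA-256 fingerprint of the whole block, and the class name `DedupCompressStore` is hypothetical. A real apparatus may additionally compare data byte by byte when fingerprints match.

```python
import hashlib
import zlib


class DedupCompressStore:
    """Toy store: deduplicate first, then compress only data that is new."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> compressed data (stored once)
        self.refs = {}    # lba -> fingerprint of the data at that address

    def write(self, lba: int, data: bytes) -> None:
        fingerprint = hashlib.sha256(data).digest()
        if fingerprint not in self.chunks:
            # Not a duplicate: compress and store a single copy.
            self.chunks[fingerprint] = zlib.compress(data)
        # Duplicate or not, the address simply refers to the stored copy.
        self.refs[lba] = fingerprint

    def read(self, lba: int) -> bytes:
        return zlib.decompress(self.chunks[self.refs[lba]])

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.chunks.values())


store = DedupCompressStore()
payload = b"same content " * 300
store.write(0, payload)
bytes_after_first = store.stored_bytes()
store.write(1, payload)            # duplicate: no additional data is stored
bytes_after_second = store.stored_bytes()
```

Writing the same content to a second address leaves the stored amount unchanged, while the single stored copy is itself held in compressed form.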
With regard to the timing for performing compression/deduplication, there exist a control system (hereinafter called an inline system) which executes compression/deduplication processing of data synchronously with I/O (Input/Output) from a host device, and a control system (hereinafter called a post-process system) which executes compression/deduplication processing of data asynchronously to I/O from the host device.
An inline system executes compression/deduplication processing before sending an I/O response to the host device. It therefore reduces system performance (response performance and throughput performance), but is advantageous in that the data reduction resulting from compression/deduplication is obtained immediately, and hence the storage capacity to be prepared for the storage apparatus need only accommodate the data amount after the compression/deduplication processing.
By contrast, a post-process system executes compression/deduplication processing after sending an I/O response. It is therefore advantageous in that the system performance improves, but is disadvantageous in that a storage area for temporarily saving data that has not undergone compression/deduplication processing is required in addition to the storage area for saving the post-compression/deduplication data, and hence a proportionately larger storage area is required.
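The capacity trade-off between the two systems can be sketched as follows. Both classes are hypothetical, use only compression (via `zlib`) for brevity, and model the I/O response as the return value of `write`; they are meant as an illustration under these assumptions, not as a description of any particular apparatus.

```python
import zlib


class InlineStore:
    """Reduces data before the I/O response is returned."""

    def __init__(self):
        self.reduced = {}  # lba -> compressed data

    def write(self, lba: int, data: bytes) -> str:
        self.reduced[lba] = zlib.compress(data)  # work done before the response
        return "ack"

    def capacity_in_use(self) -> int:
        return sum(len(blob) for blob in self.reduced.values())


class PostProcessStore:
    """Returns the I/O response first and reduces data later."""

    def __init__(self):
        self.staging = {}  # lba -> data not yet reduced (temporary area)
        self.reduced = {}  # lba -> compressed data

    def write(self, lba: int, data: bytes) -> str:
        self.staging[lba] = data      # no reduction on the I/O path
        self.reduced.pop(lba, None)   # any older reduced copy is now stale
        return "ack"

    def background_reduce(self) -> None:
        # Runs asynchronously to host I/O in a real apparatus.
        for lba, data in list(self.staging.items()):
            self.reduced[lba] = zlib.compress(data)
            del self.staging[lba]

    def capacity_in_use(self) -> int:
        return (sum(len(d) for d in self.staging.values())
                + sum(len(blob) for blob in self.reduced.values()))


data = b"x" * 4096
inline_store = InlineStore()
post_store = PostProcessStore()
inline_store.write(0, data)
post_store.write(0, data)
usage_before = post_store.capacity_in_use()   # includes unreduced staged data
post_store.background_reduce()
usage_after = post_store.capacity_in_use()
```

Until `background_reduce` has run, the post-process store holds the data at its full pre-compression size, which corresponds to the additional temporary storage area described above; afterwards its usage matches that of the inline store.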
[PTL 1] International Publication No. 2017/141315
[PTL 2] Japanese Published Patent Specification No. 5216915