A storage apparatus is controlled according to a RAID (Redundant Array of Inexpensive Disks) system, and includes a plurality of hard disk devices (hereinafter also referred to as HDDs: Hard Disk Drives) disposed in an array arrangement, and a controller that controls the hard disk devices. The storage apparatus is connected to a host computer such as a server via a data link such as a SAN (Storage Area Network), and provides a logical storage area (hereinafter also referred to as a logical volume) furnished with redundancy on the basis of a RAID configuration.
A conventional storage apparatus requires that the storage capacity expected to be needed in the future be determined in advance at the design stage and assigned to a logical volume. Accordingly, HDDs that are not actually used must be purchased in advance, which tends to increase the cost burden on the user of the storage apparatus. PTL 1, cited below, provides a technique referred to as thin provisioning to solve this problem. This technique provides a host computer with a virtual volume, which is a virtual storage area, and dynamically assigns the required amount of storage when the host computer actually writes data.
The technique described in PTL 1 prepares a pool area, which is a storage area, in a storage apparatus, and a plurality of host computers share a plurality of virtual volumes through the pool area. When the host computer issues a write request, the storage area required for writing the data is assigned to the virtual volume. A storage apparatus employing this technique can flexibly extend the capacity of the pool area by installing additional HDDs as necessary.
In an environment where the storage capacity is virtualized using the thin provisioning technique, the storage apparatus assigns a page from the pool area in response to a write request issued by the host computer. The pool area is an aggregate of storage capacities including at least one RAID group. The page is a unit of storage capacity that the storage apparatus assigns from the pool area as an access target of the host computer.
In the volume virtualization environment where the storage capacity is virtualized, the storage apparatus provides the host computer with a virtualized logical volume (virtual volume), and further configures virtual pages in the virtual volume. Each virtual page is associated with one of the pages. A page may also be referred to as a real page to distinguish it from a virtual page. In this environment, the host computer accesses a virtual page; since the virtual page is associated with one of the real pages, the host computer accesses the real page via the virtual page. The storage apparatus typically holds information, such as an access frequency, in units of real pages.
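The relationship between the pool area, real pages, and virtual pages described above may be illustrated by the following minimal sketch. It is not taken from PTL 1. The names `Pool`, `RealPage`, and `VirtualVolume`, the page size, and the choice to return zeros when reading an unassigned virtual page are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class RealPage:
    """A unit of capacity assigned from the pool area (hypothetical model)."""
    page_id: int
    data: bytearray
    access_count: int = 0  # per-real-page information such as access frequency


class Pool:
    """Pool area: an aggregate of capacity shared by several virtual volumes."""

    def __init__(self, num_pages: int, page_size: int) -> None:
        self.page_size = page_size
        self.total_pages = num_pages
        self.free_ids = list(range(num_pages))

    def allocate(self) -> RealPage:
        if not self.free_ids:
            raise RuntimeError("pool exhausted; additional capacity is needed")
        return RealPage(self.free_ids.pop(0), bytearray(self.page_size))

    def extend(self, extra_pages: int) -> None:
        """Models installing additional HDDs to enlarge the pool area."""
        start = self.total_pages
        self.total_pages += extra_pages
        self.free_ids.extend(range(start, self.total_pages))


class VirtualVolume:
    """Virtual volume whose virtual pages are backed by real pages on demand."""

    def __init__(self, pool: Pool, num_virtual_pages: int) -> None:
        self.pool = pool
        self.num_virtual_pages = num_virtual_pages
        self.mapping = {}  # virtual page number -> RealPage

    def _check(self, virtual_page: int, offset: int, length: int) -> None:
        if not 0 <= virtual_page < self.num_virtual_pages:
            raise IndexError("virtual page out of range")
        if offset < 0 or offset + length > self.pool.page_size:
            raise ValueError("access crosses the page boundary")

    def write(self, virtual_page: int, offset: int, payload: bytes) -> None:
        self._check(virtual_page, offset, len(payload))
        page = self.mapping.get(virtual_page)
        if page is None:
            # A real page is assigned only when data is actually written.
            page = self.pool.allocate()
            self.mapping[virtual_page] = page
        page.data[offset:offset + len(payload)] = payload
        page.access_count += 1

    def read(self, virtual_page: int, offset: int, length: int) -> bytes:
        self._check(virtual_page, offset, length)
        page = self.mapping.get(virtual_page)
        if page is None:
            return bytes(length)  # assumption: unassigned pages read as zeros
        page.access_count += 1
        return bytes(page.data[offset:offset + length])


# Two virtual volumes, each far larger than the pool, share two real pages.
pool = Pool(num_pages=2, page_size=8)
volume_a = VirtualVolume(pool, num_virtual_pages=100)
volume_b = VirtualVolume(pool, num_virtual_pages=100)
volume_a.write(42, 0, b"abc")
volume_b.write(7, 2, b"xy")
```

In this sketch each virtual volume advertises 100 virtual pages while the pool holds only two real pages, so capacity is consumed only by virtual pages that have actually been written. A further first write to a new virtual page raises an error until `extend` is called.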
On the other hand, PTL 2, cited below, describes a data deduplication technique for a storage apparatus. This technique suppresses an increase in the amount of data stored in the storage apparatus when backing up and archiving business data and the like, and improves capacity efficiency by not storing redundant data. Data deduplication is a technique in which, if data to be newly stored in an HDD (so-called write data) is identical to data already stored in the HDD, the data that would be redundant is not written to the HDD. Alternatively, the write data is temporarily written to the HDD, and the technique then asynchronously verifies whether the write data is identical to data already stored; if it is identical, the redundant data is deleted. To verify whether the write data is identical to data already stored in the HDD, a fast search employing hashing is typically used.
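The two deduplication variants described above, one that checks for identity before writing and one that writes first and verifies asynchronously, may be illustrated by the following minimal sketch. It is not the method of PTL 2. The class `DedupStore`, its method names, the use of SHA-256 as the hash, and the byte-for-byte confirmation after a hash match are assumptions made for illustration. The sketch also omits reference counting and reclamation of overwritten data.

```python
import hashlib


class DedupStore:
    """Hypothetical block store that keeps one copy of identical data."""

    def __init__(self) -> None:
        self.blocks = {}   # block id -> data physically stored
        self.index = {}    # hash digest -> list of block ids with that digest
        self.refs = {}     # logical address -> block id
        self.pending = []  # block ids written but not yet verified
        self.next_id = 0

    def _find_duplicate(self, data: bytes):
        """Fast search by hash, then confirm the match byte for byte."""
        digest = hashlib.sha256(data).digest()
        for block_id in self.index.get(digest, []):
            if self.blocks[block_id] == data:
                return digest, block_id
        return digest, None

    def _store(self, data: bytes) -> int:
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        return block_id

    def write_inline(self, address: str, data: bytes) -> int:
        """Check before writing; identical data is never stored twice."""
        digest, block_id = self._find_duplicate(data)
        if block_id is None:
            block_id = self._store(data)
            self.index.setdefault(digest, []).append(block_id)
        self.refs[address] = block_id
        return block_id

    def write_deferred(self, address: str, data: bytes) -> int:
        """Write immediately; verification happens later."""
        block_id = self._store(data)
        self.refs[address] = block_id
        self.pending.append(block_id)
        return block_id

    def deduplicate_pending(self) -> int:
        """Verify pending blocks and delete those that are redundant."""
        freed = 0
        for block_id in self.pending:
            digest, existing = self._find_duplicate(self.blocks[block_id])
            if existing is None:
                self.index.setdefault(digest, []).append(block_id)
                continue
            # Repoint every address at the existing copy, then delete.
            for address, referenced in self.refs.items():
                if referenced == block_id:
                    self.refs[address] = existing
            del self.blocks[block_id]
            freed += 1
        self.pending.clear()
        return freed

    def read(self, address: str) -> bytes:
        return self.blocks[self.refs[address]]


store = DedupStore()
store.write_inline("addr0", b"backup data")
store.write_inline("addr1", b"backup data")    # identical: not stored again
store.write_deferred("addr2", b"backup data")  # stored, removed on verification
store.deduplicate_pending()
```

The hash index narrows the search to candidate blocks with the same digest. The sketch still compares the candidate data directly so that a hash collision cannot cause different data to be treated as identical.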