With the development of information and communications technology and the expansion of its field of application, an increasing volume of data is being processed by information processing systems. As the variety and number of information processing systems increase, the storages that hold the data processed by these systems are required to offer greater capacity and higher performance.
To improve the efficiency of managing the storages, the data used by a plurality of systems are often concentrated on a single storage. The single storage therefore requires a storage capacity large enough to hold mass data and a performance high enough to process a large amount of access from the plurality of systems. In recent years, the capacity of HDDs (Hard Disk Drives) has been increasing remarkably. The use of HDDs as storage media can meet the requirements concerning the increase in storage capacity. Changing the number of storage media that constitute the single storage can also meet the storage capacity requirements as needed.
To meet the performance requirements imposed on the storages, that is, to speed up access, caches are employed. Although not particularly limited, the caches for the storages are implemented in, for example, DRAMs (Dynamic Random Access Memories). The caches for the storages store copies or updates of the data stored in storage media such as HDDs. Storing frequently accessed data in the caches for the storages (hereinafter also referred to simply as “caches”) reduces the amount of access to storage media such as HDDs, which have lower access performance than DRAMs and the like. This improves the access performance of the storage system.
A hierarchical cache architecture is known for the storages. This architecture hierarchically organizes a plurality of caches having different access times and stores frequently accessed data preferentially in higher-order caches, which have shorter access times. The hierarchical cache architecture uses an SSD (Solid State Drive) as a storage medium higher in performance than a general HDD. The SSD is implemented in, for example, a NAND (NOT-AND) flash memory. The SSD has an access performance lower than that of a DRAM but higher than that of an HDD. The SSD has a capacity smaller than that of an HDD, but is larger in capacity and lower in cost than a DRAM.
In the hierarchical cache architecture, high-speed storage media typified by SSDs or the like are used for low-order caches. DRAMs or the like are used for high-order caches. The low-order caches store data which exceeds the capacity of any high-order cache (for example, data thrown out of any high-order cache by replacement or the like). When data that has been thrown out of a high-order cache and stored in a low-order cache is accessed again, the data is read from the low-order cache. This reduces the frequency of access to storage media such as HDDs in the storages, thus improving the access performance. The following processes are executed alternately between the high- and low-order caches:
(a) data hit in any low-order cache is stored in any high-order cache again; and
(b) when any high-order cache has reached its full capacity, data that is acquired from any low-order cache and stored in the high-order cache is thrown out of the high-order cache and stored in the low-order cache again.
In the hierarchical cache architecture, the above-mentioned series of operations of (a) and (b) is repeated. As a result,
the most frequently accessed data is stored in any high-order cache; and
the second most frequently accessed data after the one stored in the high-order cache is stored in any low-order cache.
The caches are efficiently used in this way.
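The alternation of processes (a) and (b) can be illustrated with a minimal sketch. It assumes an LRU replacement policy in both tiers, which the description above does not specify; the class name `TwoLevelCache`, its parameters, and the dictionary standing in for the HDD are hypothetical names introduced only for illustration.

```python
from collections import OrderedDict


class TwoLevelCache:
    """Illustrative two-tier cache: a small high-order tier over a larger low-order tier.

    Assumption: both tiers use LRU replacement; the source text does not name a policy.
    """

    def __init__(self, high_capacity, low_capacity, backing_store):
        self.high = OrderedDict()  # high-order cache (e.g. DRAM), oldest entry first
        self.low = OrderedDict()   # low-order cache (e.g. SSD), oldest entry first
        self.high_capacity = high_capacity
        self.low_capacity = low_capacity
        self.backing = backing_store  # stands in for the HDD
        self.stats = {"high_hit": 0, "low_hit": 0, "miss": 0}

    def _put_low(self, key, value):
        self.low[key] = value
        self.low.move_to_end(key)
        if len(self.low) > self.low_capacity:
            self.low.popitem(last=False)  # dropped from the cache hierarchy entirely

    def _put_high(self, key, value):
        self.high[key] = value
        self.high.move_to_end(key)
        if len(self.high) > self.high_capacity:
            # Process (b): data thrown out of the high-order cache
            # is stored in the low-order cache.
            old_key, old_value = self.high.popitem(last=False)
            self._put_low(old_key, old_value)

    def read(self, key):
        if key in self.high:
            self.stats["high_hit"] += 1
            self.high.move_to_end(key)
            return self.high[key]
        if key in self.low:
            # Process (a): data hit in the low-order cache
            # is stored in the high-order cache again.
            self.stats["low_hit"] += 1
            value = self.low.pop(key)
        else:
            self.stats["miss"] += 1
            value = self.backing[key]  # access to the slow storage medium
        self._put_high(key, value)
        return value
```

In this sketch, a read that misses both tiers goes to the backing store, while a read that hits the low-order tier promotes the data and may in turn demote the least recently used entry of the high-order tier.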
When a plurality of types of data are stored in the caches, partitioning is employed to prevent cache competition between the individual types of data. See, for example, PTL 1 or NPL 1 for details of the partitioning. Partitioning has also been proposed for, for example, caches for CPUs (Central Processing Units). For example, competition between processes (or threads) is prevented by dividing (partitioning) shared caches into partitions serving as process- (or thread-) specific occupied areas and thereby limiting the capacity available to each.
In the caches for the storages, the partitioning means, in most cases, dividing (partitioning) the caches into partitions, each serving as an area occupied by one unit such as a volume (a predetermined data set that defines the unit of management of storage areas, also called a “logical volume”).
If access to a given volume results in a cache miss, data corresponding to the access is registered in any cache. Without the partitioning, data of another volume already stored in the cache is thrown out, causing cache competition.
In contrast to this, when the caches are partitioned, if access to a given volume results in a miss within a partition corresponding to this volume, the data (page) is replaced within the partition and no data in another partition corresponding to an area occupied by another volume is thrown out. This prevents cache competition between the volumes.
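The effect of per-volume partitioning can be sketched as follows. This is a minimal illustration, assuming LRU replacement inside each partition and fixed partition sizes; the names `PartitionedCache`, `partition_sizes`, and `load` are hypothetical and do not come from PTL 1 or NPL 1.

```python
from collections import OrderedDict


class PartitionedCache:
    """Illustrative cache divided into per-volume partitions.

    Assumption: each partition has a fixed page capacity and uses LRU replacement.
    """

    def __init__(self, partition_sizes):
        # partition_sizes maps a volume identifier to its maximum number of pages.
        self.sizes = dict(partition_sizes)
        self.partitions = {volume: OrderedDict() for volume in partition_sizes}

    def access(self, volume, page, load):
        """Return (data, hit). `load` is a caller-supplied function that reads from storage."""
        partition = self.partitions[volume]
        if page in partition:
            partition.move_to_end(page)
            return partition[page], True
        data = load(volume, page)
        partition[page] = data
        if len(partition) > self.sizes[volume]:
            # Replacement happens only inside this volume's partition;
            # pages of other volumes are never thrown out here.
            partition.popitem(last=False)
        return data, False
```

With this structure, a burst of misses on one volume can only displace pages of that same volume, so pages cached for other volumes remain available.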
The applicant of the present invention searched for related patent documents and retrieved PTLs 2 to 5. Among these patent documents, PTL 2 discloses a configuration which exclusively uses caches for a DBMS (Database Management System) and caches for a storage device and divides these caches using data to optimize the allocation, thereby efficiently using the data caches of the storage device. PTL 2 additionally discloses estimating the cache hit count that would result from an increase in the size of the storage areas allocated to the divided caches.
PTL 3 discloses a disk array device including a plurality of hard disk units and having the following configuration. A mass memory mounted on a controller module which controls the overall disk array device has a system area managed by an OS (Operating System), a cache area serving as a cache memory, and a table area which stores management/control information relating to the device and whose size is changeable at any point of time. The table area is changed in an active state in accordance with the device state, without powering the device off and on. An unused area of the table area is freed and made available as a cache area, so that the sizes of the table area and the cache area are appropriately changed in an active state during operation.
PTL 4 discloses a configuration which uses a memory for a RAID (Redundant Array of Inexpensive Disks) controller and an SSD. The memory for the RAID controller includes an area for storing cache management information and a temporary cache area that functions as a primary cache for an external storage device (HDD), that is, as a cache memory for a host system. The SSD functions as a secondary cache and includes a cache area and an area for storing cache management information.
PTL 5 discloses a configuration for dynamically calculating, for each hierarchical level, the probability that a hierarchical storage device will be accessed. The hierarchical storage device includes storage devices which have different access speeds and are hierarchically connected to each other. Whether to expand the storage hierarchy is determined by having an access speed calculation unit calculate the access speed of the entire hierarchical storage device, using access patterns recorded in an access information management unit and the expanded system configuration. The access speed of the storage device is obtained by multiplying the data transfer speed at each hierarchical level by the proportion of data present at that hierarchical level, calculated from the past access log.
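The kind of calculation attributed to PTL 5 can be illustrated with a short sketch. It rests on two assumptions that are not stated in the description above: the per-level products are summed to give an overall figure, and the access log is a simple list naming the hierarchical level that served each past access. The function name `estimate_access_speed` and the level names and speed values in the usage note are hypothetical.

```python
from collections import Counter


def estimate_access_speed(transfer_speed, access_log):
    """Estimate an overall access speed for a hierarchical storage device.

    transfer_speed: maps a level name to its data transfer speed (any consistent unit).
    access_log: iterable of level names, one entry per past access (assumed format).

    Assumption: the overall speed is the sum over levels of
    (transfer speed of the level) * (proportion of accesses served by the level).
    """
    counts = Counter(access_log)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("access log is empty")
    return sum(transfer_speed[level] * n / total for level, n in counts.items())
```

For example, with hypothetical speeds of 1000, 200, and 50 for three levels and a log in which those levels served 50%, 25%, and 25% of past accesses, the sketch yields 1000 × 0.5 + 200 × 0.25 + 50 × 0.25 = 562.5 in the same unit.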