When a computer handles a large amount of data, a low-speed, high-capacity storage device such as a hard disk drive (HDD) is often used as a non-volatile storage device for storing the data. However, if the low-speed storage device is accessed each time an access request is issued, data access becomes a bottleneck, and the processing performance of the computer might be reduced. One way to address this problem is to use a memory that allows high-speed random access, such as a random access memory (RAM), as a cache memory.
For example, there has been proposed a data management apparatus that stores in the HDD a plurality of data blocks grouped into segments, and caches data blocks from the HDD to the RAM in units of segments. Upon receiving a read request specifying a certain data block, the data management apparatus loads the whole segment including the specified data block from the HDD to the RAM. The data blocks loaded (cached) in the RAM are stored without being immediately discarded. Thereafter, upon receiving a read request specifying one of the cached data blocks, the data management apparatus acquires the specified data block from the RAM instead of reading the specified data block from the HDD, and provides the acquired data block.
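The segment-unit caching described above can be sketched as follows. This is a minimal illustration, not the apparatus of the cited publication: the class name `SegmentCache`, the dictionary standing in for the HDD, the capacity parameter, and the least-recently-used eviction are all assumptions made for the example. The description only says that cached blocks are not immediately discarded.

```python
from collections import OrderedDict


class SegmentCache:
    """Toy model of segment-unit caching (all names are hypothetical).

    `hdd` maps a segment id to a dict of {block id: data}. The RAM cache
    holds whole segments; LRU eviction is an assumption of this sketch.
    """

    def __init__(self, hdd, capacity_segments=2):
        self.hdd = hdd
        self.capacity = capacity_segments
        self.ram = OrderedDict()  # segment id -> {block id: data}, oldest first
        # Reverse index: which segment currently holds each block.
        self.block_to_segment = {
            block: seg for seg, blocks in hdd.items() for block in blocks
        }
        self.hdd_reads = 0  # number of segment loads from the slow device

    def read(self, block_id):
        seg = self.block_to_segment[block_id]
        if seg in self.ram:
            # Cache hit: serve from RAM and mark the segment as recently used.
            self.ram.move_to_end(seg)
        else:
            # Cache miss: load the whole segment, not just the requested block.
            self.hdd_reads += 1
            self.ram[seg] = dict(self.hdd[seg])
            if len(self.ram) > self.capacity:
                self.ram.popitem(last=False)  # evict least recently used
        return self.ram[seg][block_id]


hdd = {0: {"a": b"A", "b": b"B"}, 1: {"c": b"C"}}
cache = SegmentCache(hdd)
cache.read("a")  # miss: loads segment 0, which also brings in block "b"
cache.read("b")  # hit: served from RAM without touching the HDD
```

In this sketch, the second read costs no HDD access only because blocks "a" and "b" share a segment, which is why the grouping of blocks into segments matters.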
Further, the data management apparatus records the history of read requests, and analyzes the history to identify data blocks that are likely to be read sequentially. The data management apparatus changes the allocation of data blocks in the HDD such that the data blocks that are likely to be read sequentially belong to the same segment. This increases the likelihood that a data block specified by a subsequent read request is already cached in the RAM. Thus, it is possible to reduce access to the HDD, and thereby improve the access performance.
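One way to picture the history analysis is to count how often one block is read immediately after another, and then place the most frequent pairs in the same segment. The sketch below assumes segments of at most two blocks and a greedy pairing rule. Both are simplifications chosen for illustration, and the function names are hypothetical rather than taken from the cited publication.

```python
from collections import Counter


def count_sequential_pairs(history):
    """Count how often block y is read immediately after block x."""
    return Counter(zip(history, history[1:]))


def plan_segments(history, blocks):
    """Greedily group the most frequently adjacent blocks into 2-block segments.

    Blocks that are not paired are each left in a segment of their own.
    """
    placed = set()
    segments = []
    for (x, y), _count in count_sequential_pairs(history).most_common():
        if x != y and x not in placed and y not in placed:
            segments.append([x, y])
            placed.update((x, y))
    segments.extend([b] for b in blocks if b not in placed)
    return segments


history = ["a", "c", "b", "d", "a", "c", "b", "d", "a", "c"]
print(plan_segments(history, ["a", "b", "c", "d"]))
```

Here "c" follows "a" three times, more often than any other pair, so the two blocks are planned into one segment. A read of "a" would then bring "c" into the RAM as well.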
See, for example, International Publication Pamphlet No. WO2013/114538.
In the data management apparatus described above, however, access to the low-speed storage device might not be reduced due to excessive relocation of data.
The tendency of a specific pair of data blocks to be accessed sequentially (locality of access) is not permanent, but may change in accordance with the operation of the information processing system. When the locality changes, the access reduction achieved by the previous data relocation diminishes. That is, the benefits of data relocation last for only a limited period of time, and the total benefit is finite. In the data management apparatus described above, when the locality changes, another pair of data blocks that tend to be accessed sequentially is detected, and data relocation is performed again for the detected pair. Data relocation, however, often temporarily increases writes to the low-speed storage device, and therefore incurs a cost.
Accordingly, if data relocation is performed each time a new pair of data blocks that tend to be sequentially accessed is detected, benefits that are worth the cost of data relocation might not be obtained. Thus, access to the low-speed storage device might not be reduced.
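The trade-off can be made concrete with simple arithmetic. The sketch below uses made-up numbers and counts one write to the low-speed storage device as equal in cost to one read. Both are assumptions for illustration only, and the function name is hypothetical.

```python
def net_accesses_saved(saved_per_interval, intervals_locality_lasts, relocation_writes):
    """Accesses to the slow device avoided by one relocation, minus its write cost.

    Assumes a relocation saves a constant number of accesses per interval
    for as long as the detected locality persists, and nothing afterwards.
    """
    return saved_per_interval * intervals_locality_lasts - relocation_writes


# Long-lived locality: the relocation pays for itself.
print(net_accesses_saved(10, 20, 50))  # 10 * 20 - 50 = 150

# Short-lived locality: the relocation costs more than it saves.
print(net_accesses_saved(10, 3, 50))   # 10 * 3 - 50 = -20
```

Under these assumed figures, relocating for a pair whose locality lasts only three intervals increases the total number of accesses to the low-speed storage device rather than reducing it.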