When a computer handles large amounts of data, a low-speed, large-capacity storage device, such as a hard disk drive (HDD), is often employed as non-volatile storage for the data. However, if the low-speed storage device is accessed each time an access request is issued, data access may become a bottleneck and degrade the processing performance of the computer. One solution to this problem is to use relatively high-speed memory, such as random access memory (RAM), as cache memory.
For example, there has been proposed a data management apparatus configured to store data in an HDD by grouping a plurality of unit data pieces into “segments” and to cache a whole segment at a time from the HDD into RAM. Upon receiving a read request designating a unit data piece, the data management apparatus loads the entire segment including the designated unit data piece from the HDD into the RAM. The unit data pieces loaded (cached) into the RAM are retained rather than discarded immediately. Later, upon receiving a read request designating a cached unit data piece, the data management apparatus provides the designated unit data piece by reading it not from the HDD but from the RAM.
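The segment-granularity caching described above may be illustrated with a minimal sketch. The class name `SegmentCache`, its fields, and the dictionary standing in for the HDD are all hypothetical and chosen only for illustration; the sketch also omits eviction, which a real cache would need.

```python
class SegmentCache:
    """Toy segment-granularity read cache (all names are illustrative)."""

    def __init__(self, segments):
        # segments: {segment_id: {unit_key: value}}, standing in for the HDD layout
        self.segments = segments
        self.segment_of = {key: sid for sid, units in segments.items() for key in units}
        self.cache = {}       # stands in for RAM
        self.slow_reads = 0   # number of segment loads from the slow device

    def read(self, key):
        if key not in self.cache:
            # Cache miss: load the whole segment containing the requested piece.
            self.slow_reads += 1
            self.cache.update(self.segments[self.segment_of[key]])
        return self.cache[key]


hdd = SegmentCache({0: {"a": 1, "b": 2}, 1: {"c": 3}})
first = hdd.read("a")    # miss: loads segment 0 as a whole
second = hdd.read("b")   # hit: "b" arrived together with "a"
```

In this sketch the second read does not touch the slow device, because the piece it designates belongs to the segment loaded by the first read.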
In addition, the data management apparatus records the history of read requests and analyzes the association among unit data pieces likely to be read successively. The data management apparatus changes the arrangement of unit data pieces in the HDD in such a manner that the unit data pieces likely to be read successively belong to the same segments. This increases the likelihood that designated unit data pieces have already been cached in the RAM and therefore reduces access to the HDD, thus improving the access performance.
See, for example, International Publication Pamphlet No. WO 2013114538.
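As a rough illustration of the history analysis mentioned above, the following sketch counts how often two unit data pieces were read back to back. The function name `successive_pairs` and the sample history are assumptions made for this example; the cited publication may analyze the history differently.

```python
from collections import Counter


def successive_pairs(history):
    """Count how often two distinct unit data pieces were read back to back."""
    pairs = Counter()
    for prev, cur in zip(history, history[1:]):
        if prev != cur:
            # frozenset makes the pair unordered: (u1, u2) equals (u2, u1)
            pairs[frozenset((prev, cur))] += 1
    return pairs


history = ["u1", "u2", "u1", "u2", "u3"]   # hypothetical read-request log
counts = successive_pairs(history)
```

Pairs with high counts are candidates for placement in the same segment.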
It is sometimes the case that a plurality of relocation process options for changing the arrangement of some unit data pieces in a storage device are generated in a short amount of time. When a plurality of relocation process options are generated, they may be sequentially evaluated to determine whether to execute each of the relocation processes. Assume, for example, that the following two relocation processes are generated as options: a relocation process #1 for transferring unit data #1 close to unit data #2; and a relocation process #2 for transferring the unit data #2 close to unit data #3. The former relocation process may be generated when access has been made successively to the unit data #1 and #2, while the latter relocation process may be generated when access has been made successively to the unit data #2 and #3. In this case, the relocation process #1 is evaluated based on the current unit data arrangement, and then executed if it is highly evaluated (i.e., if transferring the unit data #1 is determined to promote access efficiency). Subsequently, the relocation process #2 is evaluated based on the arrangement obtained after the execution of the relocation process #1. That is, the relocation process #2 is evaluated in consideration of the disadvantage of moving the unit data #2 away from the unit data #1.
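The sequential evaluation described above may be sketched as follows. The evaluation model is an assumption made for this example only: an arrangement maps each unit data piece to a segment, and the cost is the total weight of successively accessed pairs that lie in different segments. The weights, the names `cost`, `relocate`, and `evaluate_sequentially`, and the neglect of segment capacity are all hypothetical.

```python
def cost(segment_of, weights):
    """Total weight of co-accessed pairs that sit in different segments."""
    return sum(w for (a, b), w in weights.items() if segment_of[a] != segment_of[b])


def relocate(segment_of, move):
    """Return a new arrangement with `unit` moved into the segment of `target`."""
    unit, target = move
    updated = dict(segment_of)
    updated[unit] = updated[target]
    return updated


def evaluate_sequentially(segment_of, weights, moves):
    """Evaluate each move on the arrangement left by the previously executed moves."""
    executed = []
    for move in moves:
        candidate = relocate(segment_of, move)
        if cost(candidate, weights) < cost(segment_of, weights):
            segment_of = candidate      # execute only if the cost improves
            executed.append(move)
    return segment_of, executed


initial = {"u1": 0, "u2": 1, "u3": 2}
weights = {("u1", "u2"): 5, ("u2", "u3"): 4}    # assumed co-access weights
moves = [("u1", "u2"), ("u2", "u3")]            # relocation processes #1 and #2
final, executed = evaluate_sequentially(initial, weights, moves)
```

Under these assumed weights, relocation process #1 is executed, and relocation process #2 is then rejected because, on the updated arrangement, moving unit data #2 would separate it from unit data #1.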
On the other hand, sequentially evaluating a plurality of relocation processes and determining whether to execute each of them may take a long time before all the relocation processes have been examined. In view of this, it may be considered reasonable to evaluate and execute a plurality of relocation processes in parallel. However, when a plurality of relocation processes are evaluated in parallel, interference among the relocation processes becomes a problem. Assume, for example, that both the relocation processes #1 and #2 above are highly evaluated individually based on the current unit data arrangement. When both the relocation processes #1 and #2 are executed, however, the effect expected from the relocation process #1 may not be obtained because the unit data #2 is moved away from the unit data #1. If this is the case, the final unit data arrangement is likely to have a low evaluation. In other words, the relocation process #1 is subject to interference from the relocation process #2. That is, the evaluation values of a plurality of relocation processes are not always additive, and there is therefore little point in simply combining relocation processes that individually have high evaluation values.
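The lack of additivity may be seen in a small numerical sketch. As an assumption for illustration, an arrangement maps each unit data piece to a segment, the cost is the total weight of successively accessed pairs lying in different segments, and every destination is resolved against the arrangement on which the relocation processes were planned. The weights and the names `cost` and `apply_all` are hypothetical.

```python
def cost(segment_of, weights):
    """Total weight of co-accessed pairs that sit in different segments."""
    return sum(w for (a, b), w in weights.items() if segment_of[a] != segment_of[b])


def apply_all(segment_of, moves):
    """Apply moves whose destinations were all resolved on the same starting arrangement."""
    updated = dict(segment_of)
    for unit, target in moves:
        updated[unit] = segment_of[target]   # destination looked up before any move
    return updated


initial = {"u1": 0, "u2": 1, "u3": 2}
weights = {("u1", "u2"): 5, ("u2", "u3"): 4}    # assumed co-access weights
moves = [("u1", "u2"), ("u2", "u3")]            # relocation processes #1 and #2

base = cost(initial, weights)
# Gain of each relocation process evaluated on its own, in parallel.
gains = [base - cost(apply_all(initial, [m]), weights) for m in moves]
# Gain actually obtained when both are executed.
combined_gain = base - cost(apply_all(initial, moves), weights)
```

With these assumed weights the individual gains are 5 and 4, yet executing both yields a gain of only 4, which is less than executing relocation process #1 alone.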
To address this problem, it may be considered appropriate to calculate, for each possible combination of the relocation processes, the unit data arrangement to be obtained after executing that combination and then directly evaluate the calculated arrangement, thereby finding an optimal combination of relocation processes. In this case, however, assuming that N relocation processes have been generated, a maximum of 2^N (2 to the power of N) combinations need to be evaluated, which may require a huge computational effort. Alternatively, it may be considered appropriate to execute, among the relocation processes, only a small number of relocation processes that clearly do not interfere with each other. This, however, reduces the number of relocation processes selectable at the same time, possibly decreasing the accuracy of optimizing the data arrangement.
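The exhaustive approach and its cost may be sketched as follows, using the same assumed toy model in which an arrangement maps each unit data piece to a segment and the cost is the total weight of successively accessed pairs in different segments. The names `cost`, `apply_all`, and `best_combination` and the weights are hypothetical; the sketch only illustrates why the number of evaluations grows as 2^N.

```python
from itertools import combinations


def cost(segment_of, weights):
    """Total weight of co-accessed pairs that sit in different segments."""
    return sum(w for (a, b), w in weights.items() if segment_of[a] != segment_of[b])


def apply_all(segment_of, moves):
    """Apply moves whose destinations were all resolved on the same starting arrangement."""
    updated = dict(segment_of)
    for unit, target in moves:
        updated[unit] = segment_of[target]
    return updated


def best_combination(segment_of, weights, moves):
    """Directly evaluate the arrangement produced by every subset of the moves."""
    best_subset, best_cost, evaluated = (), cost(segment_of, weights), 0
    for size in range(len(moves) + 1):
        for subset in combinations(moves, size):
            evaluated += 1
            c = cost(apply_all(segment_of, subset), weights)
            if c < best_cost:
                best_subset, best_cost = subset, c
    return best_subset, best_cost, evaluated


initial = {"u1": 0, "u2": 1, "u3": 2}
weights = {("u1", "u2"): 5, ("u2", "u3"): 4}    # assumed co-access weights
moves = [("u1", "u2"), ("u2", "u3")]            # relocation processes #1 and #2
subset, subset_cost, evaluated = best_combination(initial, weights, moves)
```

For N = 2 only four arrangements are evaluated, but the count doubles with each additional relocation process, so N = 30 would already require more than a billion evaluations.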