Compute clusters, or groups of linked computers, have been widely used to improve performance over that provided by a single computer, especially for complex computations such as simulations of complex physical phenomena. As shown in FIG. 1 (PRIOR ART), the elements of the computer cluster, such as computer nodes 10, are linked by a high speed network 12 that allows compute resources and memory to be shared. Data transfers to or from the cluster elements are performed through the high speed network 12 and are managed by additional compute devices, referred to herein as data servers 14. The computer cluster may assume either the compute state (cycle) or the I/O (input/output) state (cycle), which are typically mutually exclusive. The process of moving data is carried out during the I/O cycle of the computer nodes 10, i.e., the data transfers are executed during a time interval when the compute activity has ceased.
In general, simulations of physical phenomena, and other complex computations, may run for an extended period of time lasting, for example, hours or days. During the execution of a simulation, “checkpoint” data is written into the program so that, if the application software or hardware should fail, it is possible to restore the simulation from the “checkpoint”. This “checkpoint” changes the state of the computer cluster from the compute state to the I/O state (cycle), so that the caches of the computer nodes 10 are written to the attached data servers 14, which place the data in an orderly file system for subsequent retrieval. Since no actual computation occurs during the I/O cycle, it is important to keep the I/O cycle as short as possible to maximize the overall compute duty cycle of the computer cluster.
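The relationship between the length of the I/O cycle and the compute duty cycle may be illustrated by the following minimal sketch. It assumes that the compute and I/O cycles strictly alternate; the function name and the interval lengths are hypothetical and are chosen only for illustration.

```python
def compute_duty_cycle(compute_seconds: float, io_seconds: float) -> float:
    """Fraction of wall-clock time spent computing, assuming the compute
    and I/O cycles alternate and never overlap."""
    total = compute_seconds + io_seconds
    if total <= 0:
        raise ValueError("total cycle time must be positive")
    return compute_seconds / total


# Hypothetical figures: one checkpoint after every 3600 s of computation.
slow_checkpoint = compute_duty_cycle(3600.0, 400.0)  # 400 s I/O cycle
fast_checkpoint = compute_duty_cycle(3600.0, 100.0)  # 100 s I/O cycle

print(f"duty cycle with 400 s I/O cycle: {slow_checkpoint:.3f}")
print(f"duty cycle with 100 s I/O cycle: {fast_checkpoint:.3f}")
```

Under these assumed figures, shortening the I/O cycle raises the fraction of time available for computation, which is the motivation for minimizing the checkpoint duration.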
The ratio of compute elements to servers is often very large and may exceed 1000 to 1 in some implementations. The servers file data from multiple computer nodes and assign a unique location for each computer node in the overall file system. Typically, the data is stored on rotating media such as common disk drives 16.
Since multiple compute devices may require access to the servers at the same time, the resulting data accesses appear largely random, as the servers satisfy the requests of the compute devices in the order in which the requests are received. Disk drives operate in this scheme as “push” devices, and must store the data on demand. Disk drives do not perform well in satisfying random requests, since the heads that record the data have to be moved to various sectors of the drive, and this movement takes a great deal of time compared to the actual write or read operation. To “work around” this problem, a large number of disk drives may be utilized. The drives are accessed by a control system 18, which schedules disk operations in an attempt to spread the random activity over the large number of disk drives and thereby diminish the effects of disk head movement, also known as “seeking”.
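The penalty that head movement imposes on random requests may be illustrated by the following simplified model, in which each request incurs a fixed positioning delay before the data is transferred. The model, the function name, and all numeric values are assumptions made for illustration only and do not describe any particular disk drive.

```python
def effective_throughput_mb_s(request_mb: float,
                              transfer_mb_s: float,
                              positioning_ms: float) -> float:
    """Effective throughput of one drive when every request pays a fixed
    head-positioning delay before its data is transferred."""
    transfer_s = request_mb / transfer_mb_s        # time spent moving data
    service_s = positioning_ms / 1000.0 + transfer_s  # positioning + transfer
    return request_mb / service_s


# Hypothetical drive: 100 MB/s media rate, 1 MB requests.
sequential = effective_throughput_mb_s(1.0, 100.0, positioning_ms=0.0)
random_access = effective_throughput_mb_s(1.0, 100.0, positioning_ms=10.0)

print(f"sequential: {sequential:.1f} MB/s")
print(f"random:     {random_access:.1f} MB/s")
```

In this model, the positioning delay is comparable to the transfer time itself, so the throughput delivered under random access falls well below the media rate. Spreading requests over many drives compensates for this loss only by adding hardware.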
The size of compute clusters and the aggregate I/O bandwidths that must be supported may require thousands of disk drives for the control architecture to minimize the duration of the I/O cycle. Although thousands of disk drives are usually powered to support data transfers, the I/O activity itself occupies only a short period of the overall “active” time of the disk system. The write activity may occupy, for example, less than 10% of the cluster's total operational time, yet all the disk drives remain powered in expectation of this activity.
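The energy consumed by drives that are powered but not transferring data may be estimated by the following sketch. It assumes, as a simplification, that a drive draws the same power whether idle or active; the function name, drive count, per-drive power, and time period are hypothetical values used only for illustration.

```python
def idle_energy_kwh(num_drives: int,
                    watts_per_drive: float,
                    hours: float,
                    io_duty_cycle: float) -> float:
    """Energy (kWh) consumed while the drives are powered but not performing
    I/O, assuming equal power draw in the idle and active states."""
    if not 0.0 <= io_duty_cycle <= 1.0:
        raise ValueError("io_duty_cycle must lie between 0 and 1")
    total_kwh = num_drives * watts_per_drive * hours / 1000.0
    return total_kwh * (1.0 - io_duty_cycle)


# Hypothetical system: 1000 drives at 8 W each, powered for 24 hours,
# with write activity occupying 10% of the time.
wasted = idle_energy_kwh(1000, 8.0, 24.0, io_duty_cycle=0.10)
print(f"energy consumed outside I/O activity: {wasted:.1f} kWh")
```

With a 10% I/O duty cycle, nine tenths of the energy in this model is consumed while the drives are merely waiting for the next I/O cycle.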
It would therefore be beneficial to provide a data migration technique between the compute cluster architecture and the disk drives in which a reduced number of disk drives is needed for data storage, wherein a shortened I/O cycle of the high performance compute cluster architecture may be attained, and wherein the effective aggregate I/O bandwidth of the disk drive operation may be provided without excessive power consumption.