The present invention relates to data storage technology, and more specifically, to a method and apparatus for data storage.
Data Input/Output (IO) rate is one of the main performance indicators of storage devices. Modern storage devices are generally heterogeneous storage devices, i.e., storage devices containing different storage mediums. The most common storage mediums are the Hard Disk Drive (HDD), based on magnetic disk technology, and the Solid State Disk (SSD), based on flash memory technology. The data IO rates that an HDD can support are limited by the rates of mechanical operations, such as disk rotation and magnetic head movement. An SSD has much higher data IO rates than an HDD because it avoids such mechanical operations. On the other hand, an SSD has a higher cost than an HDD, and thus may be suitable only for storing smaller amounts of data. A storage device may further include a storage controller for controlling data allocation among the different storage mediums.
For such heterogeneous storage devices, traditional optimization methods for homogeneous storage devices are not applicable. A homogeneous storage device uses a single type of storage medium, while a heterogeneous storage device has different storage mediums with significant performance and cost differences. Thus, it may be necessary to allocate data with different properties to different storage mediums based on a comprehensive consideration of data properties, storage medium performance, and storage medium cost, so as to balance performance against cost and thereby improve storage efficiency. As a fundamental principle, a small amount of data having a higher access frequency should be stored on an SSD, and a large amount of data having a lower access frequency should be stored on an HDD. Whether specific data should be stored on an SSD or an HDD may be determined by a system administrator based on experience. In addition, as data access frequency changes, data may be reallocated between HDD and SSD; that is, data whose access frequency has become higher may be reallocated from HDD to SSD, and data whose access frequency has become lower may be reallocated from SSD to HDD.
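For illustration only, the fundamental principle described above may be sketched as a simple threshold-based placement routine. The sketch below is not part of the disclosed method; the names `Extent` and `reallocate`, the `hot_threshold` parameter, and the use of accesses per hour as the measure of access frequency are all hypothetical assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """A unit of stored data (hypothetical structure for illustration)."""
    name: str
    size_gb: float
    accesses_per_hour: float
    tier: str  # current location: "SSD" or "HDD"


def reallocate(extents, ssd_capacity_gb, hot_threshold):
    """Apply the naive principle: frequently accessed data goes to SSD
    while SSD capacity remains; all other data goes to HDD.

    Returns a list of (name, old_tier, new_tier) tuples for data that moved.
    """
    free_gb = ssd_capacity_gb
    moves = []
    # Consider the most frequently accessed data first.
    ordered = sorted(extents, key=lambda e: e.accesses_per_hour, reverse=True)
    for extent in ordered:
        is_hot = extent.accesses_per_hour >= hot_threshold
        if is_hot and extent.size_gb <= free_gb:
            target = "SSD"
            free_gb -= extent.size_gb
        else:
            target = "HDD"
        if target != extent.tier:
            moves.append((extent.name, extent.tier, target))
            extent.tier = target
    return moves
```

In such a sketch, the threshold and the SSD capacity would have to be chosen by an administrator, and rerunning the routine as access frequencies change corresponds to the reallocation between HDD and SSD described above.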
Given the complexity of the data stored in a storage device, simply allocating data based on the above fundamental principle may not improve storage efficiency effectively. Thus, a new method for allocating data among the different storage mediums of a storage device is desired.