The present disclosure relates generally to data storage systems, and more particularly to methods and computer program products for selecting a new redundancy scheme for data relocation.
Multi-tiered data storage systems are increasingly being used to reduce storage costs. Each data storage tier includes a different type of storage device, e.g., hard disk drive (HDD), solid state drive (SSD), magnetic tape, optical disc, or cloud storage. The different storage devices differ not only in their cost, accessibility, speed, and performance characteristics, but also in their reliability characteristics. Therefore, a different redundancy scheme may be required when data is moved from one data storage tier to another to ensure the same level of protection for the data against device failures.
Depending on the application, data may be frequently moved from one data storage tier to another for reasons of performance, accessibility, security, and reliability. For example, data stored on an SSD can be accessed much more quickly than data stored on an HDD. Therefore, data on an HDD may be moved to an SSD to increase the access speed. If the data had a certain redundancy scheme when stored on HDDs, e.g., 3-way replication, it has a certain level of protection, as assessed by an appropriate reliability measure such as the mean time to data loss (MTTDL) or the expected annual fraction of data loss (EAFDL). When it is required that there be at least one copy of the data on an SSD, e.g., for performance reasons, several options are available, e.g., move all 3 copies of the data from the HDDs to SSDs, or move 1 copy to an SSD but keep the remaining 2 copies on HDDs. Another option is to store the data on SSDs using a Reed-Solomon (RS) erasure code while keeping one copy on an HDD.
Each of these options results in a different level of protection for the data, a different level of storage efficiency, and a different level of performance. Therefore, it is desirable to have a storage system that can automatically change the redundancy scheme when data is moved from one data storage tier to another by considering various factors including data reliability, storage efficiency, performance, and effort required to re-encode data.
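By way of illustration only, the trade-off among the options described above may be sketched as follows. The sketch uses two simple proxies: the raw storage consumed per byte of user data on each tier, and the number of arbitrary device failures that can be survived without data loss. These proxies are not the MTTDL or EAFDL measures, which additionally depend on device failure and rebuild rates. The class and function names, and the RS parameters k = 4 and m = 2, are hypothetical and are chosen only for this example; each copy or shard is assumed to reside on a distinct device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """One candidate redundancy scheme (hypothetical structure for illustration)."""
    name: str
    ssd_overhead: float      # raw bytes stored on SSD per byte of user data
    hdd_overhead: float      # raw bytes stored on HDD per byte of user data
    failures_tolerated: int  # arbitrary device failures survived without data loss

    @property
    def total_overhead(self) -> float:
        return self.ssd_overhead + self.hdd_overhead


def replication(name: str, ssd_copies: int, hdd_copies: int) -> Placement:
    # With n full copies on distinct devices, data survives any n - 1 failures.
    copies = ssd_copies + hdd_copies
    return Placement(name, float(ssd_copies), float(hdd_copies), copies - 1)


def rs_on_ssd_plus_hdd_copy(name: str, k: int, m: int) -> Placement:
    # RS(k, m): k data shards plus m parity shards; any k shards rebuild the data.
    # Data is lost only if the HDD copy AND more than m shards are lost,
    # so any m + 1 device failures are survivable.
    return Placement(name, (k + m) / k, 1.0, m + 1)


# Candidate options mirroring the examples in the text (RS parameters assumed).
OPTIONS = [
    replication("3 copies on HDD (original)", ssd_copies=0, hdd_copies=3),
    replication("3 copies on SSD", ssd_copies=3, hdd_copies=0),
    replication("1 copy on SSD + 2 on HDD", ssd_copies=1, hdd_copies=2),
    rs_on_ssd_plus_hdd_copy("RS(4,2) on SSD + 1 copy on HDD", k=4, m=2),
]

for opt in OPTIONS:
    print(f"{opt.name:32s} SSD x{opt.ssd_overhead:.2f}  HDD x{opt.hdd_overhead:.2f}  "
          f"total x{opt.total_overhead:.2f}  tolerates {opt.failures_tolerated} failures")
```

Under these assumptions, the three replication options consume the same total raw capacity and tolerate the same number of failures, differing only in how much of that capacity resides on the more expensive tier, whereas the erasure-coded option tolerates one additional failure with less total capacity at the cost of re-encoding the data.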
Accordingly, a heretofore unaddressed need exists in the art to address the aforementioned deficiencies and inadequacies.