The present invention relates to data storage, and more specifically, this invention relates to systems, computer program products and methods for improving the placement of blocks, e.g. data blocks and parity blocks, in a deduplication-erasure code environment while ensuring/preserving erasure code semantics.
Data deduplication is a storage concept involving the elimination or reduction of redundant/duplicate data stored or communicated in a data storage system. Accordingly, data deduplication facilitates a reduction in storage requirements as only unique data is stored.
Data deduplication may be performed on in-band traffic, e.g. in real time as data is being communicated and/or written to a storage device, and is referred to as inline deduplication. Data deduplication may also occur in post-processing, which occurs after data is transferred to and/or stored upon a storage device. While post-processing data deduplication may shorten the time for data transfer, it requires a full backup to be stored temporarily, thus defeating the storage benefits of deduplication.
Additionally, data deduplication may be performed at the file or at the block (e.g. sub-file) level, thereby defining the minimal data fragment that is checked for redundancy. File level deduplication typically eliminates identical files within or across systems. Conversely, block deduplication looks within a file and saves unique iterations of each block or bit. For instance, with block level deduplication, a file may be split into blocks before being written to and/or stored upon a storage device, and each individual block associated with the file that is found to be a duplicate (e.g. redundant) may be deduplicated.
As mentioned above, in conventional deduplication techniques, an incoming data segment may be divided into sub-units of a desired granularity, e.g. blocks, files, etc. A hash value for each defined sub-unit may then be calculated, where each hash value uniquely identifies the data content of its respective sub-unit. The calculated hash values may then be compared to hash values for existing sub-units already stored in the data storage system, e.g. stored in a storage pool comprising one or more storage devices. For each sub-unit having a duplicate thereof (e.g. a file or block whose calculated hash value matches a hash value for a stored file or block, respectively), the data storage system may store and/or communicate a reference to the stored sub-unit instead of communicating and/or storing the duplicate sub-unit. Non-duplicate sub-units, e.g. sub-units whose calculated hash values do not match any hash values for the stored sub-units, may be communicated and/or stored to one or more storage devices in the data storage system.
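By way of illustration only, the hash-based comparison described above may be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the fixed block size, the choice of SHA-256, the in-memory dictionary standing in for the storage pool, and the names deduplicate and reconstruct are all hypothetical and do not appear in the description above.

```python
import hashlib


def deduplicate(data: bytes, block_size: int, store: dict) -> list:
    """Split data into fixed-size blocks and store only blocks not yet seen.

    `store` maps a hash value to the block content and stands in for the
    storage pool (an assumption for illustration). The returned list holds
    one hash reference per block, in order.
    """
    refs = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            # Non-duplicate sub-unit: store the block itself.
            store[digest] = block
        # Duplicate or not, the segment keeps only a reference.
        refs.append(digest)
    return refs


def reconstruct(refs: list, store: dict) -> bytes:
    """Rebuild the original data segment from its block references."""
    return b"".join(store[digest] for digest in refs)
```

For instance, deduplicating the 16-byte segment b"AAAABBBBAAAACCCC" with a block size of 4 yields four references but only three stored blocks, since the first and third blocks are identical.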
However, deduplication techniques may alter the reliability of data segments. For example, the reliability of a given data segment is based on the reliability of its constituent sub-units, e.g. files, blocks, etc., which may be divided across a plurality of storage devices. Accordingly, data storage systems may implement certain recovery schemes to ensure efficiency (e.g. with regard to the amount of any extra data generated) and reliability (e.g. the extent to which the recovery scheme can recover and/or access lost or compromised data).
Replication is one such recovery scheme, where data is replicated two, three, or more times to different storage devices in the data storage system. A disadvantage of the replication technique is the cost of storing replicate data. Large storage costs may translate to high costs in hardware, as well as high costs in operating the storage system, e.g. cooling, maintenance, etc. Moreover, preserving reliability with known replication schemes may not be effective. For instance, suppose an application block is divided into 4 original blocks (A1, A2, A3, A4) that will be placed on 4 disks. After one replication, 4 replicate blocks (A1copy, A2copy, A3copy, A4copy) will be placed on 4 additional disks. If the disk storing an original block, e.g. A1, and the disk storing its replicate, e.g. A1copy, both fail, the data associated with the entire application block would be lost. In other words, this exemplary replication scheme may not be able to sustain more than one disk failure.
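The replication example above may be checked by enumerating disk failures. The following is a minimal sketch, assuming eight disks numbered 0 through 7 with one block per disk; the disk numbering and the names placement, survives and fatal_pairs are hypothetical and serve only to illustrate the reasoning.

```python
from itertools import combinations

ORIGINALS = ["A1", "A2", "A3", "A4"]

# Disks 0-3 hold the original blocks, disks 4-7 hold the replicates
# (assumed layout; maps disk number to the block content it carries).
placement = {i: name for i, name in enumerate(ORIGINALS)}
placement.update({i + 4: name for i, name in enumerate(ORIGINALS)})


def survives(failed_disks: set) -> bool:
    """True if every original block still has a copy on a working disk."""
    remaining = {block for disk, block in placement.items()
                 if disk not in failed_disks}
    return set(ORIGINALS) <= remaining


# Every two-disk failure that loses the application block.
fatal_pairs = [pair for pair in combinations(placement, 2)
               if not survives(set(pair))]
```

Under these assumptions every single-disk failure is survivable, but 4 of the 28 possible two-disk failures, namely those that take out a block together with its replicate, lose the application block.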
Another recovery scheme utilizes redundancy parity encoding, which involves the creation/implementation of erasure codes. Redundancy parity encoding typically involves encoding an original data segment in such a way as to generate and embed data redundancies within the data segment. If the data segment is compromised or lost, e.g. from a disk failure, these redundancies would allow recovery of the data or portions thereof. For example, consider a data segment that is divided into m original data blocks. An erasure code, a forward error correction (FEC) code for a binary erasure channel, encodes these m original data blocks into n total blocks, where n>m. Thus, the data segment includes the m original data blocks and n−m parity blocks. The erasure code encoding rate (ER) is given by ER=m/n<1. The key property of erasure codes is that all original data can be recovered when there are no more than n−m failures. Stated another way, if any of the n blocks are corrupted, m is the number of verified blocks required to reconstruct/recover the original data. Accordingly, erasure codes provide redundancy without the overhead of strict replication.
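The recovery property described above may be illustrated with the simplest possible case, a single XOR parity block, for which n = m + 1 and at most one failure can be tolerated. This is a minimal sketch under that assumption and is not the encoding of any particular embodiment; codes tolerating more than one failure (n−m > 1), such as Reed-Solomon codes, require more elaborate arithmetic. The names xor_blocks, encode and recover are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))


def encode(data_blocks: list) -> list:
    """Encode m data blocks into n = m + 1 blocks by appending one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(blocks: list) -> list:
    """Return the m original data blocks from n blocks, at most one being None.

    A lost block is marked with None. Because the XOR of all n blocks is
    zero, the missing block equals the XOR of the m surviving blocks.
    """
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates at most n - m = 1 failure")
    blocks = list(blocks)
    if missing:
        survivors = [block for block in blocks if block is not None]
        blocks[missing[0]] = xor_blocks(survivors)
    return blocks[:-1]  # drop the parity block
```

With m = 3 data blocks this sketch produces n = 4 total blocks, giving an encoding rate of ER = 3/4, and any one of the four blocks may be lost without losing the original data. Storing the same data with one full replica would instead require 6 blocks.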