Users want to conveniently and reliably store and communicate electronic data. Data that is stored or communicated may be error-free or may experience errors. More and more data is stored by users in offsite locations. In many instances, the offsite location is a cloud-based storage system. Conventionally, a cloud-based storage system may include a plurality of hard disk drives, tape drives, or combinations of hard disk drives and tape drives. Users have therefore developed different techniques to protect data stored to cloud-based hard disk drives and tape drives. For example, data may be protected against storage media failures or other loss by storing extra copies in additional locations or by storing additional redundant information from which data may be reconstructed. One type of redundancy-based protection involves using error correcting codes (ECC). Error correcting codes create additional redundant data to produce code symbols that protect against ‘erasures’, in which data portions are lost or corrupted, by allowing those portions to be reconstructed from the surviving data. Adding redundancy introduces computing overhead to produce the codes and also introduces overhead for additional storage capacity or transmission bandwidth, which in turn adds cost. The overhead added by error correcting code processing tends to increase as the protection provided increases.
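As a minimal illustration of reconstructing an erased data portion from surviving data, the following sketch uses single XOR parity, one of the simplest erasure codes. It is not drawn from any particular storage system; the function names (`make_parity`, `recover`) and the three-block layout are assumptions chosen for illustration only.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(blocks):
    """Produce one redundant block: the XOR of all data blocks."""
    return reduce(xor_bytes, blocks)


def recover(blocks, parity):
    """Rebuild the single erased block (marked as None) from the survivors.

    Single parity tolerates exactly one erasure; stronger codes
    tolerate more at the cost of more redundancy.
    """
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one erased block")
    survivors = [b for b in blocks if b is not None]
    return reduce(xor_bytes, survivors, parity)


# Hypothetical data: three equal-length data blocks plus one parity block.
data = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(data)

# Simulate an erasure of the second block and rebuild it.
damaged = [data[0], None, data[2]]
restored = recover(damaged, parity)
print(restored)  # b'efgh'
```

In this sketch, one parity block protects three data blocks, so the storage overhead is one third. Tolerating more simultaneous erasures would require more redundant blocks and more computation, which is the trade-off between protection and overhead described above.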
Cloud-based storage systems may employ flash memory organized in solid state drives (SSD). SSDs are typically faster than hard disk drives and tape, but slower than random access memory (RAM). As the cost of SSD technology decreases, more and more SSDs are being added to cloud-based storage systems, both as a supplement to hard disk drives, and as a replacement for hard disk drives. For example, a cloud-based storage system may include a combination of hard disk drives for short-term storage, tape drives for long-term storage, RAM to store indexes and buffers, and SSDs to act as a bridge between the RAM and the hard disk drives. In other examples, the primary storage medium for a cloud-based storage system may be an SSD or a plurality of SSDs. SSDs may comprise NAND-type flash memory.
Solid-state devices that include flash memory operate under different principles than disk drives or tape drives. SSDs are typically organized using blocks of pages. A page may be a relatively small plurality of bytes. For example, a page may be 512 bytes or 16 KB. A block may contain a number of pages. For example, a block may include 8, 16, 32, 128, or other numbers of pages. Unlike hard disks, which can read and write individual sectors, SSDs support only page-level reads and writes. When writing data, SSDs support sequential writes within a block, unlike the more random write access available to disks. Similarly, SSDs support only block-level erases, unlike hard disks, which can rewrite individual sectors in place.
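The page and block constraints described above can be sketched with a small in-memory model. This is an illustrative simplification under stated assumptions, not a model of any real SSD controller; the class name `FlashBlock`, its default sizes, and the padding of short writes with 0xFF bytes are all hypothetical choices.

```python
class FlashBlock:
    """Toy model of one flash block: page-level reads and writes,
    sequential writes within the block, and whole-block erases."""

    def __init__(self, pages_per_block: int = 8, page_size: int = 512):
        self.page_size = page_size
        self.pages = [None] * pages_per_block  # None means erased
        self.next_page = 0  # write pointer; pages are filled in order

    def write_page(self, data: bytes) -> int:
        """Write one full page at the next sequential position."""
        if len(data) > self.page_size:
            raise ValueError("data does not fit in one page")
        if self.next_page >= len(self.pages):
            raise RuntimeError("block is full; erase the block first")
        index = self.next_page
        # Pad to a whole page (assumption: erased flash reads as 0xFF).
        self.pages[index] = data.ljust(self.page_size, b"\xff")
        self.next_page += 1
        return index

    def read_page(self, index: int) -> bytes:
        """Read one whole page; smaller units are not addressable."""
        page = self.pages[index]
        if page is None:
            raise ValueError("page has not been written since last erase")
        return page

    def erase(self) -> None:
        """Erase every page in the block at once."""
        self.pages = [None] * len(self.pages)
        self.next_page = 0


# Usage: a tiny block with two 4-byte pages.
block = FlashBlock(pages_per_block=2, page_size=4)
first = block.write_page(b"ab")
second = block.write_page(b"cdef")
print(first, second)  # 0 1
```

Note that the model offers no way to overwrite or erase a single page: updating data already written requires erasing the whole block, which is the key difference from sector-addressable hard disks.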
Consequently, SSDs fail differently than disk drives or tape drives. Disk drives and tape drives have a large number of moving parts, and a mechanical failure among those parts may render the entire drive useless. SSDs may fail, for example, when individual flash memory locations reach a threshold number of writes, or when writes to one location in the SSD corrupt other, physically close locations. The SSD failure may not render the entire device useless. Conventional cloud-based storage systems may employ different fixed approaches that are optimized to protect data stored to hard disk drives and tape drives. These approaches may include ECCs tailored for use with data stored to hard disks or tape drives. However, conventional systems using these fixed approaches may not be optimal for use with SSDs. In a NAND-based SSD, a write failure typically affects the entire page because an SSD can only read or write at the page level and can only erase at the block level. By contrast, if a portion within a sector of a hard disk fails, the remaining ECC bytes read from the sector may be used to correct the defective bytes, and other sectors on the disk are not affected. Therefore, a data protection approach tailored to a hard disk may be sub-optimal for an SSD.