1. Field of the Invention
The present invention relates to storage clusters and, more particularly, to a storage cluster and method that efficiently store small objects with erasure codes.
2. Description of the Related Art
A storage cluster is a group of hard disk drives that, along with a controller, permanently store digital files, which are often known as objects.
Permanent storage differs from day-to-day storage in that permanent storage must be able to tolerate multiple hard disk drive failures without losing any of the objects that have been stored.
One conventional approach to permanent storage is known as replication. With replication, an object is copied in its entirety onto several hard disk drives. For example, if an object is copied onto three hard disk drives and two of the hard disk drives fail, then the object can be completely recovered from the copy on the third hard disk drive.
Although replication can reduce the statistical likelihood of losing an object to near zero, one drawback is that it requires a large amount of storage space. For example, if an object is copied onto three hard disk drives, then the effective storage space of the storage cluster is only ⅓ of the total storage space.
Another conventional approach to permanent storage, which requires substantially less storage space than replication, is to store the objects with erasure codes. An erasure code breaks an object into k fragments, or chunks, which are then encoded (using, for example, a maximum distance separable (MDS) code) into n chunks of the same size, where n is greater than k and any k of the n chunks are enough to recover the complete object. The n chunks are then stored on n hard disk drives, so the object survives the failure of up to n − k of those drives.
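The any-k-of-n property can be illustrated with the simplest MDS code, a single XOR parity chunk (n = k + 1). This is only a minimal sketch and not the code used by any particular storage cluster; the function names `encode` and `decode`, the zero padding, and the choice of k = 4 are assumptions made for illustration. Practical clusters typically use codes such as Reed-Solomon that tolerate more than one lost chunk.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(obj: bytes, k: int) -> List[bytes]:
    """Split obj into k equal-size chunks (zero-padded) and append one XOR parity chunk."""
    chunk_len = -(-len(obj) // k)  # ceiling division
    padded = obj.ljust(chunk_len * k, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    parity = reduce(xor_bytes, chunks)
    return chunks + [parity]  # n = k + 1 chunks, one per drive


def decode(chunks: List[Optional[bytes]], k: int, obj_len: int) -> bytes:
    """Recover the object from n = k + 1 chunk slots, at most one of which is None."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("a single-parity code tolerates only one lost chunk")
    chunks = list(chunks)
    if missing:
        # XOR of the k surviving chunks reproduces the lost one.
        present = [c for c in chunks if c is not None]
        chunks[missing[0]] = reduce(xor_bytes, present)
    return b"".join(chunks[:k])[:obj_len]  # drop parity and padding


# Usage: lose any one of the five chunks and still recover the object.
obj = b"hello erasure codes"
stored = encode(obj, 4)
for lost in range(len(stored)):
    damaged = list(stored)
    damaged[lost] = None  # simulate one failed drive
    assert decode(damaged, 4, len(obj)) == obj
```

With k = 4 and n = 5, the stored chunks occupy n/k = 1.25 times the padded object size, compared with 3 times for three-way replication.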
One common approach to permanently storing objects with erasure codes is to temporarily store the objects with replication in a number of replication storage spaces on a number of hard disk drives and then, when the system has spare resources or at predefined times, chunk the objects, encode the chunks, and store the encoded chunks on the hard disk drives. With this replicate-then-encode approach, small objects can be combined into larger units before encoding, which makes both the encoding and the hard disk drive usage more efficient.
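One way to picture how small objects might be combined before encoding is a buffer that concatenates them until a target size is reached and records where each object sits. The sketch below is an assumption about how such packing could be done, not a description of any specific system; the `Batch` class, the `pack` function, and the 1024-byte target are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Batch:
    """Hypothetical container that concatenates small objects before encoding."""
    target_size: int
    data: bytearray = field(default_factory=bytearray)
    index: dict = field(default_factory=dict)  # object id -> (offset, length)

    def add(self, obj_id: str, payload: bytes) -> None:
        # Remember where the object starts so it can be read back later.
        self.index[obj_id] = (len(self.data), len(payload))
        self.data.extend(payload)

    def is_full(self) -> bool:
        return len(self.data) >= self.target_size

    def read(self, obj_id: str) -> bytes:
        offset, length = self.index[obj_id]
        return bytes(self.data[offset:offset + length])


def pack(objects, target_size):
    """Group (obj_id, payload) pairs into batches of at least target_size bytes.

    The final batch may be smaller than target_size.
    """
    batches, current = [], Batch(target_size)
    for obj_id, payload in objects:
        current.add(obj_id, payload)
        if current.is_full():
            batches.append(current)
            current = Batch(target_size)
    if current.data:
        batches.append(current)
    return batches


# Usage: ten 300-byte objects packed toward a 1024-byte target.
objects = [(f"obj{i}", bytes([i]) * 300) for i in range(10)]
batches = pack(objects, 1024)
assert batches[1].read("obj5") == bytes([5]) * 300
```

Each full batch would then be chunked and encoded as a single large unit, so the encoded chunks are large even though the individual objects are small.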
After the temporarily-stored objects have been chunked, encoded, and stored on the hard disk drives, the replication storage spaces are reused to temporarily store new copies of objects. Since the replication storage space is reusable, the storage space required for replication with this approach is much smaller than the storage space required for straight replication. Even so, the replicate-then-encode approach still requires large replication storage spaces.
Another common approach to permanently storing objects with erasure codes is to chunk the objects, encode the chunks, and store the encoded chunks on the hard disk drives as the objects are received by the storage cluster. Since no replication is used with this encode-now approach, this approach requires much less storage space than the previous replicate-then-encode approach.
However, one disadvantage of the encode-now approach is that small objects are chunked into very small sizes, encoded, and permanently stored on the hard disk drives within the storage cluster. Very small encoded chunks are undesirable because it is very inefficient to store and repair very small encoded chunks on the long circular tracks of the hard disk drives.
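The effect on chunk size follows from simple arithmetic: an object split into k chunks yields chunks of roughly 1/k of its size. The sketch below illustrates this; the helper name `chunk_size`, the value k = 10, and the 4 KiB and 64 MiB sizes are illustrative assumptions, not figures from any particular cluster.

```python
def chunk_size(object_size: int, k: int) -> int:
    """Size in bytes of each of the k chunks, rounded up so the chunks cover the object."""
    return -(-object_size // k)  # ceiling division


K = 10  # assumed number of data chunks per object

small_object = 4 * 1024            # a 4 KiB object encoded on arrival
large_unit = 64 * 1024 * 1024      # a 64 MiB unit, for comparison

print(chunk_size(small_object, K))  # 410 bytes per chunk
print(chunk_size(large_unit, K))    # 6710887 bytes per chunk
```

Under these assumptions, encoding the small object on arrival leaves each drive holding a chunk of only a few hundred bytes, whereas the larger unit yields chunks of several megabytes.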
Thus, there is a need for a storage cluster that more efficiently stores small objects with erasure codes than the encode-now approach without requiring substantially more storage space.