1. Field of the Invention
The present invention relates to storage systems. In particular, the present invention relates to a method for configuring an array of storage units for increasing the number of storage-unit failures that the array can tolerate without loss of data stored on the array.
2. Description of the Related Art
The following definitions are used herein and are offered for purposes of illustration and not limitation:
An “element” is a block of data on a storage unit.
A “base array” is a set of elements that together form an array unit for an error-correcting code (ECC).
An “array” is a set of storage units that holds one or more base arrays.
A “stripe” is a base array within an array.
n is the number of data units in the base array.
r is the number of redundant units in the base array.
m is the number of storage units in the array.
d is the minimum Hamming distance of the base array.
D is the minimum Hamming distance of the array.
In a conventional array, the number of storage units in the array equals the number of data units in a base array plus the number of redundant units in the base array. That is, m=n+r. Most conventional storage arrays use a Maximum Distance Separable (MDS) code, such as parity, or a mirroring technique for tolerating failures. The minimum Hamming distance of a base array using an MDS code equals one plus the number of redundant units in the base array (i.e., d=1+r). For a mirror configuration, the number of redundant units in the base array equals the number of data units in the base array (r=n=1), and the minimum Hamming distance is d=2.
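As an illustration of the relation d=1+r, the following minimal sketch uses single parity (r=1, hence d=2) over n=3 data units and rebuilds one lost unit from the survivors. The helper name `xor_parity` and the sample byte values are assumptions made for this example only.

```python
from functools import reduce

def xor_parity(blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical base array: n = 3 data units, r = 1 redundant (parity) unit.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_parity(data)

# Simulate the loss of one unit (data[1]); d = 1 + r = 2 means a single
# erasure is recoverable from the remaining units.
recovered = xor_parity([data[0], data[2], parity])
assert recovered == data[1]
```

A second simultaneous loss could not be rebuilt with this code, which is consistent with a minimum distance of two.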
It is possible to anamorphically encode an array over m storage units, where m is greater than the number of data units n in the base array plus the number of redundant units r in the base array, that is, m>n+r. In the literature, when an anamorphic encoding is used for arranging parity blocks for performance, such an encoding is typically referred to as “de-clustering parity.” As used herein, such an encoding scheme is referred to as an anamorphic encoding scheme because that term more accurately conveys that the encoding scheme can provide new properties for an array.
Anamorphism is achieved by selectively arranging a set of base arrays within an array. For example, consider the exemplary array 200 shown in FIG. 2 that uses a four-element code. Array 200 includes six storage units D1-D6 depicted in a columnar form. For array 200, m=6. Array 200 also includes several base arrays that are each formed from n data units plus r redundant units. That is, for each base array, n+r=4. The respective base arrays are numbered sequentially as stripes 1-3 in FIG. 2 to indicate that the four-element code of array 200 is spread across storage units D1-D6. There are four elements in each stripe and each stripe acts as an independent base array. The minimum distance of the array is, accordingly, the minimum of all the minimum Hamming distances of the respective stripes, that is, D=min(di), where di is the minimum distance of stripe i.
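The arrangement described above can be sketched as follows. FIG. 2 is not reproduced here, so the round-robin placement produced by the hypothetical helper `round_robin_layout` is only an assumed layout with the stated parameters (m=6, three stripes, n+r=4); the values n=2 and r=2 and the use of an MDS code are likewise assumptions for the example.

```python
def round_robin_layout(m, width, stripes):
    """Map each stripe number to the storage units (1-based) holding its elements.

    Assumed placement: elements are laid out consecutively across the m units,
    wrapping around, so stripe s starts where stripe s-1 ended.
    """
    return {
        s + 1: [(s * width + k) % m + 1 for k in range(width)]
        for s in range(stripes)
    }

m, n, r = 6, 2, 2                      # assumed parameters, n + r = 4
layout = round_robin_layout(m, n + r, stripes=3)

# Each stripe must place its elements on distinct storage units.
assert all(len(set(units)) == n + r for units in layout.values())

# With an MDS code, every stripe has d_i = 1 + r; the array distance is the minimum.
d = {stripe: 1 + r for stripe in layout}
D = min(d.values())

for stripe, units in layout.items():
    print(f"stripe {stripe}: units {units}, d = {d[stripe]}")
print("D =", D)
```

Because every stripe uses the same code in this sketch, D simply equals the common stripe distance; the minimum matters when stripes differ in strength.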
As configured, anamorphic array 200 can tolerate the loss of at least r storage units of a set of m storage units without loss of data, instead of exactly r units from a set of n+r storage units. Thus, if r=2 and the code used is MDS, then any two storage units can fail without loss of data. A stripe will fail if any three of its elements are lost. There are, however, some combinations of three-unit failures that can be tolerated by anamorphic array 200. For example, if storage units D1, D3 and D5 each fail, two elements of stripe 1, two elements of stripe 2 and two elements of stripe 3 are lost, but no stripe has lost three elements. Anamorphic array 200 is, thus, over-specified, and this property may be advantageously exploited.
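The failure analysis above can be checked by enumeration. The sketch below assumes a hypothetical placement of the three stripes on D1-D6 (FIG. 2 is not reproduced here, so the actual layout may differ), with r=2 and an MDS code, so that a stripe survives as long as it loses at most r elements. The names `LAYOUT` and `survives` are illustrative.

```python
from itertools import combinations

# Assumed layout: stripe number -> storage units holding its four elements.
LAYOUT = {1: {1, 2, 3, 4}, 2: {5, 6, 1, 2}, 3: {3, 4, 5, 6}}
R = 2  # redundant units per stripe (MDS assumed, so up to R losses per stripe are tolerated)

def survives(failed_units, layout=LAYOUT, r=R):
    """Return True if no stripe loses more than r elements."""
    failed = set(failed_units)
    return all(len(units & failed) <= r for units in layout.values())

units = range(1, 7)
pairs = list(combinations(units, 2))
triples = list(combinations(units, 3))
ok_pairs = [f for f in pairs if survives(f)]
ok_triples = [f for f in triples if survives(f)]

print(f"two-unit failures tolerated:   {len(ok_pairs)} of {len(pairs)}")
print(f"three-unit failures tolerated: {len(ok_triples)} of {len(triples)}")
print("D1, D3, D5 tolerated:", survives((1, 3, 5)))
```

Under this assumed layout every two-unit failure is tolerated, and the three-unit failures that are tolerated are those that take one unit from each of the pairs (D1, D2), (D3, D4) and (D5, D6).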
What is needed is a technique that enhances the minimum Hamming distance of an ECC when it is used with an anamorphic array of storage units, and thereby increases the effective minimum distance of the array.