1. Technical Field
A “code optimizer” relates to optimizing XOR-based codes for encoding and decoding data and, in particular, to various techniques for optimizing generic XOR-based codes using a unique “common operations first” (COF) approach that increases the coding efficiency of existing XOR-based codes having arbitrary levels of fault tolerance.
2. Related Art
Erasure correcting codes are often adopted by storage applications and data transmission applications to provide fault tolerance. One simple example of conventional fault-tolerant storage is a RAID array of hard drives. In a typical RAID array, complete recovery of the encoded data stored within the array is possible given the failure of one or more nodes (i.e., the individual hard drives in the array), depending upon the RAID level being used. In the data transmission scenario, fault-tolerant data transmission typically involves some level of redundancy in the transmission of data packets such that if one or more packets are lost or overly delayed, the underlying message can still be reconstructed without error.
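As a minimal illustration of the single-failure recovery described above, the following Python sketch stores one XOR parity block alongside three data blocks, in the spirit of single-parity RAID levels. The block contents, the number of nodes, and the helper name `xor_blocks` are hypothetical and chosen only for illustration; the sketch does not represent any particular RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three hypothetical data nodes, each holding one 4-byte block.
data = [b"node", b"DATA", b"1234"]

# A fourth node stores the XOR of all data blocks as parity.
parity = xor_blocks(data)

# Suppose the node holding data[1] fails. XORing the surviving data blocks
# with the parity block cancels them out and leaves the missing block.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
```

A single parity block tolerates only one failed node. Tolerating more failures requires additional, independently computed redundancy blocks.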
For conventional fault-tolerant storage applications, encoding and decoding complexity is a key concern in determining which codes to use. Conventional XOR-based codes use only XOR operations during coding computations. As such, implementation of XOR-based codes is very efficient in both hardware and software. Consequently, such codes are highly desirable in fault-tolerant storage applications. Further, as is known to those skilled in the art, any existing code defined over a finite field can be transformed into an XOR-based code.
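One well-known way to carry out such a transformation is to represent multiplication by a constant in GF(2^w) as a w x w binary matrix, so that the multiplication reduces to XORs of selected matrix columns. The Python sketch below assumes w = 3 and the primitive polynomial x^3 + x + 1. These parameters and all function names are illustrative assumptions and are not taken from the text above.

```python
W = 3          # assumed field size: GF(2^3)
POLY = 0b1011  # assumed primitive polynomial x^3 + x + 1


def gf_mul(a, b):
    """Multiply a and b in GF(2^W): shift-and-XOR, reduced modulo POLY."""
    result = 0
    for _ in range(W):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << W):
            a ^= POLY
    return result


def mul_columns(a):
    """Columns of the binary matrix for 'multiply by a'; column j is a * x^j."""
    return [gf_mul(a, 1 << j) for j in range(W)]


def xor_mul(columns, b):
    """Multiply b by the constant using XORs only.

    Each set bit j of b selects column j; the product is their XOR.
    """
    out = 0
    for j, col in enumerate(columns):
        if (b >> j) & 1:
            out ^= col
    return out


# The XOR-only form agrees with field multiplication for every pair of elements.
for a in range(1 << W):
    cols = mul_columns(a)
    for b in range(1 << W):
        assert xor_mul(cols, b) == gf_mul(a, b)
```

Because multiplication by a fixed field element is linear over GF(2), every coefficient of a finite-field code can be expanded this way, which turns the whole encoding into a sequence of XORs on bits or, in practice, on blocks of data.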
For example, one conventional XOR-based coding technique constructs XOR-based codes from Reed-Solomon codes to protect against packet losses in communication networks. Reed-Solomon codes are both well known and widely used by those skilled in the art of data encoding and decoding. One of the advantages of Reed-Solomon codes is that they are both flexible in coding parameters and capable of recovering from the maximum number of failures possible for a given amount of redundancy (the MDS or “Maximum Distance Separable” property). For these reasons, Reed-Solomon codes would appear to be natural choices for fault-tolerant data storage applications. However, the common understanding and teachings in the art have previously assumed that XOR-based Reed-Solomon codes are inefficient. This belief leads directly to the conclusion that such codes are generally inappropriate for storage applications where efficiency is an important concern, since efficiency directly corresponds to encoding and decoding speed, and thus to the overall performance of the storage system.
For these and other reasons, rather than use Reed-Solomon codes for fault-tolerant storage applications, the conventional approach over many years has been to design specific XOR-based codes for particular applications. Unfortunately, one problem with specifically designed XOR-based codes is that they are generally not very flexible. For example, XOR-based codes providing 2- or 3-fault tolerance (wherein the system can fully recover from 2 or 3 storage node failures, respectively) have been well studied and implemented in a number of conventional storage systems. However, efficient codes offering more than 2- or 3-fault tolerance are more difficult to implement, though several such coding schemes using specifically designed XOR-based codes exist.