In applications involving the transfer of data from one location to another, it is frequently necessary to make changes to the data format along the way. The data headers and the size of the packets or blocks into which the data is assembled, in particular, may be dictated by such things as memory organization and address boundaries on the one hand and by the applicable communication protocol on the other. The result is that the “natural” block size at the data source and that at the data destination may be incompatible. Consequently, the error-control coding technique used to monitor the integrity of the data at its source cannot be used to protect it all the way to its destination.
The prior-art technique for addressing this problem is to check the integrity of the data at each point at which the format has to be changed and then to re-encode the data using a coding technique that is compatible with the new format. This technique leaves the data unprotected during this transition; that is, there is generally no way of ensuring that the data has not changed between the integrity check and the calculation of the new code check.
An even more serious problem resulting from the need to check the validity of an error-control code each time the data format changes is the concomitant increase in the complexity of the software that is typically involved in shepherding the data from its source to its destination. The error-control code has to be checked each time the format changes and, if that check indicates a problem, remedial action has to be taken. This requirement greatly complicates the software, significantly increasing the time needed to develop and test it, reducing its reliability, and increasing its run time.
If, in contrast, the error-control code need be checked only at the destination, the intermediate code checks and diagnostic and recovery software can be eliminated from the main thread of execution. If a code fault is detected at the destination, the associated data block or packet is simply rejected and a separate software routine, external to the main execution thread, is used to diagnose the problem. Since data errors are generally rare and, in the vast majority of cases, are due to transient or intermittent events that are especially difficult to deal with in software, the benefit in avoiding the added performance and reliability costs of having multiple checks for such events is significant.
The conventional solution to the problem of monitoring the integrity of data across format discontinuities is shown in FIG. 1A. Data emerges from the data source 100 and is protected by an error-detecting, or an error-correcting, code. Typically, the codes used are systematic codes; that is, the data is transferred without modification followed by a code check calculated from that data and designed to expose any errors suffered by the data during the transfer. When the data with its appended code check is received at a destination, a verifier 102 regenerates the code check from the data and compares the regenerated code check to the code check received along with the data. If the two code checks are identical, the data is accepted as is. If not, the data is either rejected or corrected, depending on the type of code being used. To protect the data as it is being passed on to its destination (e.g., to a storage medium), the data is then repartitioned into blocks with an appropriate block size by a reformatter 104 and a new code check is calculated for each block by a code check generator 106 and appended to the new data block.
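The verify, reformat and re-encode sequence of FIG. 1A can be illustrated with a minimal sketch. The sketch assumes a CRC-32 as the systematic code and a 4-byte code check. Neither is specified in the text, and the function names are hypothetical labels for the verifier 102, the reformatter 104 and the code check generator 106.

```python
import zlib

CHECK_LEN = 4  # assumed: a 4-byte CRC-32 code check appended to each block


def encode_block(data: bytes) -> bytes:
    """Code check generator 106: systematic encoding, data followed by its check."""
    return data + zlib.crc32(data).to_bytes(CHECK_LEN, "big")


def verify_block(block: bytes) -> bytes:
    """Verifier 102: regenerate the code check and compare it with the received one."""
    data, received = block[:-CHECK_LEN], block[-CHECK_LEN:]
    if zlib.crc32(data).to_bytes(CHECK_LEN, "big") != received:
        raise ValueError("code check mismatch")
    return data


def reformat(data: bytes, block_size: int) -> list:
    """Reformatter 104: repartition the data into blocks of the new size."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def convert(blocks, new_size: int) -> list:
    """Check the old code, repartition, then append a new code check per block."""
    payload = b"".join(verify_block(b) for b in blocks)
    # From this point until encode_block runs, the payload carries no code check.
    return [encode_block(chunk) for chunk in reformat(payload, new_size)]
```

In this sketch the payload held inside `convert` between the call to `verify_block` and the call to `encode_block` is covered by neither the old code check nor the new one, which is the unprotected transition described above.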
FIG. 1B is a flowchart of the steps needed to implement a representative example of such a procedure. Here, data is received and its integrity checked in step 108. A decision is made in step 110 whether the data passes the integrity check. If the integrity check fails in step 110, a diagnostic and recovery routine is initiated in step 126 in an attempt to identify the problem and take remedial action (e.g., inform the data source that the data was corrupted in transmission).
Alternatively, if the data passes the integrity check as determined in step 110, the data is then reformatted and re-encoded in step 112 for transmission to a data cache, where its integrity is again checked in step 114. If, as determined in step 116, the integrity check fails, another diagnostic and recovery routine is initiated in step 128. This routine is necessarily different from the diagnostic routine initiated in step 126 and results in different remedial action.
Alternatively, if the data passes this second test as determined in step 116, the data is then reformatted and re-encoded a second time in step 118 and passed on to its next destination, in this case a storage unit, where its integrity is checked once again in step 120. A failure as determined in step 122 forces a third diagnostic and recovery routine to be called in step 130. This third diagnostic routine is specific to the new data format and code and results in yet another remedial response.
Alternatively, if the data passes this third check as determined in step 122, it is finally stored in step 124. This is the outcome in the vast majority of cases, in which the data is found to be valid at every integrity check. Still, because a data error could, however rarely, occur during any of these transfers, separate diagnostic routines, initiated in steps 126, 128 and 130, are required, one for each integrity check, each adding complexity and introducing potential bugs to the main thread of execution.
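The control flow of FIG. 1B can be sketched as follows to show how the three checks and their three distinct recovery paths accumulate in the main thread. This is an illustrative sketch only: the CRC-32 code, the function names, the returned strings and the `to_cache` and `to_storage` channel parameters are assumptions, and the reformatting in steps 112 and 118 is reduced to recomputing the code check without changing the block size.

```python
import zlib

CHECK_LEN = 4  # assumed: a 4-byte CRC-32 code check


def check(block: bytes) -> bool:
    """Integrity check: regenerate the code check and compare with the received one."""
    data, received = block[:-CHECK_LEN], block[-CHECK_LEN:]
    return zlib.crc32(data).to_bytes(CHECK_LEN, "big") == received


def reencode(block: bytes) -> bytes:
    """Stand-in for reformat and re-encode: drop the old check, append a new one."""
    data = block[:-CHECK_LEN]
    return data + zlib.crc32(data).to_bytes(CHECK_LEN, "big")


# Hypothetical placeholders for the three format-specific recovery routines.
def diagnose_source(block: bytes) -> str:
    return "step 126: inform data source"


def diagnose_cache(block: bytes) -> str:
    return "step 128: cache-format recovery"


def diagnose_storage(block: bytes) -> str:
    return "step 130: storage-format recovery"


def transfer(block, store, to_cache=lambda b: b, to_storage=lambda b: b) -> str:
    """Main thread of FIG. 1B; to_cache and to_storage model the two transfers."""
    if not check(block):                  # steps 108 and 110
        return diagnose_source(block)     # step 126
    block = to_cache(reencode(block))     # step 112
    if not check(block):                  # steps 114 and 116
        return diagnose_cache(block)      # step 128
    block = to_storage(reencode(block))   # step 118
    if not check(block):                  # steps 120 and 122
        return diagnose_storage(block)    # step 130
    store.append(block[:-CHECK_LEN])      # step 124
    return "stored"
```

Even in this reduced form, every transfer adds a check, a branch and a dedicated recovery routine to the main path, although the branches are expected to be taken only rarely.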
Therefore, there is a need for an apparatus and method that can protect data as the data is being reformatted without introducing a large amount of complexity.