High integrity software has become commonplace in a wide range of applications. For example, many automotive, banking, aerospace, defense, Internet payment, and other applications have critical paths that require validation of safe operation by means of redundancy, diversity, or both.
The general approach to guaranteeing safe operation of a critical path is to compute two algorithms and compare their results for consistency or plausibility using an independent comparator. Generally, this has been implemented in one of two ways. First, on a system that is limited to one active processing channel, two (or more) diverse algorithms can be computed with temporal separation, and the results are then cross-checked for consistency or plausibility. Second, on a system with more than one processing channel, identical algorithms can be computed simultaneously, with one algorithm processed on each processing channel (a “core”), and the results compared for consistency. A common variant of the second method computes one algorithm on two redundant processing channels that are temporally separated (typically by a few clock cycles). This variant is desirable because the slight temporal separation can make it robust against hard and soft error events, such as a disturbance arising from a common cause event, for example an alpha particle strike. The hope is that the common cause event would disturb one processing channel in such a way that its computed output differs from that of the other processing channel. The outputs of these channels are compared by a simple comparator, which can trigger an error event if necessary.
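The first method described above can be illustrated with a minimal sketch. It is not taken from any particular implementation: the function names, the choice of workload (summing the integers 1 to n), and the simulated fault are all hypothetical and chosen only for illustration. Two diverse algorithms are computed one after the other on a single channel, and a separate comparator decides whether the results are consistent.

```python
# Illustrative sketch only: all names and the workload are hypothetical.

def sum_iterative(n: int) -> int:
    """First algorithm: accumulate 1..n in a loop."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_closed_form(n: int) -> int:
    """Second, diverse algorithm: closed form n(n+1)/2."""
    return n * (n + 1) // 2


def comparator(a: int, b: int) -> bool:
    """Consistency check kept separate from both algorithms."""
    return a == b


def run_critical_path(n, primary, checker):
    """Compute both algorithms in sequence and cross-check the results."""
    first = primary(n)   # computed first
    second = checker(n)  # computed afterwards (temporal separation)
    if not comparator(first, second):
        # Stand-in for the error event triggered by the comparator.
        raise RuntimeError(f"comparator mismatch: {first} != {second}")
    return first


def faulty_sum(n: int) -> int:
    """Simulated disturbance: flips the lowest bit of the correct result."""
    return sum_iterative(n) ^ 1
```

Calling run_critical_path(100, sum_iterative, sum_closed_form) returns the agreed result, while substituting faulty_sum for either algorithm raises the error. In a real system the comparator would itself need to be independent of the failure modes of the channels it checks, which a software function on the same processor does not demonstrate.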
There are several drawbacks to the temporally-separated multiple processing channel implementation. Delaying the input into the checker core and the output from the primary core requires a large amount of processing state to be held, which costs silicon area and power. Additionally, the number of delay states required to maintain temporal separation increases as the operating frequency of the implementation increases, and each additional delay state costs further silicon area and power. Further, the data used by the respective computations must be protected against corruption. Also, the comparator used to check the outputs must be shown to be independent of any common cause failures of the processing channels. Finally, the quality of the comparison becomes software dependent: it relies on a disturbance affecting one processing channel differently from the other, and whether that happens ultimately depends on the actual processing state of the machines.
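The scaling of delay states with frequency can be made concrete with a small calculation. The sketch below rests on an assumption not stated in the text, namely that the temporal separation must cover a fixed time window regardless of clock rate; the function names, the 10 ns window, and the 256 bits of held state are likewise hypothetical values picked for illustration.

```python
# Illustrative arithmetic only: the fixed time window and all numbers
# below are assumptions, not figures from the text.

def delay_stages(separation_ns: int, clock_mhz: int) -> int:
    """Clock cycles needed to span the window, rounded up.

    cycles = separation_ns * clock_mhz / 1000, using integer ceiling
    division to avoid floating-point rounding.
    """
    return -(-(separation_ns * clock_mhz) // 1000)


def buffer_bits(separation_ns: int, clock_mhz: int, state_bits: int) -> int:
    """Storage needed to hold state_bits of state for every delay stage."""
    return delay_stages(separation_ns, clock_mhz) * state_bits


if __name__ == "__main__":
    for mhz in (200, 400, 800):
        stages = delay_stages(10, mhz)
        bits = buffer_bits(10, mhz, 256)
        print(f"{mhz} MHz: {stages} stages, {bits} bits of delay storage")
```

Under these assumptions, doubling the clock frequency doubles both the number of delay stages and the storage they require, which is the area and power cost the paragraph above describes.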
In addition to these concerns, perhaps the most crucial implementation issue is making the actual executions of the two processing channels as diverse as possible in order to reduce common cause failures. To guarantee integrity, the application must show that each processing channel is independent, such that common cause failures are minimized and a failure in one channel does not affect the other(s). Efforts to provide diversity include, among others: using different aspect ratios for the silicon areas, using rotated macros of the designs, physically separating the instances, and targeting different process speeds of the actual cores. However, none of these efforts provides guaranteed, complete coverage.