Soft errors in storage nodes of an integrated circuit cause persistent corruption of the state of the integrated circuit. To prevent improper operation of the integrated circuit, the soft errors should be detected, isolated, and corrected. After detecting a soft error, the integrated circuit can be reset or reconfigured to isolate and correct the soft error. However, resetting the integrated circuit can add a significant amount of time to total system recovery.
High availability is required in certain applications. For example, 99.999% availability, or about five minutes of down time per year, is often required in telecommunication applications. For some integrated circuits, the system reset time needed to recover from soft errors exceeds the permitted down time. There is a general need to limit the down time during recovery from soft errors.
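The relationship between an availability target and the permitted down time can be checked with a short calculation. The sketch below is illustrative only; the function name `downtime_minutes_per_year` and the assumption of a 365-day year are not taken from the text above.

```python
# Hypothetical helper (not from the source): converts an availability
# target into a yearly down-time budget.
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the permitted down time in minutes per year."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    budget = downtime_minutes_per_year(99.999)
    print(f"{budget:.2f} minutes per year")
```

Under these assumptions, a 99.999% target leaves a budget of roughly 5.26 minutes per year, consistent with the figure of about five minutes given above.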
One or more embodiments of the present invention may address one or more of the above issues.