The present technique relates to an apparatus and method for increasing resilience to faults.
Due to the environments in which a data processing system may operate, components within the data processing system can exhibit faults, and the presence of these faults may result in errors being detected during performance of data processing operations by the data processing system. The faults may, for example, be caused by radiation or other external events. Considering the example of a storage element, such radiation may result in a particle strike on a bitcell or flip-flop, which can cause a single event upset (SEU) where a single bit of a stored value changes state. Hence, the storage element exhibits a fault, and this can then give rise to an error being detected when the processing circuitry processes data that includes the bit stored in the faulty storage element.
When such errors are detected, dealing with them can consume significant processing time and resources, and in some instances it may not be possible to correct the error, which can result in a failure of the system. Recovering from such a failure may then require even more invasive procedures such as a full system reboot, thereby significantly impacting system availability.
Accordingly, it would be desirable to provide a technique which enables a system's vulnerability to faults to be reduced.