Modern automotive electronic control units (ECUs) integrate more and more functionality. On the one hand, this trend is driven by technology scaling, which enables ever-increasing levels of integration. On the other hand, the highly cost-driven nature of the automotive industry forces developers to reduce the total number of ECUs per vehicle.
In this context, electronics play an increasing role in providing advanced driving assistance functions that particularly help to prevent hazards and reduce the number of fatal injuries.
The integration of assistance functions inside an ECU is mainly concentrated around a multi-CPU microcontroller that plays a decisive role by hosting the critical computation and control functions. Such a multi-CPU microcontroller may be regarded as a cluster of computation nodes with defined and encapsulated tasks.
Under such assumptions, i.e. that a plurality of critical computation and control functions related to various assistance functions are performed by the same multi-CPU microcontroller, early detection of latent faults is a main concern, so that such faults can be addressed before the operation of the multi-CPU microcontroller is actually affected.
A typical cause of such errors may be the corruption of CPU registers by effects such as alpha particle strikes, power supply spikes or the like, the results of which may be summarized as latent faults. A typical CPU contains many registers holding the current state of the CPU, which determines its future operation. As a result, any such corruption will lead to unexpected operation of the CPU when the register contents are next used. Consequently, embodiments aim at a method and system for detecting corrupted registers prior to use of these registers by the CPU.
Moreover, the CPU registers may be architectural registers that are visible to a program running on the CPU or may be “hidden” registers used by the CPU to control operation but not visible to a program, e.g. registers in the branch prediction tables.
As both types of registers may see long periods of time between accesses during which they are susceptible to corruption, the susceptibility of the corresponding CPU to latent faults increases.
Typical solutions for detecting corrupted registers include the following. A first known solution is based on a program running on the CPU reading out architecturally visible registers and comparing each value with a known good value held elsewhere in the system. This requires a known good value to be available. However, for registers that are dynamically updated, such a value may not be available.
A further problem with this solution is that having a program read out the architectural state is invasive and consumes CPU resources. Moreover, hidden registers are typically not visible to the program and hence cannot be compared.
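The limitations of the first known solution may be illustrated with a minimal sketch. It is a toy software model only, not code for any particular CPU: the register names (`ctrl`, `mask`, `acc`, `bpred_entry_7`), their values and both helper functions are hypothetical and chosen purely for illustration.

```python
# Toy model (hypothetical names, not a real ECU API): a software self-test
# that compares architecturally visible registers with known good values.

def check_registers(visible_regs, golden):
    """Return the names of registers whose value differs from the golden copy.

    Only registers that have an entry in `golden` can be checked.
    """
    return sorted(
        name for name, expected in golden.items()
        if visible_regs.get(name) != expected
    )


def unchecked_registers(visible_regs, hidden_regs, golden):
    """Return the names of registers this check cannot cover."""
    no_golden = set(visible_regs) - set(golden)   # e.g. dynamically updated
    return sorted(no_golden | set(hidden_regs))   # plus all hidden state


visible = {"ctrl": 0x0F, "mask": 0xFF, "acc": 1234}  # acc changes at run time
hidden = {"bpred_entry_7": 0b10}                     # not readable by software
golden = {"ctrl": 0x0F, "mask": 0xFF}                # static registers only

visible["mask"] ^= 0x10        # simulated bit flip in a checked register
hidden["bpred_entry_7"] ^= 1   # simulated bit flip in hidden state

corrupted = check_registers(visible, golden)
blind_spots = unchecked_registers(visible, hidden, golden)
print(corrupted)     # registers the self-test flags
print(blind_spots)   # registers the self-test can never flag
```

In this model the flip in `mask` is reported, whereas the flip in the hidden entry goes unnoticed and the dynamically updated `acc` is never checked at all.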
A second known solution is based on the use of at least two lockstepped CPUs. Lockstepped CPUs allow corruption to be detected when such corruption leads to the operation of the two CPUs diverging. However, this diverging operation may be detectable only some time after the corruption actually occurred, and by then it may be too late for the system to recover from such corruption.
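The detection latency of the lockstep approach may likewise be illustrated with a small sketch. It is a simplified model under stated assumptions rather than a description of any actual lockstep implementation: each "CPU" is reduced to a register file, the "program" merely outputs one named register per cycle, and the function `run_lockstep` and all of its parameters are hypothetical.

```python
# Toy model (hypothetical): two lockstepped CPUs execute the same program
# and a comparator checks their outputs on every cycle.

def run_lockstep(program, regs_a, regs_b, fault_cycle, fault_reg, fault_mask):
    """Return the first cycle at which the outputs diverge, or None.

    `program` is a list of register names; on each cycle both CPUs output
    the value of the named register. A bit flip is injected into CPU B's
    copy of `fault_reg` at the start of `fault_cycle`.
    """
    a, b = dict(regs_a), dict(regs_b)   # private register files
    for cycle, reg in enumerate(program):
        if cycle == fault_cycle:
            b[fault_reg] ^= fault_mask  # latent fault: nothing observable yet
        if a[reg] != b[reg]:            # comparator sees only values in use
            return cycle
    return None


regs = {"r0": 1, "r1": 2, "r2": 3}
program = ["r0", "r1", "r0", "r1", "r0", "r2"]  # r2 is first used at cycle 5

detected_at = run_lockstep(program, regs, regs,
                           fault_cycle=1, fault_reg="r2", fault_mask=0x4)
latency = detected_at - 1  # cycles between corruption and detection
print(detected_at, latency)
```

In this model the corrupted register stays latent until the program next uses it, so the comparator raises the mismatch several cycles after the fault occurred. If the program never used the register, no divergence would be reported at all.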
For these or other reasons, there is a need for the present invention.