Electronic functions in automotive applications are often simply turned off when a problem is detected. In electronic control units (ECUs), the microcontroller is put into a safe, passive state if a safety-critical fault has occurred. This is known as the fail-safe state. Some safety-critical faults are non-permanent but still require a shutdown of the car's entire electronic network and a new ignition cycle to restart the system.
Another approach is to restart the electronic function while the car is in operation. Such systems execute a reset followed by a self-test and then try to resume operation. This procedure typically takes several hundred milliseconds in the best case, and in the worst case operation does not resume at all. Thus, even in the best case, the time until operation resumes may be too long for many applications, e.g. power steering.
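The reset, self-test, and restart sequence described above can be sketched as follows. This is only an illustrative model: the names `recover`, `RESET_MS`, and `SELF_TEST_MS`, as well as the timing values and the per-function deadline, are assumptions chosen for the example and do not come from any particular ECU.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    FAIL_SAFE = "fail_safe"


# Hypothetical timings in milliseconds; real values depend on the ECU.
RESET_MS = 50
SELF_TEST_MS = 250


def recover(self_test_passes: bool, deadline_ms: int) -> tuple[State, int, bool]:
    """Reset, run the self-test, then try to resume operation.

    Returns the final state, the time spent on the recovery attempt in ms,
    and whether the function was back within its tolerated deadline.
    """
    elapsed_ms = RESET_MS + SELF_TEST_MS
    if not self_test_passes:
        # Worst case: operation does not resume; the unit stays passive.
        return State.FAIL_SAFE, elapsed_ms, False
    # Best case: operation resumes, but possibly later than tolerated.
    return State.RUNNING, elapsed_ms, elapsed_ms <= deadline_ms
```

With these assumed timings, a function that tolerates an outage of only 20 ms misses its deadline even when the self-test passes, which is the situation described above for power steering.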
Some microcontrollers used in automotive applications are put into a safe, passive state immediately when any safety-relevant issue occurs. Due to high software complexity, many of these issues are caused by software bugs, which are detected by watchdogs or other protection mechanisms. Other root causes are hardware-related, but only a minority of them are triggered by permanent damage, called a latent fault.
The acceptance of such error handling has recently changed, i.e. it is no longer accepted that the whole ECU is put into a safe, passive, off state. For safety-critical ECUs, a backup system is now required, which shall provide some kind of operation until, e.g., the car can be parked safely or normal operation can be resumed. Typical names for such operation are limp home, limp aside, or backup operation.
A fail-operational system solution, or a system with enhanced availability, provides an acceptable level of performance for safety-critical functions even when a fault occurs. Reasons for switching to this operation mode could be a latent hardware fault, a sporadic non-latent hardware fault, or a software fault.
One approach is to use a single controller but duplicate the most critical parts, i.e. those parts with the highest Failure-In-Time (FIT) rates. Another approach is to use a multicore controller that includes at least a second CPU core and a separate set of peripherals. The second CPU core and peripheral set take over operation once the first CPU core of the multicore controller detects an issue.
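The takeover behaviour of the multicore approach can be illustrated with a minimal sketch. The classes `CoreSet` and `MulticoreController`, the `healthy` flag, and the pass-through `step` computation are hypothetical stand-ins for this example only; real fault detection and switchover are done in hardware and low-level software.

```python
class CoreSet:
    """One CPU core with its own peripheral set (hypothetical model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.healthy = True

    def step(self, command: float) -> float:
        if not self.healthy:
            raise RuntimeError(f"{self.name} detected a fault")
        return command  # stand-in for the real control computation


class MulticoreController:
    """Runs on the first core set and switches to the second on a fault."""

    def __init__(self) -> None:
        self.primary = CoreSet("core0")
        self.backup = CoreSet("core1")
        self.active = self.primary

    def step(self, command: float) -> float:
        try:
            return self.active.step(command)
        except RuntimeError:
            if self.active is self.backup:
                raise  # no further redundancy left
            # The second core and its peripherals take over the operation.
            self.active = self.backup
            return self.active.step(command)
```

In this model the switchover is invisible to the caller of `step`, but both core sets still run the same code, so a systematic software bug would affect the backup in the same way.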
The most expensive solution would be to duplicate or triplicate the whole electronic control system using two or three ECUs, multiple power supplies, and multiple communication systems. Such approaches are too expensive and take up too much space for automotive applications.
Another problem is that most of these solutions remain vulnerable to errors that stem from a common cause. In particular, if multiple controllers use the same tools and software, they potentially share the same systematic problems caused by software bugs.
In addition, such systems contain more components that can fail, which increases the probability that failures further reduce reliability or lead to an unacceptably high number of situations with reduced operation. As one effect, such cars may be forced to visit a service garage too often.
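The effect of a higher component count can be made concrete with a small calculation. One FIT corresponds to one failure per 10^9 device-hours. The sketch below assumes independent components with constant failure rates; the figures of 500 FIT per ECU and 8000 operating hours are purely hypothetical, and the function name `prob_any_failure` is introduced only for this example.

```python
import math

HOURS_PER_FIT = 1e9  # 1 FIT = 1 failure per 10^9 device-hours


def prob_any_failure(fit_rates: list[float], hours: float) -> float:
    """Probability that at least one component fails within `hours`,
    assuming independent components with constant failure rates."""
    total_rate = sum(fit_rates) / HOURS_PER_FIT  # failures per hour
    return 1.0 - math.exp(-total_rate * hours)


# Hypothetical numbers: 500 FIT per ECU, 8000 operating hours.
single = prob_any_failure([500.0], 8000.0)
duplicated = prob_any_failure([500.0, 500.0], 8000.0)
```

Under these assumptions the duplicated system is almost twice as likely as the single ECU to experience at least one component failure, and each such failure may mean reduced operation and a garage visit even though the function itself remains available.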
US 2013/0067259 A1 describes a microcontroller unit that comprises a main controller and a standby controller. However, the standby controller is optimized for low power consumption, while the main controller is optimized for high performance.