An aspect of this invention generally relates to data processing devices and, more particularly, to fatal interrupt handling in a multicore execution environment.
The security and reliability of an operating system within a data processing device are increasingly important concerns. System-On-Chip (SoC) and integrated circuit (IC) designs are increasingly complex, and more and more processors are integrated into SoCs/ICs to perform increasingly varied and complex functions. Also, multiple SoCs/ICs are commonly linked together for advanced applications. Multiple processors/SoCs/ICs can be involved in performing certain tasks, and each processor/SoC/IC can depend on one or more others to complete the tasks. During development of an SoC device, and potentially after release, errors in the software or hardware may cause instabilities in the operation of the SoC device. For example, the software or hardware may cause a fatal event to occur on one or more processors, which will result in a reset of the SoC device. Fatal events are typically triggered by access violations, bus errors, or software asserts, but other software or hardware issues may also trigger a fatal event.
In a multicore environment, multiple fatal events may occur simultaneously, either synchronously or asynchronously. In general, hardware fatal events are handled as Central Processing Unit (CPU) interrupts and software fatal events are handled by the CPU as software exceptions. Upon encountering a fatal error, the SoC device may be configured to dispatch a handler for the fatal event. The handler may save device information associated with the context of the SoC device, and may attempt to restore the context of the device prior to resetting the device. In a multicore processor, the handler may halt processing on the other CPUs and instruct them to dump their contexts. Resetting the device without coordinating with the other CPUs may produce operational issues, such as deadlock or incomplete diagnostic information, if the other CPUs are handling one or more simultaneous errors.
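The coordination problem described above can be illustrated with a minimal sketch, which is not taken from the invention itself. It assumes a hypothetical four-core device in which the first core to enter its fatal handler claims the coordinator role through an atomic compare-and-swap, every core records its own context exactly once, and a reset is permitted only after all contexts have been saved. The names fatal_state_t, core_context_t, fatal_enter, fatal_may_reset, and NUM_CORES are illustrative assumptions, and the model is exercised sequentially rather than on real hardware.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CORES 4      /* hypothetical core count */
#define NO_OWNER  (-1)   /* no core has claimed the coordinator role yet */

/* Hypothetical per-core diagnostic context. */
typedef struct {
    uint32_t pc;         /* program counter at the time of the fatal event */
    uint32_t cause;      /* illustrative cause code (0 = halted by another core) */
    bool     valid;      /* true once this core has dumped its context */
} core_context_t;

/* Hypothetical state shared by all cores. */
typedef struct {
    atomic_int     owner;   /* core that coordinates the reset, or NO_OWNER */
    atomic_int     dumped;  /* number of cores whose context has been saved */
    core_context_t ctx[NUM_CORES];
} fatal_state_t;

static void fatal_state_init(fatal_state_t *s)
{
    atomic_init(&s->owner, NO_OWNER);
    atomic_init(&s->dumped, 0);
    for (int i = 0; i < NUM_CORES; i++) {
        s->ctx[i] = (core_context_t){0};
    }
}

/*
 * Called by a core when it enters its fatal handler, either because it
 * faulted or because it was told to halt. Returns true if this core won
 * the coordinator role. Only one core can win, even if several fault at
 * the same time, because the claim is a single compare-and-swap.
 */
static bool fatal_enter(fatal_state_t *s, int core, uint32_t pc, uint32_t cause)
{
    int expected = NO_OWNER;
    bool is_owner = atomic_compare_exchange_strong(&s->owner, &expected, core);

    /* Each core writes only its own slot, and only once. */
    if (!s->ctx[core].valid) {
        s->ctx[core] = (core_context_t){ .pc = pc, .cause = cause, .valid = true };
        atomic_fetch_add(&s->dumped, 1);
    }
    return is_owner;
}

/* Only the coordinator may reset, and only after every core has dumped. */
static bool fatal_may_reset(fatal_state_t *s, int core)
{
    return atomic_load(&s->owner) == core &&
           atomic_load(&s->dumped) == NUM_CORES;
}
```

In this sketch, if core 2 and core 0 both fault, whichever calls fatal_enter first becomes the coordinator. The other core still records its own cause code instead of being reset mid-dump, and fatal_may_reset stays false for the coordinator until the remaining cores have also saved their contexts. On real hardware the coordinator would typically bound this wait with a timeout so that an unresponsive core cannot block the reset indefinitely. That detail is omitted here.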