When a hardware fault is detected in a digital computer system, the condition is often signalled to the CPU (central processing unit) with an interrupt called a Machine Check or a Non-Maskable Interrupt (NMI). Generally, operating systems contain NMI trace routines that are designed to trace errors or faults only in those system features that were known as of the most recent version of the operating system program. Unfortunately, these trace routines often lag behind the rapid addition of new features to computer systems, so that the source of a problem is not adequately diagnosed.
Detailed information about the failure is often captured in special hardware registers, many of which can be read by software running on the CPU. Gathering the information from these registers also has complications. For example, some systems are designed with both commercially available, industry-standard hardware and some "in-house" designed hardware. In such a system, some error registers are typically memory mapped and can be accessed via predefined processor instructions, while other error registers are non-memory mapped and can be accessed only via JTAG (Joint Test Action Group) scan, usually by an embedded controller. Some implementations for error information gathering utilize a single method of data gathering, for example, OCS scans with saving of all scan ring data into non-volatile RAM, but these implementations tend to limit the flexibility of the system designer.
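By way of illustration only, the distinction between the two kinds of error register described above can be sketched in C as a descriptor table in which each entry records how its register is reached. Everything in this sketch is a hypothetical assumption made for illustration and is not taken from any particular system: the type and function names (error_reg_desc_t, read_error_register, gather_error_info), the register names used in the usage note, and the simulated arrays that stand in for a memory-mapped window and for scan rings read by an embedded controller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of how an error register can be reached. */
typedef enum { ACCESS_MMIO, ACCESS_JTAG } access_method_t;

/* Hypothetical descriptor for one error register. */
typedef struct {
    const char     *name;     /* label used only for reporting */
    access_method_t method;   /* memory mapped, or JTAG scan only */
    uint32_t        location; /* MMIO: word index; JTAG: scan ring id */
} error_reg_desc_t;

#define SIM_MMIO_WORDS 4u
#define SIM_SCAN_RINGS 2u

/* Simulated hardware, so that the sketch runs without real devices. */
static volatile uint32_t sim_mmio_window[SIM_MMIO_WORDS] = {
    0x0000000Au, 0x00000000u, 0x000000C3u, 0x00000000u
};
static const uint32_t sim_scan_rings[SIM_SCAN_RINGS] = {
    0xDEAD0001u, 0xBEEF0002u
};

/* Memory-mapped register: an ordinary load from the (simulated) window. */
static uint32_t mmio_read(uint32_t index)
{
    return sim_mmio_window[index];
}

/* Non-memory-mapped register: stands in for a request to an embedded
 * controller that performs the JTAG scan (assumption: one word per ring). */
static uint32_t jtag_scan_read(uint32_t ring_id)
{
    return sim_scan_rings[ring_id];
}

/* Read one register by whichever method its descriptor names.
 * Returns 0 on success, -1 if the location is out of range. */
int read_error_register(const error_reg_desc_t *desc, uint32_t *out)
{
    assert(desc != NULL && out != NULL);
    switch (desc->method) {
    case ACCESS_MMIO:
        if (desc->location >= SIM_MMIO_WORDS)
            return -1;
        *out = mmio_read(desc->location);
        return 0;
    case ACCESS_JTAG:
        if (desc->location >= SIM_SCAN_RINGS)
            return -1;
        *out = jtag_scan_read(desc->location);
        return 0;
    }
    return -1;
}

/* Walk the table one entry at a time and return how many registers were
 * read successfully. A failed entry leaves 0 in its slot of values[]. */
size_t gather_error_info(const error_reg_desc_t *table, size_t count,
                         uint32_t *values)
{
    size_t ok = 0;
    for (size_t i = 0; i < count; i++) {
        values[i] = 0;
        if (read_error_register(&table[i], &values[i]) == 0)
            ok++;
    }
    return ok;
}
```

In use, a caller would build a table such as { {"bus_err_status", ACCESS_MMIO, 2}, {"asic_err_ring", ACCESS_JTAG, 1} } and pass it to gather_error_info together with an output array of the same length. Keeping the access method in the table rather than in the gathering loop is one possible design choice; it allows registers of either kind to be added without changing the loop.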
Accordingly, what is needed is a method and system for accessing all available error information captured in the system, while avoiding potential side effects, such as loss of data or hardware states if error registers are accessed concurrently.