In many server systems, firmware makes extensive use of a protected processor mode known as system management mode (SMM) for error handling and for various reliability, availability, and serviceability (RAS) events, among other legacy events that trigger an interrupt to this mode, known as a system management interrupt (SMI). In today's high core count server systems, overreliance on SMIs leads to numerous complex corner cases and race conditions that require convoluted workarounds and increase the complexity of a processor.
Another downside of SMI-based RAS is the need to time-slice SMIs for complex RAS features, which leads to arduous debug tasks and customer dissatisfaction. Apart from such architectural and engineering challenges, the current model also adversely affects runtime operating system (OS) performance, because entry into SMM stalls the OS and prevents it from making forward progress.