The pace of business continues to increase, driven in part by the use of server computer systems. A down or halted server computer system may force a business to halt a critical system, which can cause large losses in productivity. A server computer system therefore requires features that provide high levels of reliability, availability and serviceability (RAS).
Typically, to enable implementation of RAS features, a server computer system needs to be reconfigurable. In many cases, RAS-related operations require changes to the system configuration, such as adding memory, removing memory, adding a processor, removing a processor, or recovering from failures, while the operating system (OS) is running (i.e., in an OS-transparent manner).
Some known server computer systems or processor systems provide an interrupt or OS cycle-stealing mechanism that enables the OS to be put into a quiescent state (i.e., quiesces the OS) so that certain RAS features can be implemented (e.g., so that the system configuration can be changed) while the OS is running. In some of these known systems, the interrupt mechanism is referred to as a system management interrupt (SMI). However, due to real-time demands, the OS imposes SMI latency limitations. In other words, the OS limits the amount of time for which it can be held in a quiescent state, in order to avoid compromising critical business services, losing OS timer ticks, causing video and/or audio glitches, triggering inter-process timeouts, etc. In addition, if errors occur and are not detected during the calculation and update process for the configuration change, then the system configuration cannot be reverted to its original state, causing the system to become unstable.
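
The two constraints described above, a bounded quiescent period and the need to return to the original configuration when an error is found, can be illustrated with a minimal sketch. It is a simplified model only and does not represent any particular system or firmware interface. The names `SMI_BUDGET_US`, `validate` and `apply_in_quiesce`, the 150-microsecond budget, and the dictionary layout of the configuration are all assumptions made for illustration.

```python
from copy import deepcopy

# Hypothetical limit on how long the OS may be held quiescent (microseconds).
SMI_BUDGET_US = 150


def validate(config):
    """Hypothetical error check: every resource count must be positive."""
    for name, value in config.items():
        if value <= 0:
            raise ValueError(f"invalid value for {name}: {value}")


def apply_in_quiesce(config, change, cost_us, budget_us=SMI_BUDGET_US):
    """Apply `change` to a copy of `config` within one quiescent period.

    Returns (resulting_config, applied). The caller's `config` is never
    modified, so the original state is always available to fall back on.
    """
    if cost_us > budget_us:
        # The change would hold the OS quiescent for too long; do nothing.
        return config, False

    candidate = deepcopy(config)  # work on a copy, keep the original intact
    try:
        change(candidate)
        validate(candidate)       # detect errors before committing
    except ValueError:
        return config, False      # error detected: keep the original state
    return candidate, True


def add_memory(config):
    """Example change: add 32 GB of memory (illustrative values)."""
    config["memory_gb"] += 32


def remove_all_processors(config):
    """Example of a faulty change that the error check should reject."""
    config["processors"] = 0


original = {"memory_gb": 64, "processors": 4}
updated, applied = apply_in_quiesce(original, add_memory, cost_us=100)
rejected, rejected_applied = apply_in_quiesce(original, remove_all_processors, cost_us=100)
too_slow, too_slow_applied = apply_in_quiesce(original, add_memory, cost_us=500)
```

In this model a change is committed only if its estimated cost fits within the budget and the resulting configuration passes the error check; otherwise the unmodified original configuration is returned.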