The development of server chipsets and server products targeted at high-end, enterprise-class server systems requires careful consideration of reliability, availability and serviceability (RAS) requirements as well as features. Such products may be intended for use as back-end servers (such as in a data center), where RAS features and requirements are as important as system performance. The ability to swap out modules in a computer system without powering down or shutting down the computer is beneficial. This “swapping” is referred to by various names, such as hot socket, hot swap, hot addition, hot removal and hot-plug capability.
Consequently, implementation of hot-plug capability within enterprise server class systems is a vital RAS feature. Hot-plug capability allows upgrades and repair of nodes within a system without bringing the system down or rebooting. As a result, the ability to hot-plug various parts of a computer system, such as processors, memory, I/O (input/output) boards, modules, etc. is beneficial for replacing defective parts, performing system upgrades and the like.
Hot-plug of CPU/memory refers to the ability to add/remove/replace a processor/memory node while the operating system (O/S) continues to run on the platform. Similarly, the hot-plug of an I/O node is the ability to add/remove/replace an I/O node consisting of multiple peripheral component interconnect (PCI) root bridges and bus segments while the O/S continues to run. Hot-plug of an I/O node is distinguished from PCI hot-plug by the fact that multiple root bridges are being hot-plugged. Those skilled in the art will recognize that hot-plug of CPU/memory nodes and of I/O nodes is a feature that is not supported by current system architectures and operating systems.
Currently, hot-plug of devices is restricted to PCI devices. The ability to hot-plug PCI devices is provided by the PCI bus definition, which provides two characteristics that enable it. First, the PCI bus definition provides a mechanism for enumerating devices on a PCI bus via PCI configuration mechanisms. Second, it provides a mechanism for enumerating the resources needed by a PCI device via the PCI base address registers (BARs) in the device's PCI configuration space.
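As an illustration of the BAR-based resource enumeration described above, the following minimal sketch shows the conventional sizing probe for a 32-bit memory BAR: all ones are written to the register, the value is read back, the low four flag bits are masked off, and the result is inverted and incremented. The `SimulatedBar` class and the `probe_bar_size` function are hypothetical names introduced here for illustration; real software would go through platform-specific configuration-space accessors rather than a simulated register.

```python
class SimulatedBar:
    """Hypothetical model of one 32-bit memory BAR (not a real config-space accessor)."""

    def __init__(self, size, flags=0x0):
        # Memory BAR sizes are powers of two, and at least 16 bytes.
        assert size >= 16 and size & (size - 1) == 0
        self.size = size
        self.flags = flags & 0xF      # low 4 bits: read-only type/prefetch flags
        self.value = self.flags       # unprogrammed BAR: address bits are zero

    def write(self, v):
        # Address bits below the region size are hardwired to zero.
        addr_mask = ~(self.size - 1) & 0xFFFFFFFF
        self.value = (v & addr_mask) | self.flags

    def read(self):
        return self.value


def probe_bar_size(bar):
    """Return the number of bytes of address space the BAR requests."""
    original = bar.read()             # remember the programmed address
    bar.write(0xFFFFFFFF)             # write all ones to the BAR
    readback = bar.read()             # hardwired-zero bits reveal the size
    bar.write(original)               # restore the previous contents
    addr_bits = readback & 0xFFFFFFF0 # drop the 4 flag bits of a memory BAR
    return ((~addr_bits) & 0xFFFFFFFF) + 1


if __name__ == "__main__":
    bar = SimulatedBar(size=0x1000, flags=0x8)
    print(hex(probe_bar_size(bar)))
```

Because the device itself reports how much address space it needs, the operating system can size and place the resource without any device-specific knowledge, which is the property the hot-plug discussion relies on.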
As described above, hot-plugging refers to the capability of a device to be added/removed to/from a computer system while the system is powered on and running an operating system without significantly affecting the tasks currently executing on the system. Based on the PCI bus definition characteristics described above, two characteristics are required by an operating system for hot-plug of a device. First, the device must be enumerable. Second, the device resources must be enumerable.
In other words, a software mechanism is required that the O/S can use to detect when a device is hot-added or removed. Furthermore, the resources of the device must be enumerable before the device decodes any of the resources (memory space, I/O, configuration IDs) that the currently running operating system is aware of. Likewise, the hot-plugged device cannot use any of the resources that the running system is using until the operating system knows what device is being hot-plugged. Once a hot-plug event is detected, the resources that the hot-plugged device will use must be enumerated.
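The ordering constraint described above (detect the device, enumerate its resource needs, assign ranges that do not collide with resources already in use, and only then let the device decode) can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular operating system: `Device`, `ResourceMap`, `hot_add`, the first-fit allocation policy and the 4 GB address limit are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical hot-plugged device that reports its own resource needs."""
    name: str
    required_sizes: list                      # bytes needed per resource
    assigned: list = field(default_factory=list)
    decoding: bool = False                    # True once it responds to its ranges


class ResourceMap:
    """Address ranges the running system already uses, as half-open (start, end)."""

    def __init__(self, in_use):
        self.in_use = sorted(in_use)

    def allocate(self, size, limit=1 << 32):
        # First-fit search for a gap that does not overlap any range in use.
        candidate = 0
        for start, end in self.in_use:
            if candidate + size <= start:
                break
            candidate = max(candidate, end)
        if candidate + size > limit:
            raise MemoryError("no free range of the requested size")
        self.in_use.append((candidate, candidate + size))
        self.in_use.sort()
        return (candidate, candidate + size)


def hot_add(device, resources):
    """Handle a detected hot-add event; the call itself stands in for detection."""
    needs = list(device.required_sizes)       # enumerate resources first
    snapshot = list(resources.in_use)
    try:
        assigned = [resources.allocate(n) for n in needs]
    except MemoryError:
        resources.in_use = snapshot           # roll back partial assignments
        raise
    device.assigned = assigned
    device.decoding = True                    # decode enabled only after assignment
    return assigned


if __name__ == "__main__":
    system = ResourceMap([(0x0000, 0x1000), (0x2000, 0x3000)])
    card = Device("example", [0x1000, 0x2000])
    print([(hex(s), hex(e)) for s, e in hot_add(card, system)])
```

The point of the sketch is the order of operations: `decoding` stays false until every requested range has been placed, so a device that cannot be accommodated never claims addresses the running system is using.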
Unfortunately, current operating systems do not support the hot-plug of processor/memory nodes. This is because no mechanisms for the enumeration of processor/memory nodes are defined. In addition, standard mechanisms for enumeration of the resources required by processor/memory nodes are not available. Since processor/memory nodes do not provide the characteristics described above for enabling hot-plug of a device, a mechanism for supporting hot-plug of processor and memory nodes would potentially require the definition of a new bus interface for enumeration of processors.
Therefore, there remains a need to overcome one or more of the limitations of the existing approaches described above.