1. Field of the Invention
The present invention relates to fault tolerance in computer systems, and more particularly to a method for swapping, removing or adding processors in a computer system while the computer system continues operating.
2. Related Art
Continuous operation and high reliability are essential for some computer systems. A failure, or even a temporary cessation of operation, can have catastrophic consequences for electronic fund transfer systems or airline traffic control systems, for example. To this end, fault-tolerant computing systems have been developed that allow "hot swapping" of computer system components. Hot swapping involves removing and replacing a failed computer system component while the computer system continues to operate. This potentially allows a computer system with a failed component to be repaired without shutting the computer system down.
Hot swapping is typically applied to devices that plug into a computer system's peripheral bus, such as disk drives. This allows peripheral devices to be replaced without shutting the computer system down. However, more centrally located components, such as central processing units (CPUs), cannot be replaced in this way. This is because most computer systems are uniprocessor systems with only one central processing unit. Hence, removing the central processing unit will prevent the computer system from functioning. Furthermore, CPUs are typically deeply integrated into the motherboard, or center of a computer system, and cannot easily be removed. Additionally, CPUs are harder to initialize, and are more tightly bound into the computer system's operating system and interrupt structure than are peripheral devices, such as disk drives. Consequently, it is much harder to facilitate removal and re-insertion of a CPU in an operating computer system.
As a result, when a central processing unit fails or needs to be upgraded for additional performance, the computer system must be shut down to replace the CPU. Furthermore, restarting the computer system typically requires a lengthy rebooting process to re-initialize the operating system and other computer system components.
What is needed is a computer system that allows a CPU to be removed without shutting the computer system down.
Additionally, what is needed is a computer system that allows a CPU to be inserted and initialized while the computer system is operating.