1. Field of the Invention
This invention relates to hot swapping of components of a computer system, and more particularly to hot swapping of components of a multiprocessor computer system.
2. Background Information
It is often important to add components to a computer system, or remove components from the computer system, while the computer system is running, and without halting the computer system. Adding and/or removing components from a running computer system is referred to as “hot swapping”. Hot swapping is particularly important in a multiprocessor computer system because critical application programs may be running on some of the processors while it is desired to replace a failed processor, add a new processor, add or remove input/output modules, etc., without interrupting the processors running critical application programs.
Some computer systems allow hot swapping of single processor cards into or out of a backplane bus. Also some computer systems allow hot swapping of input/output cards from a backplane bus, or from an input/output bus.
When many processors are connected into a multiprocessor system, the limitation of adding or removing components from a single bus is severe. For example, components of a large system may be mounted in a plurality of chassis interconnected by appropriate signaling cables. It then becomes desirable to be able to hot swap major components of such a large computer system.
There is needed a method of hot swapping major components of large multiprocessor computer systems.
The invention is a control system using microprocessors which communicate through a private Local Area Network (private LAN) to control operation of both processors and input and output subsystems (IO system) of a multiprocessor computer system. The processors each have memory associated therewith, and each processor has an IO system, comprising a plurality of busses such as PCI busses, associated therewith. The processors are cabled together in a mesh arrangement so that messages can be transferred between any of the processors and delivered to memory associated with the destination processor, or delivered to an IO system associated with the destination processor, etc.
At the highest level of granularity, processors and input/output subsystems are mounted in a rack. For example, a rack may hold from one (1) to eight (8) processors and an input/output subsystem associated with each processor. A rack may be hot swapped; that is, a rack may be added to or removed from the multiprocessor computer system by interrupting the cabling between racks and either inserting a new rack or removing an old rack, without interrupting operation of processors mounted in other racks of the multiprocessor computer system. Hot swapping of racks is accomplished by the microprocessor control system detecting changes in the cabling, detecting removal of old processors, and detecting addition of new processors, and by the microprocessor control system computing new routing tables for messages through the new cabling arrangement.
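By way of illustration only, the recomputation of routing tables after a cabling change might be sketched as follows. The sketch assumes the cabling is described as a list of processor pairs and uses a breadth-first search to pick a next hop on a shortest path; the function name `compute_routing_tables`, the processor labels, and the choice of breadth-first search are hypothetical and are not taken from the description above.

```python
from collections import deque

def compute_routing_tables(links):
    """Build {source: {destination: next_hop}} from cable pairs (hypothetical format)."""
    # Build an undirected adjacency map from the cabling description.
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    tables = {}
    for src in adjacency:
        next_hop = {}
        visited = {src}
        queue = deque()
        # Direct neighbours are reached through themselves.
        for neighbour in sorted(adjacency[src]):
            next_hop[neighbour] = neighbour
            visited.add(neighbour)
            queue.append(neighbour)
        # Breadth-first search: farther nodes inherit the first hop of their parent.
        while queue:
            node = queue.popleft()
            for neighbour in sorted(adjacency[node]):
                if neighbour not in visited:
                    visited.add(neighbour)
                    next_hop[neighbour] = next_hop[node]
                    queue.append(neighbour)
        tables[src] = next_hop
    return tables

# Cabling before a swap: P0 - P1 - P2 - P3 (labels are illustrative).
before = compute_routing_tables([("P0", "P1"), ("P1", "P2"), ("P2", "P3")])
# After the swap: P3 is removed and a new processor P4 is cabled to P0.
after = compute_routing_tables([("P0", "P1"), ("P1", "P2"), ("P0", "P4")])
```

In this sketch the tables are simply rebuilt from the new cabling description, so a removed processor drops out of every table and a newly added processor appears in them without any change to the links that were left untouched.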
At a more detailed level of granularity, a single processor board, holding two processors in a preferred embodiment of the invention, may be inserted into a backplane bus of a rack, or removed from the backplane bus. Again, the microprocessor control system detects the change in the multiprocessor computer system and computes new routes for transfer of messages in the new configuration.
Further, the input/output modules have a plurality of IO busses, and the IO busses accept a variety of input/output cards such as disk controller cards, network interface cards, etc. Any of these cards may be hot swapped in or out of their respective busses. Also, entire input/output modules may be hot swapped by being added to or removed from the multiprocessor computer system. The microprocessor control system detects the addition or removal of the input/output cards, or the addition or removal of entire input/output modules, and in response computes new routes for signaling between processors and the input/output devices.
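As a minimal sketch of the detection step, the addition or removal of cards could be found by comparing two successive inventories of the IO busses. The inventory format (a set of slot-path strings) and the function name `detect_changes` are assumptions made for this example only.

```python
def detect_changes(previous, current):
    """Return (added, removed) components between two inventories (hypothetical format)."""
    previous, current = set(previous), set(current)
    added = sorted(current - previous)    # present now, absent before
    removed = sorted(previous - current)  # present before, absent now
    return added, removed

# Illustrative inventories: a network card moves from bus pci1 to bus pci2.
added, removed = detect_changes(
    {"io0/pci0/disk", "io0/pci1/nic"},
    {"io0/pci0/disk", "io0/pci2/nic"},
)
```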
The microprocessor control system operates from a database giving the particulars of the multiprocessor computer system. Each microprocessor has associated microprocessor memory, and a complete copy of the database is maintained in each microprocessor memory. In a preferred embodiment of the invention, one microprocessor controlling processors is mounted in each rack and so controls up to eight (8) processors, and one microprocessor is mounted in each input/output module for controlling that input/output module. All of the microprocessors communicate through the private LAN. Each of the microprocessors maintains a current version of the database in its associated memory by exchanging control messages through the private LAN. Transfer of control messages through this private LAN is the mechanism by which the microprocessors keep track of the multiprocessor computer system, apply power to processors, boot up processors, halt processors, etc.
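The replicated database described above can be pictured with the following simplified sketch, in which each microprocessor holds a full copy of the database and applies every control message it receives. The class names `PrivateLan` and `Controller`, the message format, and the in-process broadcast standing in for the private LAN are all assumptions for illustration, not the actual protocol.

```python
class PrivateLan:
    """Stand-in for the private LAN: delivers a message to every other controller."""
    def __init__(self):
        self.controllers = []

    def attach(self, controller):
        self.controllers.append(controller)
        controller.lan = self

    def broadcast(self, sender, message):
        for controller in self.controllers:
            if controller is not sender:
                controller.receive(message)


class Controller:
    """One microprocessor with its own complete copy of the database."""
    def __init__(self, name):
        self.name = name
        self.database = {}  # component name -> state (hypothetical schema)
        self.lan = None

    def report_change(self, component, state):
        # Update the local copy, then tell every other microprocessor.
        self.database[component] = state
        self.lan.broadcast(self, (component, state))

    def receive(self, message):
        component, state = message
        self.database[component] = state


lan = PrivateLan()
rack_controller = Controller("rack0")
io_controller = Controller("io0")
lan.attach(rack_controller)
lan.attach(io_controller)
rack_controller.report_change("processor3", "powered_on")
```

After the call to `report_change`, both controllers hold the same entry for `processor3`, which is the property the control messages are meant to preserve.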
Further, after the multiprocessor system is booted and operating, a microprocessor may be hot swapped from its mounting in a rack or an input/output module. Normal operation of the operating system of the processors does not require intervention by the microprocessors. Thus a microprocessor may be disconnected from its mounting, disconnected from the private LAN, and a new microprocessor substituted in its place.