1. Field of the Invention
The present invention is directed to redundant buses in a computer system and, more particularly, to the operation of switching between such buses when an error is detected on one of the buses.
2. Description of the Related Art
In highly reliable computer systems, redundant components are used to enable operation to continue even if one component fails. In the case of distributed processing systems, many processors are connected via a bus structure to operate faster and with increased reliability. To further improve the reliability of such systems, redundant buses are used, typically with one bus in an active state carrying all communication and another bus on standby in an inactive state.
When a fault is detected on the active bus, it is necessary to switch operation to the inactive bus and to indicate the failure of the active bus so that repairs can be made. In a computer system having a master processor, switching from one bus to another is performed under the control of the master processor. However, use of a master processor either reduces redundancy (system operation is severely affected if the master processor fails) or introduces problems associated with passing control from the processor currently acting as the master processor to another processor. In a distributed processing system in which all processors are independent, coordinating a switch from an active bus to an inactive bus is not as simple. It is important to prevent a single point of failure from locking up the system and to prevent the processors from constantly switching back and forth between buses. Furthermore, typically only one bus is permitted to be active at a time; if some processors are using one bus while other processors are using another bus, not all of the processors can communicate with each other.
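The communication problem described above can be illustrated with a minimal sketch. It is an illustration only, not a method taken from this description: the processor names (`P1`, `P2`, `P3`), the bus labels (`A`, `B`), and the helper functions are hypothetical. The sketch assumes each independent processor keeps its own record of which bus it treats as active, and that two processors can exchange messages only when those records agree.

```python
from collections import defaultdict


def partitions(active_bus_by_processor):
    """Group processors by the bus each one currently treats as active."""
    groups = defaultdict(set)
    for processor, bus in active_bus_by_processor.items():
        groups[bus].add(processor)
    return dict(groups)


def can_communicate(active_bus_by_processor, a, b):
    """Two processors can talk only if both use the same bus."""
    return active_bus_by_processor[a] == active_bus_by_processor[b]


def is_consistent(active_bus_by_processor):
    """True when every processor agrees on a single active bus."""
    return len(set(active_bus_by_processor.values())) <= 1


# Hypothetical state after an uncoordinated switch: P3 has moved to
# bus B while P1 and P2 are still using bus A.
state = {"P1": "A", "P2": "A", "P3": "B"}
split = partitions(state)
```

In this assumed state, `split` contains two groups, so `P3` is isolated from `P1` and `P2` even though neither bus has necessarily failed. Any switching scheme for independent processors therefore has to bring all processors to the same bus.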