1. Field of the Invention
The present invention relates to data communications systems and methods, and more particularly, to bus bridge systems and methods.
2. Statement of the Problem
High-bandwidth busses are typically used to communicate between hosts and peripherals in applications such as computer networks. The bus interfaces used by hosts and peripherals often take different forms depending on the performance characteristics desired. For example, host devices may communicate via a differential or single-ended Small Computer System Interface (SCSI) or a Fibre Channel (FC) interface, while a peripheral such as a disk array may utilize a SCSI or other bus interface. When hosts and peripherals use disparate bus architectures, bus bridges are often utilized to provide connectivity.
Bus bridges may also be used to increase the capacity of bus systems. Bus specifications often limit, among other things, the length of the bus and the number of devices that may be attached to the bus in order to maintain performance. For example, the Peripheral Component Interconnect (PCI) bus specification commonly employed in personal computer bus applications has detailed rules for round trip propagation delay and capacitive loading which help maintain the integrity of communications at specified bus clock rates. In order to increase the capacity of such a bus, an expanded multi-layer bus structure may be used that includes a plurality of busses connected by high-speed bus bridges. This multi-layer structure can allow an increased number of devices to be interconnected while maintaining bus performance.
Complex computer systems and networks may employ multiple hosts connected to peripherals such as mass storage devices. These devices are often connected to the hosts by multiple busses and bus bridges. Consequently, data stored on these mass storage systems may be temporarily inaccessible due to a bus bridge failure, an event that can incur significant down time costs. In addition, systems that utilize bridges with storage elements, such as caches used in Redundant Array of Independent Disks (RAID) systems that implement data striping or mirroring across multiple disks or other storage media, may be subject to data loss or corruption if the coherence of the cache is lost due to a bridge failure. Accordingly, it is desirable to increase the reliability of bus bridges to help reduce the likelihood of information loss.
Conventional techniques for improving bus bridge reliability include using bus bridge systems with redundant bus bridges between busses. In one type of conventional system, a host monitors a bus bridge to determine its health by using messages communicated over the data path connecting the host and the bridge. If the host receives a message indicating failure of the bridge, the host may route information originally intended for the failed bridge through a redundant bridge, providing what is often referred to as host-managed "failover" operation.
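The host-managed failover operation described above can be illustrated with a minimal sketch. The class names, the "OK"/"FAIL" status strings, and the polling scheme are all hypothetical, chosen only for illustration; they do not represent any particular host, bridge, or bus protocol.

```python
from dataclasses import dataclass


@dataclass
class Bridge:
    """Hypothetical stand-in for a bus bridge reachable over the data path."""
    name: str
    healthy: bool = True

    def status_message(self) -> str:
        # Assumed health message format; real bridges report status in
        # bus-specific ways.
        return "OK" if self.healthy else "FAIL"


class Host:
    """Hypothetical host that manages failover between two bridges."""

    def __init__(self, primary: Bridge, redundant: Bridge) -> None:
        self.primary = primary
        self.redundant = redundant
        self.active = primary

    def poll(self) -> None:
        # The host itself checks bridge health using a message carried
        # over the same data path that connects it to the bridge.
        if self.active is self.primary and self.active.status_message() == "FAIL":
            # Reroute traffic originally intended for the failed bridge.
            self.active = self.redundant

    def route(self, payload: str) -> str:
        self.poll()
        return f"{payload} via {self.active.name}"


bridge_a = Bridge("bridge_a")
bridge_b = Bridge("bridge_b")
host = Host(bridge_a, bridge_b)

before = host.route("write block 7")
bridge_a.healthy = False  # simulate a failure of the primary bridge
after = host.route("write block 7")
```

In this sketch both the health check and the rerouting decision reside in the host, so the failover depends on the host and on the data path that carries the status messages remaining operational.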
Host-managed failover can have many disadvantages, however. Host-managed systems tend to be operating system dependent. The reliability of a host-managed failover approach may also be compromised by elements with relatively high failure rates, such as the host and the data paths used to monitor and control the bus bridges, the failure of which can cause a complete failure of the data path through the bus bridge system. Maintaining cache coherency in host-managed systems may also undermine performance, as caching at the host level may require a high-bandwidth communications channel between hosts. Maintaining a host-based failover capability in the presence of potential host power supply failures may also be expensive, as an entire host computer may have to be kept running through a power outage event. Accordingly, there is a need for bus bridge systems and methods that can provide improved performance, reliability and data protection.