Data communication between components of a computer system can be provided in a number of ways. In many large storage systems, for example, the interconnection solutions may be based on bus architectures such as the Small Computer System Interface (SCSI) or Fibre Channel (FC) standards. In these architectures, multiple storage devices such as hard disk drives may share a single set of wires or a loop of wires for data transfers.
Such architectures may be limited in terms of performance and fault tolerance. Since all of the devices share a common set of wires, only one data transfer may take place at any given time, regardless of whether other devices have data ready for transfer. Also, if a storage device fails, it may be possible for that device to render the remaining devices inaccessible by corrupting the bus. Additionally, in systems that use a single controller on each bus, a controller failure may leave all of the devices on its bus inaccessible. In a large storage array, component failures can occur with significant frequency. As the number of components in a system is increased, the probability that at least one component will fail at any given time increases, and, accordingly, the mean time between failures (MTBF) for the system is decreased.
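The relationship between component count and system MTBF can be illustrated with a minimal sketch. It assumes a simple series reliability model that the text above does not itself specify: components fail independently, each at a constant rate equal to the reciprocal of its MTBF, and the system is counted as failed when any one component fails. The function names and the 500,000-hour drive rating are hypothetical, chosen only for illustration.

```python
import math


def system_mtbf(component_mtbfs):
    """MTBF of a system that fails as soon as any one component fails.

    Assumes independent components with constant failure rates, so the
    system failure rate is the sum of the component failure rates.
    """
    total_rate = sum(1.0 / mtbf for mtbf in component_mtbfs)
    return 1.0 / total_rate


def prob_any_failure(component_mtbfs, hours):
    """Probability that at least one component fails within `hours`."""
    total_rate = sum(1.0 / mtbf for mtbf in component_mtbfs)
    return 1.0 - math.exp(-total_rate * hours)


# Hypothetical array: identical drives rated at 500,000 hours each.
DRIVE_MTBF_HOURS = 500_000.0
small_array = [DRIVE_MTBF_HOURS] * 10
large_array = [DRIVE_MTBF_HOURS] * 100

print(system_mtbf(small_array), system_mtbf(large_array))
print(prob_any_failure(small_array, 1000), prob_any_failure(large_array, 1000))
```

Under these assumptions, N identical components give a system MTBF of one N-th of the component MTBF, so the hypothetical 100-drive array would see a failure roughly every 5,000 hours even though each drive is rated at 500,000 hours.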
It may be desirable to minimize the effect of errors on data transmission between components and to improve performance by routing messages over an interconnection fabric. Multi-path interconnection fabrics can provide failover capabilities to networks. In an interconnection fabric including a network of nodes, multiple independent paths may exist between any two nodes in the network. After a message is initiated by a source node, the message may pass through multiple intermediate nodes before ultimately reaching the destination node.
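The failover property of a multi-path fabric can be sketched with a small example. The four-node ring below, the node names, and the `find_path` helper are all hypothetical; the breadth-first search is simply one convenient way to show that a second independent path remains usable when a node on the first path fails, and is not a routing method described in this document.

```python
from collections import deque


def find_path(links, source, dest, failed=frozenset()):
    """Return a shortest path from source to dest that avoids failed nodes.

    `links` maps each node to a list of its neighbors. Returns None when
    no usable path exists.
    """
    if source in failed or dest in failed:
        return None
    previous = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == dest:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in links[node]:
            if neighbor not in failed and neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical fabric: a ring A-B-C-D-A, giving two independent A-to-C paths.
RING = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}

print(find_path(RING, "A", "C"))                 # path through B
print(find_path(RING, "A", "C", failed={"B"}))   # fails over to the path through D
```

A shared bus offers no equivalent: with a single set of wires there is no alternate path to fall back on.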
Various methods are known for controlling the path that the message follows to reach the destination node. According to some methods, the entire path is predetermined by the source node and is stored in the message. Each intermediate node need only follow the routing instructions provided in the message. This type of routing method can have several disadvantages. First, this method typically requires that the source node have a detailed understanding of the network topology, as well as the network's current loads and conditions. Even with such an understanding of the network, this type of routing method may be ineffective at adapting to failures or delays that occur in the predetermined path after the message has been transmitted from the source node. In addition, storing the detailed routing instructions within the message may cause the size of the message to increase to an undesirable level.
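A minimal sketch of source-determined routing may make these disadvantages concrete. The message layout, the one-byte-per-hop header cost, and the names `SourceRoutedMessage`, `forward`, and `header_bytes` are hypothetical and are not taken from any particular protocol.

```python
from dataclasses import dataclass


@dataclass
class SourceRoutedMessage:
    payload: bytes
    route: list  # every remaining hop, fixed by the source before sending


def forward(message, start, failed):
    """Follow the stored route hop by hop.

    Intermediate nodes make no decisions; they only read the next hop.
    Returns (delivered, last_node_reached). A failure on the stored path
    stops the message, because no node is permitted to choose a detour.
    """
    node = start
    for next_hop in message.route:
        if next_hop in failed:
            return False, node
        node = next_hop
    return True, node


def header_bytes(message, bytes_per_hop=1):
    """Routing overhead carried in the message (assumed cost per hop)."""
    return len(message.route) * bytes_per_hop


msg = SourceRoutedMessage(payload=b"data", route=["B", "C"])
print(forward(msg, "A", failed=set()))    # reaches C
print(forward(msg, "A", failed={"B"}))    # stalls at A once B has failed
print(header_bytes(msg))
```

In this sketch the routing overhead grows linearly with path length, and a node that fails after the message leaves the source cannot be routed around, even if another path to the destination exists.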
According to various adaptive routing schemes, each message contains the destination address, but no detailed routing instructions. At each intermediate node encountered by the message along its route, the destination address is read and this information is used by the intermediate node to compute the optimum path for onward transmission. Thus, when a message is injected into the network, each intermediate node selects the next node to which to transmit the message, taking into account node failures, congestion and other conditions. This type of intelligent routing may be useful in some situations, but typically requires that each intermediate node be provided with the capability to monitor the node conditions and make appropriate decisions in response thereto. This can dramatically increase the cost of the system, particularly when a large number of nodes is used. In addition, because a message is typically processed by multiple nodes with each node making a determination as to the optimum path, a message between a particular source and destination node may take a different path every time it is injected. In some situations, it may be desirable to have a more predictable routing methodology.
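The per-hop decision made under adaptive routing can be sketched as follows. The candidate table, the use of queue depth as the congestion measure, and the function names are assumptions made for illustration; actual adaptive schemes may weigh different conditions.

```python
def choose_next_hop(candidates, failed, queue_depth):
    """Pick the least congested working neighbor, or None if none is usable."""
    usable = [node for node in candidates if node not in failed]
    if not usable:
        return None
    return min(usable, key=lambda node: queue_depth.get(node, 0))


def route(candidates_by_node, source, dest, failed, queue_depth, max_hops=16):
    """Carry only the destination; let every node choose the next hop."""
    path = [source]
    node = source
    while node != dest:
        if len(path) > max_hops:
            return None  # give up rather than loop indefinitely
        candidates = candidates_by_node[node][dest]
        node = choose_next_hop(candidates, failed, queue_depth)
        if node is None:
            return None
        path.append(node)
    return path


# Hypothetical four-node ring: from A, destination C is reachable via B or D.
CANDIDATES = {
    "A": {"C": ["B", "D"]},
    "B": {"C": ["C"]},
    "D": {"C": ["C"]},
}

print(route(CANDIDATES, "A", "C", failed=set(), queue_depth={"B": 5, "D": 1}))
print(route(CANDIDATES, "A", "C", failed=set(), queue_depth={"B": 0, "D": 3}))
print(route(CANDIDATES, "A", "C", failed={"D"}, queue_depth={"B": 5, "D": 1}))
```

The first two calls use the same source and destination yet take different paths because the load differs, which is the unpredictability noted above. The sketch also shows the cost: every node needs current failure and congestion information to make its choice.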