In a distributed computer network comprising a system of interconnected computer nodes, information comprising commands, responses and data frequently must be transmitted between two or more nodes and combinations of nodes in order to allow the various components of the network to interact. Generally, a so-called "bus" connects the various nodes and acts as a communications conduit linking them. Obviously, if the bus fails or becomes unavailable, communications between nodes will cease.
A "port", also called an "interface" or "adapter", is the mechanism through which a (host) computer or other device gains access to a bus for communicating with other computers and devices. A port includes a port processor, port buffer, and link components; the roles of these components are explained below.
A "node" comprises a host computer and at least one port; a node may also have or use multiple ports, and these ports may communicate with each other over the bus.
A "bus" is an interconnection between devices through which information may be transferred from one device to another; it includes a communication channel and associated components and control.
A "network" is a system of nodes interconnected via a common bus.
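By way of illustration only, the relationships among the terms defined above may be sketched as a small data model. The class and field names used here (Port, Node, Bus, Network, attach, add_node) are hypothetical labels chosen for the sketch and are not taken from any described embodiment; the port components are represented merely as names.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    """Mechanism through which a host gains access to the bus."""
    name: str
    # Components named in the definition; modelled here only as labels.
    components: tuple = ("port processor", "port buffer", "link")


@dataclass
class Node:
    """A host computer together with at least one port."""
    host: str
    ports: list

    def __post_init__(self):
        if not self.ports:
            raise ValueError("a node comprises at least one port")


@dataclass
class Bus:
    """Interconnection over which attached ports exchange information."""
    attached: list = field(default_factory=list)

    def attach(self, port):
        self.attached.append(port)


@dataclass
class Network:
    """A system of nodes interconnected via a common bus."""
    bus: Bus
    nodes: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes.append(node)
        for port in node.ports:
            self.bus.attach(port)
```

In this sketch a node with two ports simply attaches both ports to the same common bus, which is consistent with the statement that ports of one node may communicate with each other over the bus.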
For many applications, the consequences of a bus failure are serious, as intercomputer communications are essential; a primary design goal for such networks, therefore, is high bus availability and reliability. One approach which has been adopted in the past to enhance availability and reliability of the node-to-node interconnection in such networks is to provide two fully redundant bus paths running in parallel from node to node. That way, if one path is unavailable the other can be used. Full redundancy, as that term is used herein, implies not only the duplication of cabling between system nodes, but also, at each node and for each port at each node, completely separate processing circuitry and software for handling communications over each of the paths.
Most bus failures, however, are due not to unreliability of the transmit or receive processing circuitry or software in a node but, rather, to physical breakdown of a path--i.e., cabling, connectors, etc. Secondarily, failures occur in the circuitry closest to the cabling--e.g., line drivers and receivers. Consequently, full redundancy is needlessly expensive and needlessly complex. Moreover, full redundancy requires that the selection of a path for a particular exchange be made at a point within or very close to the host-interface connection, generally requiring awareness of, and perhaps even action by, the host's software. One way of addressing the burden this creates is to assign one path primary responsibility, and to use the other path(s) only when the primary path fails. That, however, permits failure of the back-up path(s) to go unnoticed for a long time, until the back-up is needed and found to be unavailable, at which point a total failure of communication ability occurs.
Full redundancy also does nothing to protect the integrity of the logical connection between nodes in the event of bus failure.
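The weakness of the primary/back-up arrangement described above may be illustrated with a minimal sketch. The names Path and send_primary_backup are hypothetical, and a path is modelled as nothing more than a health flag and an attempt counter; the sketch makes no claim about how any actual prior system was implemented.

```python
class Path:
    """One bus path, reduced to a health flag and a usage counter."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.attempts = 0

    def send(self, frame):
        # Returns True if the frame would have been delivered.
        self.attempts += 1
        return self.healthy


def send_primary_backup(paths, frame):
    """Try the primary path first; use later paths only on failure.

    Returns the name of the path that carried the frame, or None if
    every path failed.
    """
    for path in paths:
        if path.send(frame):
            return path.name
    return None


if __name__ == "__main__":
    primary, backup = Path("A"), Path("B")
    backup.healthy = False            # back-up breaks silently
    for i in range(100):
        send_primary_backup([primary, backup], i)
    print(backup.attempts)            # back-up was never exercised
    primary.healthy = False           # primary now fails as well
    print(send_primary_backup([primary, backup], "x"))
```

Because the back-up path is never exercised while the primary is healthy, its failure is discovered only at the moment the primary also fails, which is the total loss of communication noted above.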
Accordingly, it is an object of this invention to provide a high availability and high reliability computer interconnection system which is more efficient than a single bus path but less costly than a fully redundant bus system.
It is a further object of this invention to provide such a bus system in which each node contains a minimum of redundant parts dedicated to separate bus paths and a majority of parts shared by multiple bus paths, to provide higher reliability and availability without the cost of a fully redundant dual bus system.
Yet another object of this invention is to provide such an interconnection employing two (or more) bus paths wherein both the selection and operation of the bus path, and the failure of a bus path (with resulting switchover), need not be immediately visible to the host computers which communicate over the system.
Still another object of the invention is to provide a multiple path node interconnection network wherein the failure of a path during an exchange does not corrupt the logical connection between the communicating nodes.