Data communication in a computer network involves the exchange of data between two or more entities interconnected by communication links and subnetworks. These entities are typically software programs executing on hardware computer platforms, which, depending on their roles within the network, may serve as end stations or intermediate stations. Examples of intermediate stations include routers, bridges and switches that interconnect communication links and subnetworks; an end station may be a computer located on one of the subnetworks. More generally, an end station connotes a source of or target for data that typically does not provide routing or other services to other computers on the network. A local area network (LAN) is an example of a subnetwork that provides relatively short-distance communication among the interconnected stations; in contrast, a wide area network (WAN) facilitates long-distance communication over links provided by public or private telecommunications facilities.
End stations typically communicate by exchanging discrete packets or frames of data according to predefined protocols. In this context, a protocol represents a set of rules defining how the stations interact with each other to transfer data. Such interaction is simple within a LAN, since these are typically "multicast" networks: when a source station transmits a frame over the LAN, it reaches all stations on that LAN. If the intended recipient of the frame is connected to another LAN, the frame is passed over a routing device to that other LAN. Collectively, these hardware and software components comprise a communications network and their interconnections are defined by an underlying architecture.
Most computer network architectures are organized as a series of hardware and software levels or "layers" within each station. These layers interact to format data for transfer between, e.g., a source station and a destination station communicating over the network. Specifically, predetermined services are performed on the data as it passes through each layer, and the layers communicate with each other by means of the predefined protocols. This design permits each layer to offer selected services to other layers using a standardized interface that shields the other layers from the details of actual implementation of the services.
The lower layers of these architectures are generally standardized and implemented in hardware and firmware, whereas the higher layers are usually implemented in the form of software. Examples of such communications architectures include the Systems Network Architecture (SNA) developed by International Business Machines Corporation, the Internet communications architecture, and the Open Systems Interconnection (OSI) reference model proposed by the International Organization for Standardization (ISO).
The Internet architecture is represented by four layers termed, in ascending interfacing order, the network interface, internetwork, transport and application layers. These layers are arranged to form a protocol stack in each communicating station of the network. FIG. 1 illustrates the manner in which a pair of Internet protocol stacks 125, 175 transmit data between a source station 110 and a destination station 150 of a network 100. The stacks 125, 175 are physically connected through a communication channel 180 at the network interface layers 120 and 160. For exemplary purposes, the following discussion focuses on protocol stack 125.
In general, the lower layers of the protocol stack provide internetworking services and the upper layers, which utilize these services, collectively provide common network application services. The application layer 112 contains a variety of functions commonly needed by software processes executing on the station, while the lower network interface layer 120 of the Internet architecture implements industry standards defining a flexible network architecture oriented to the implementation of LANs.
Specifically, network interface layer 120 comprises physical and data link sublayers. The physical layer 126 controls the actual transmission of signals across the communication channel 180, defining the types of cabling, plugs and connectors used in connection with the channel. The data link layer, on the other hand, is responsible for transmission of data from one station to another and may be further divided into two sublayers: logical link control (LLC) 122 and media access control (MAC) 124. The MAC sublayer 124 is primarily concerned with controlling access to the transmission medium in an orderly manner and, to that end, defines procedures by which the stations must abide in order to share the medium. The LLC sublayer 122 manages communications between devices over a single link of the network and provides for environments that require connectionless or connection-oriented services at the data link layer.
Connection-oriented services at the data link layer generally involve three distinct phases: connection establishment, data transfer and connection termination. During connection establishment, a single path or connection, e.g., an IEEE 802.2 LLC Type 2 or "Data Link Control" (DLC) connection, is established between the source and destination stations. Data is then transferred sequentially over the path and, when the DLC connection is no longer needed, the path is terminated. The details of connection establishment and termination are well-known and require no elaboration.
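The three phases can be illustrated with a minimal sketch. The class and method names below (DlcConnection, establish, send, terminate) are hypothetical and chosen only for illustration; the sketch models the ordering of the phases and the sequential delivery of data, not the actual IEEE 802.2 LLC Type 2 frame exchange.

```python
from enum import Enum, auto


class LinkState(Enum):
    DISCONNECTED = auto()
    CONNECTED = auto()


class DlcConnection:
    """Toy connection-oriented link: establish, transfer in order, terminate."""

    def __init__(self):
        self.state = LinkState.DISCONNECTED
        self.next_seq = 0      # sequence number assigned to the next frame
        self.delivered = []    # (sequence number, payload) in delivery order

    def establish(self):
        # Phase 1: connection establishment.
        if self.state is not LinkState.DISCONNECTED:
            raise RuntimeError("connection already established")
        self.state = LinkState.CONNECTED
        self.next_seq = 0

    def send(self, payload):
        # Phase 2: data transfer, allowed only on an established connection.
        if self.state is not LinkState.CONNECTED:
            raise RuntimeError("no connection: call establish() first")
        self.delivered.append((self.next_seq, payload))
        self.next_seq += 1

    def terminate(self):
        # Phase 3: connection termination.
        if self.state is not LinkState.CONNECTED:
            raise RuntimeError("no connection to terminate")
        self.state = LinkState.DISCONNECTED


conn = DlcConnection()
conn.establish()
conn.send(b"first")
conn.send(b"second")
conn.terminate()
print(conn.delivered)
```

Attempting to send after termination raises an error in this sketch, which mirrors the requirement that data flow only while the connection exists.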
The transport layer 114 and the internetwork layer 116 provide predefined sets of services to aid in connecting the source station to the destination station when establishing application-to-application communication "sessions." The primary network-layer protocol of the Internet architecture is the Internet Protocol (IP), which is contained within the internetwork layer 116. IP is primarily a connectionless network protocol that provides for internetwork routing, fragmentation and reassembly of exchanged frames--usually called "datagrams" in an Internet environment--and which relies on transport protocols for end-to-end reliability. An example of such a transport protocol is the Transmission Control Protocol (TCP), which is contained within the transport layer 114 and provides connection-oriented services to the upper layer protocols of the Internet architecture. The term TCP/IP is commonly used to denote this architecture.
Data transmission over the network 100 therefore consists of generating data in, e.g., a sending process 104 executing on the source station 110, passing that data to the application layer 112 and down through the layers of the protocol stack 125, where the data are sequentially formatted as a frame for delivery onto the channel 180 as bits. Those frame bits are then transmitted over an established connection of channel 180 to the protocol stack 175 of the destination station 150, where they are passed up that stack to a receiving process 174. Data flow is schematically illustrated by solid arrows in FIG. 1.
Although actual data transmission occurs vertically through the stacks, each layer is programmed as though such transmission were horizontal. That is, each layer in the source station 110 is configured to transmit data to its corresponding layer in the destination station 150, as schematically shown by the dashed arrows in FIG. 1. To achieve this effect, each layer of the protocol stack 125 in the source station 110 typically adds information (in the form of a header field) to the data frame generated by the sending process as the frame descends the stack; the header-containing frame is said to be "encapsulated." At the destination station 150, the various headers are stripped off one-by-one as the frame propagates up the layers of the stack 175 until it arrives at the receiving process.
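The encapsulation described above may be sketched as follows. The bracketed text headers are purely illustrative assumptions; real protocol headers are binary structures defined by each layer's protocol. The sketch shows only the order in which headers are added at the source and removed at the destination.

```python
# Layers of the Internet architecture, listed in the order a frame
# descends the stack at the source station.
LAYERS = ["application", "transport", "internetwork", "network-interface"]


def header_for(layer: str) -> bytes:
    # Hypothetical placeholder header; not a real protocol format.
    return f"[{layer}]".encode()


def encapsulate(data: bytes) -> bytes:
    """Prepend one header per layer as the data descends the stack."""
    frame = data
    for layer in LAYERS:
        frame = header_for(layer) + frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Strip headers one by one as the frame ascends the stack."""
    for layer in reversed(LAYERS):
        header = header_for(layer)
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame


wire_frame = encapsulate(b"hello")
print(wire_frame)
print(decapsulate(wire_frame))
```

The header added last (by the network interface layer) is outermost on the wire, so it is the first to be removed at the destination station.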
In a typical mainframe-oriented network architecture, applications executing on end stations typically access the network through logical units (LUs), i.e., sets of logical services facilitating communication; accordingly, in such a network, a communication session connects two LUs in a LU--LU session. Activation and deactivation of such a session may be accomplished by Advanced Peer-to-Peer Networking (APPN) functions.
The APPN functions generally include session establishment and session routing within an APPN network. FIG. 2 shows a conventional APPN network 200 comprising two end stations 202, 212, which are typically configured as end nodes (ENs), coupled to token ring (TR) subnetworks 204, 214, respectively. During session establishment, an EN (such as EN 202) requests an optimum route for a session between two LUs, i.e., its own and that of the destination station; this route is calculated and conveyed to EN 202.
Intermediate session routing occurs when the intermediate stations 206, 216, configured as APPN network nodes (NNs), are present in a session between the two end nodes. The APPN network nodes are further interconnected by a WAN 210 that extends the APPN architecture throughout the network. The APPN network nodes forward packets of a LU--LU session over the calculated route between the two APPN end nodes. An APPN network node is a full-functioning APPN router having all APPN base service capabilities, including session-services functions. An APPN end node, on the other hand, is capable of performing only a subset of the functions provided by an APPN network node. APPN network and end nodes are well-known and are described in detail in, for example, Systems Network Architecture Advanced Peer to Peer Networking Architecture Reference (IBM Doc. SC30-3422) and J. Nilausen, APPN Networks (1994).
FIG. 3 illustrates the software architecture of a conventional APPN node 300. An application 302 executing on an APPN node acting as an end node (in the manner of, e.g., EN 202 of network 200) communicates with another end node (e.g., EN 212) through a LU--LU session; the LU 304 within each end node functions as both a logical port for the application to access the network and as an end point of the communication session. The data exchange comprising the session generally passes through a path control module 312 and a data link control (DLC) module 316 of the node, the latter of which connects to various network transmission media.
When the APPN node 300 functions as an APPN router node (in the manner of, e.g., NN 206), an intermediate session routing (ISR) module 305 maintains a portion of the session in each "direction" with respect to an adjacent network node (e.g., NN 216 of network 200). During session establishment, path control 312 and ISR 305 are invoked to allocate resources for the session. With reference to FIG. 2, each NN 206, 216 allocates a local form session identifier for each direction of the session. Collectively, these individually established "local" sessions form the logical communication session between the LUs of the end nodes 202, 212.
When initiating a session, the application 302 specifies a mode name that is distributed to all APPN network nodes; the LU 304 in each node uses the mode name to determine the set of required characteristics for the session being established. Specifically, the mode name is used by the control point (CP) module 308 of each APPN node 300 to find a corresponding class of service (COS) as defined in a COS table 310. The CP coordinates performance of all APPN functions within the node, including management of the COS table 310. The COS definition in table 310 includes a priority level specified by transmission priority (TP) information 320 for the packets transferred over the session; as a result, each APPN network node is apprised of the priority associated with the packets of a LU--LU session. The SNA architecture specifies four TP levels: network priority, high priority, medium priority and low priority. Path control 312 maintains a plurality of queues 314, one for each TP level, for transmitting packets onto the transmission media via DLC 316.
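The relationship among mode name, COS table and per-priority queues may be sketched as follows. The mode names, COS names and table contents are hypothetical, and the strict-priority service order in next_packet is an assumption made for illustration; the description above specifies only that one queue exists per TP level.

```python
from collections import deque
from enum import IntEnum


class TP(IntEnum):
    """The four SNA transmission priority levels, highest first."""
    NETWORK = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# Hypothetical tables: mode name -> COS name -> transmission priority.
MODE_TO_COS = {"interactive-mode": "cos-interactive", "batch-mode": "cos-batch"}
COS_TABLE = {"cos-interactive": TP.HIGH, "cos-batch": TP.LOW}


class PathControl:
    """Keeps one transmit queue per TP level."""

    def __init__(self):
        self.queues = {tp: deque() for tp in TP}

    def enqueue(self, mode_name, packet):
        # Resolve the session's mode name to its priority via the COS table.
        tp = COS_TABLE[MODE_TO_COS[mode_name]]
        self.queues[tp].append(packet)

    def next_packet(self):
        # Assumed policy: always serve the highest non-empty priority first.
        for tp in TP:
            if self.queues[tp]:
                return self.queues[tp].popleft()
        return None


pc = PathControl()
pc.enqueue("batch-mode", "bulk-1")
pc.enqueue("interactive-mode", "keystroke-1")
pc.enqueue("batch-mode", "bulk-2")
order = [pc.next_packet() for _ in range(3)]
print(order)
```

Under the assumed policy, the interactive packet is transmitted ahead of the batch packets even though it was queued after the first of them.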
Data link switching (DLSw) is a forwarding mechanism over an IP backbone network, such as the Internet. In traditional bridging, the data link connection is end-to-end, i.e., effectively continuous between communicating end stations; a frame originating on a source LAN traverses one or more bridges specified in the path over the LLC connection to the destination LAN. In a system implementing DLSw, by contrast, the LLC connection terminates at the first DLSw bridge or router. The DLSw device multiplexes the LLC connections onto a transport connection to another DLSw bridge or router. In this way, the individual LLC connections do not cross a wide-area network, thereby reducing traffic across this network; the LLC connections from the source LAN to the transmitting data link switch, and from the receiving data link switch to the destination LAN, are entirely independent from one another. Data link switching may be implemented on multi-protocol routers capable of handling DLSw as well as conventional (e.g., source-route bridging) frames. The DLSw forwarding mechanism is well-known and described in detail in Wells et al., Request for Comments (RFC) 1795 (1995).
In particular, a heterogeneous DLSw network is formed when two data link switches interconnect end nodes of the APPN network by way of the IP network; the switches preferably communicate using a switch-to-switch protocol (SSP) that provides packet "bridging" operations at the LLC (i.e., DLC) protocol layer. FIG. 4 illustrates a conventional DLSw network 400 comprising two exemplary DLSw switches 406, 416 interconnecting the ENs 402, 412 via an IP network 410. In accordance with the DLSw scheme, a lower-layer DLC connection is established between each EN 402, 412 and the corresponding data link switch 406, 416, where the connection terminates. In order to provide a complete end-to-end connection between the end nodes, the DLC connections are carried over a reliable, higher-layer transport mechanism, such as TCP sessions. Data link switches can establish multiple, parallel TCP sessions using known port numbers; all packets associated with a particular DLC connection typically follow a single, designated TCP session. Accordingly, data frames originating at a sending EN 402 are transmitted over a particular DLC connection along TR 404 to switch 406, where they are encapsulated within a designated TCP session as packets and transported over IP network 410. The packets are received by switch 416, decapsulated to their original frames, and transmitted over a corresponding DLC connection of TR 414 to EN 412 in the order received from EN 402 by switch 406. The communicating switches 406, 416 are usually called "peers."
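The multiplexing of several DLC connections onto one transport session, with per-connection ordering preserved, may be sketched as follows. The six-byte header (a circuit identifier and a payload length) is a hypothetical format invented for this illustration; it is not the SSP message header defined in RFC 1795, and the byte string stands in for a TCP byte stream.

```python
import struct
from collections import defaultdict

# Hypothetical header: 4-byte circuit id, 2-byte payload length (network order).
HEADER = struct.Struct("!IH")


def mux(frames):
    """Sending switch: wrap each (circuit_id, payload) frame and
    concatenate the results into one ordered byte stream."""
    return b"".join(HEADER.pack(cid, len(payload)) + payload
                    for cid, payload in frames)


def demux(stream):
    """Receiving switch: recover the frames of each circuit in the
    order in which they entered the stream."""
    per_circuit = defaultdict(list)
    offset = 0
    while offset < len(stream):
        cid, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        per_circuit[cid].append(stream[offset:offset + length])
        offset += length
    return dict(per_circuit)


# Frames from two DLC connections interleaved on a single session.
stream = mux([(1, b"a1"), (2, b"b1"), (1, b"a2")])
print(demux(stream))
```

Because the transport session delivers bytes in order, the frames of each circuit emerge at the receiving switch in the same order in which the sending switch accepted them.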
Each data link switch typically maintains a list of DLSw-capable routers (i.e., routers capable of acting as, and interacting with, data link switches). After the TCP connection is established, SSP messages are exchanged to establish the capabilities of the two communicating switches. Once this "capabilities exchange" is complete, the switches employ SSP messages to establish end-to-end circuits over the transport connection, and thereafter to exchange data.
In FIG. 4, end nodes 402, 412 are equivalent APPN nodes. This arrangement, while typical, is not universal. Some networks are organized hierarchically to accommodate different classes of station. For example, a data server such as a credit-card authorization center may maintain a very large database of information for access by many (perhaps thousands) of remote end stations. In such circumstances, the DLSw peers are segmented into two classes: remote peers for connecting to remote end stations, and data-center peers for aggregating data traffic at the data center. Thus, a hierarchical network of this type will usually have a large number of remote peers and a relatively small number of data-center peers.
Each data-center peer can simultaneously accommodate (i.e., exchange data with) only a limited number of remote peers. Furthermore, because of the need for high reliability and constant availability of access to the data center by the remote end stations, the network usually includes backup data-center peers (in addition to the minimum number of "primary" data-center peers necessary to routinely handle all the remote peers). In this way, if one of the primary peers fails, or is nearing capacity and exhibiting diminished throughput, the backup peer can be activated to handle the traffic. Peer redundancy improves network stability and prevents excessive switching times.
One method of accommodating backup peers is for a backup peer to be assigned to each primary peer, with each remote peer maintaining a simultaneous connection to the data-center primary peer and its designated backup peer. If the primary peer fails, traffic continues through the backup peer. This arrangement, of course, is expensive in requiring a dedicated backup peer for each primary peer, and wasteful in that the backup peer may be unneeded much (if not most) of the time. Moreover, if both the primary and the backup are for some reason unavailable (e.g., the primary is overloaded and the backup has failed), the connection from the remote peer will be refused, and the remote end station will effectively be excluded from access to the data center.
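The dedicated-backup arrangement reduces to a simple selection rule at the remote peer, sketched below. The function name and its boolean arguments are hypothetical; how availability is detected is outside the scope of the sketch.

```python
def select_data_center_peer(primary_available, backup_available):
    """Return which data-center peer carries the traffic, or None when
    the connection is refused because neither peer is available."""
    if primary_available:
        return "primary"
    if backup_available:
        return "backup"
    return None


for primary_up, backup_up in [(True, True), (False, True), (False, False)]:
    choice = select_data_center_peer(primary_up, backup_up)
    print(primary_up, backup_up, "->", choice)
```

The final case, in which both peers are unavailable, corresponds to the remote end station being excluded from the data center.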
Efforts to use a single data-center peer as a backup for multiple primary peers do not necessarily reduce overall costs. Even with this type of arrangement, each backup peer must have sufficient memory and computational capacity to handle the load should all primary peers to which it is assigned simultaneously fail. Moreover, assuming constantly active backup peers (so that each remote peer constantly maintains connections to a primary and a backup peer), the bandwidth load is unchanged regardless of the number of backup peers handling the traffic.
Backup arrangements can also be envisioned that do not require simultaneous two-way transport connections between a remote peer and both a primary and a backup data-center peer. For example, the backup can be configured to accept the circuit upon detection of a failure condition in the primary. But even this approach wastes bandwidth and computational resources, since the backup must effectively be in constant communication with the primary in order to facilitate seamless transition to the new circuit; for example, the backup must be able to pick up the frame count for each LLC session acquired from the failed primary. The net effect may be little different from simultaneously active peer-to-peer connections.