Computers and other devices are commonly interconnected to facilitate communication among one another using any one of a number of available standard network architectures and any one of several corresponding and compatible network protocols. The nature of standard architectures and their topologies is typically dictated at the first two layers of the OSI (Open Systems Interconnection) Basic Reference Model for networks, which are the physical layer (layer 1) and the data link layer (layer 2). One of the most commonly employed of such standard architectures is the Ethernet® network architecture. Other types of network architectures that are less widely used include ARCnet, Token Ring and FDDI. Variations of the Ethernet® standard are differentiated from one another based on characteristics such as maximum throughput (i.e. the highest data transmission rate) of devices coupled to the network, the type of medium used for physically interconnecting the devices (e.g. coaxial cable, twisted pair cable, optical fibers, etc.) and the maximum permissible length of the medium.
The 10Base-T and 100Base-T Ethernet® standards, for example, designate a maximum throughput of 10 and 100 megabits per second (Mbps) respectively, with devices coupled to the network over twisted pair cable. The 1000Base-T (or Gigabit) Ethernet® standard designates a maximum throughput of 1000 Mbps (i.e. one gigabit per second) over twisted pair cable. Recent advances in the speed of integrated circuits have facilitated the development of even faster variations of the Ethernet® network architecture, such as one operating at 10 gigabits per second (10 Gbps), for which the transmission medium is typically optical fiber. Of course, the greater the throughput, the more expensive the network resources required to sustain that throughput. Ethernet® is a registered trademark of Xerox Corporation.
Packet switched network protocols are commonly employed with a number of architectures such as the Ethernet® standard. These protocols are typically defined by layers 3 and 4 of the OSI model and dictate the manner in which data to be transmitted between devices coupled to the network are formatted into packets for transmission. These protocols are independent of the architecture and topology by virtue of their separation as hierarchical layers of the OSI model. Examples of such protocols include Transmission Control Protocol/Internet Protocol (TCP/IP), Internetwork Packet eXchange (IPX), NetBEUI and the like. NetBEUI is short for NetBIOS Enhanced User Interface, and is an enhanced version of the NetBIOS protocol used by network operating systems such as LAN Manager, LAN Server, Windows® for Workgroups, Windows® 95 and Windows NT®. Windows® and Windows NT® are registered trademarks of Microsoft Corporation. NetBEUI was originally designed by IBM for IBM's LAN Manager Server and later extended by Microsoft and Novell. TCP/IP is typically used in Internet applications, or in intranet applications such as a local area network (LAN). The data packets received through a network resource of the destination device are processed in reverse according to the selected protocol to reassemble the payload data contained within the received packets. In this manner, computers and other devices can share information in accordance with these higher level protocols over the common network.
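By way of illustration only, the formatting of data into packets and the reverse processing at the destination can be sketched as follows. This is a minimal sketch and not any particular protocol: the `Packet` fields, the `packetize` and `reassemble` names and the `max_data` parameter are hypothetical and are chosen only to show the principle.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet: a sequence number, a packet count and a payload slice."""
    seq: int      # position of this fragment within the original payload
    total: int    # number of packets the payload was split into
    data: bytes   # the fragment of payload carried by this packet


def packetize(payload: bytes, max_data: int) -> list[Packet]:
    """Split a payload into packets carrying at most max_data bytes each."""
    if max_data <= 0:
        raise ValueError("max_data must be positive")
    # An empty payload still produces one (empty) packet.
    chunks = [payload[i:i + max_data] for i in range(0, len(payload), max_data)] or [b""]
    return [Packet(seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets: list[Packet]) -> bytes:
    """Rebuild the payload from packets that may have arrived out of order."""
    ordered = sorted(packets, key=lambda p: p.seq)
    complete = (
        bool(ordered)
        and len(ordered) == ordered[0].total
        and all(p.seq == i for i, p in enumerate(ordered))
    )
    if not complete:
        raise ValueError("missing or duplicate packets")
    return b"".join(p.data for p in ordered)


if __name__ == "__main__":
    sent = packetize(b"hello, network", max_data=4)
    # Deliver the packets in reverse order to show that ordering is restored.
    print(reassemble(list(reversed(sent))))
```

The sequence number carried in each packet is what allows the destination to restore the payload regardless of arrival order. Real protocols add addressing, checksums and retransmission, none of which are modeled here.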
One of the most basic and widely implemented network types is the Local Area Network (LAN). In its simplest form, a LAN is a number of devices (e.g. computers, printers and other specialized peripherals) connected to one another by some form of signal transmission medium such as coaxial cable to facilitate direct peer-to-peer communication therebetween. A common network paradigm, often employed in LANs as well as other networks, is known as the client/server paradigm. This paradigm involves coupling one or more large computers (typically having very advanced processing and storage capabilities), known as servers, to a number of smaller computers (such as desktops or workstations), known as clients, and to other peripheral devices shared by the computers. The clients send requests over the network to the one or more servers to facilitate centralized information storage and retrieval through programs such as database management and application programs stored on the server(s). Servers may also be used to provide centralized access to other networks and various other services as are known to those of skill in the art. The servers return responses to the clients over the network. Clients and/or servers can also share access to peripheral resources, such as printers, scanners, and the like over the network.
LANs are sometimes coupled together to form even larger networks, such as wide area networks (WANs), or they may be coupled to the Internet. LANs may also be segmented into logical sub-networks called segments. This can be accomplished through the use of multiple switches that do not communicate with one another (i.e. they are noncontiguous) or through the creation of virtual LANs (VLANs). The isolation between VLANs and a particular network device's access to the segments are controlled by a switch that can be programmed in real time to couple network resources of that device to one, some or all of the VLAN segments.
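The role of a programmable switch in isolating VLAN segments can be illustrated with a short sketch. It assumes a simplified model in which two ports can exchange frames only if they share at least one VLAN; the `VlanSwitch` class, its methods and the port names are hypothetical.

```python
class VlanSwitch:
    """Hypothetical switch model: each port is a member of zero or more VLANs."""

    def __init__(self):
        self.membership = {}  # port name -> set of VLAN identifiers

    def assign(self, port, vlans):
        """(Re)program the set of VLANs to which a port is coupled."""
        self.membership[port] = set(vlans)

    def can_communicate(self, port_a, port_b):
        """Two ports can exchange frames only if they share a VLAN."""
        shared = self.membership.get(port_a, set()) & self.membership.get(port_b, set())
        return bool(shared)


if __name__ == "__main__":
    switch = VlanSwitch()
    switch.assign("server_nic", {10, 20})   # server port coupled to both segments
    switch.assign("client_a", {10})
    switch.assign("client_b", {20})
    print(switch.can_communicate("server_nic", "client_a"))
    print(switch.can_communicate("client_a", "client_b"))
    # Reprogramming the switch at run time changes which segments a port sees.
    switch.assign("client_b", {10})
    print(switch.can_communicate("client_a", "client_b"))
```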
For a given network architecture such as Ethernet®, various network topologies may be implemented. A network topology simply defines the manner in which the various network devices are physically interconnected. For example, the simplest topology for an Ethernet® LAN is a bus network. A bus network couples all of the devices to the same transmission medium (e.g. cable, optical fiber, etc.). One manner in which this is commonly accomplished is through use of a T-connector and two cables to connect one device to T-connectors coupled to each of its two neighbors on the network. The problem with the bus network approach is that if the interface for one of the devices fails or if one of the devices is removed from the network, the network bus must be reconnected to bypass the missing or malfunctioning device or the network is broken.
A better approach is to use a star topology, where all of the network devices are coupled together through a device such as a concentrator. A concentrator acts to consolidate all of the network connections to a single point, and is able to combine signals received from slower devices to communicate with a device capable of supporting a higher throughput. Thus, requests coming from several clients may be combined and sent to a server if the server has the ability to handle the higher data rate of the combined signals. Each of the network devices is coupled through one connector to the concentrator, and if any one of the devices is removed from the network, the other devices can continue to communicate with one another over the network without interruption.
Another topology that may be used when higher bandwidth is desired is a hub network. A hub network is similar to the bus network described above in that it involves a single connective medium through which a number of devices are interconnected. The difference is that for a hub network, the devices coupled to the single connector are hub devices rather than single network devices. Each hub device can couple a large number of network devices to the single connector. The single connector, called a backbone or core switch, can be designed to have a very high bandwidth sufficient to handle the confluence of data from all of the hubs.
Network interface resources are required to couple computers and other devices to a network. These interface resources are sometimes referred to as network adapter cards or network interface cards (NICs), each adapter card or NIC having at least one port through which a physical link is provided between the network transmission medium and the processing resources of the network device. Data is communicated (as packets in the case of packet switched networks) from the processing resources of one network device to the other. The data is transmitted and received through these interface resources and over the media used to physically couple the devices together. Adapter cards or NICs are commercially available that are designed to support one or more variations of standard architectures and known topologies.
Each of the network devices typically includes a bus system through which the processing resources of the network devices may be coupled to the NICs. The bus system is usually coupled to the pins of edge connectors defining sockets for expansion slots. The NICs are coupled to the bus system of the network device by plugging the NIC into the edge connector of the expansion slot. In this way, the processing resources of the network devices are in communication with any NICs or network adapter cards that are plugged into the expansion slots of that network device. As previously mentioned, each NIC or network adapter must be designed in accordance with the standards by which the network architecture and topology are defined to provide appropriate signal levels and impedances (i.e. the physical layer) to the network. This of course includes an appropriate physical connector for interfacing the NIC to the physical transmission medium employed for the network (e.g. coaxial cable, twisted-pair cable, fiber optic cable, etc.).
It is desirable that certain connections (e.g. access by clients to network server(s)) be as reliable as possible. It is also desirable that some network devices (e.g. network server(s)) be able to receive and respond to numerous incoming requests from other devices on the network (such as clients) as quickly as possible. As processing speed continues to increase and memory access time continues to decrease for a network device such as a server, the bottleneck for device throughput becomes pronounced at the interface to the network. While network architectures and associated network adapters are being designed to handle ever-increasing throughput rates, the price for implementing interface resources supporting the highest available throughput is not always cost-effective.
In light of the foregoing, it has become common to improve the reliability and throughput of a network by coupling some or all of the network devices to the network through redundant network resources. These redundant links to the network may be provided as a plurality of single-port NICs, one or more NICs each having more than one port or a combination thereof. Teaming of network interface resources is particularly common for servers, as the demand for throughput and reliability is typically greatest for servers on a network. Resource teams are typically two or more NICs (actually two or more NIC ports) logically coupled in parallel to appear as a single virtual network adapter to the other devices on the network. These resource teams can provide aggregated throughput of data transmitted to and from the network device employing the team and/or fault tolerance (i.e. resource redundancy to increase reliability).
Fault tolerant teams of network resources commonly employ two or more network adapter or NIC ports, one port being “active” and configured to operate as the “primary,” while each of the other members of the team is designated as “secondary” and is configured to operate in a “standby” mode. A NIC or NIC port in standby mode remains largely idle (it is typically only active to the limited extent necessary to respond to system test inquiries to indicate to the team that it is still operational) until activated to replace the primary when the primary has failed. In this way, interruption of a network connection to a critical server may be avoided notwithstanding the existence of a failed network adapter card or port.
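The primary/standby behavior described above can be sketched as a small state machine. This is a minimal sketch under stated assumptions: the `FaultTolerantTeam` class, the `report_link` method and the choice to promote the first operational standby in configured order are all hypothetical, as is the decision not to fail back automatically when a former primary recovers.

```python
class FaultTolerantTeam:
    """Hypothetical fault tolerant team: one primary port, the rest on standby."""

    def __init__(self, ports):
        if not ports:
            raise ValueError("a team needs at least one port")
        # Insertion order doubles as the (assumed) order of preference.
        self.link_up = {port: True for port in ports}
        self.primary = ports[0]

    def report_link(self, port, up):
        """Record a link-state change, as a periodic test inquiry might report it."""
        self.link_up[port] = up
        if port == self.primary and not up:
            self._fail_over()
        elif self.primary is None and up:
            # The team had no working member; the recovered port takes over.
            self.primary = port

    def _fail_over(self):
        """Promote the first operational standby, or None if every member is down."""
        self.primary = next((p for p, up in self.link_up.items() if up), None)


if __name__ == "__main__":
    team = FaultTolerantTeam(["nic0", "nic1", "nic2"])
    team.report_link("nic0", False)   # the primary fails
    print(team.primary)               # a standby has been promoted
```

Other devices on the network continue to address the team as a single virtual adapter; only the member that carries the traffic changes.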
Load-balancing teams of network resources combine one or more additional network adapters or NICs to increase the aggregate throughput of data traffic between the network and the device. In the case of “transmit” load balancing (TLB) teams, throughput is aggregated for data transmitted from the device to the network. The team member configured to operate as primary, however, handles all of the data received by the team. In the case of “switch-assisted” load balancing (SLB) teams, throughput is balanced over all team members for data transmitted to the network as in TLB teams as well as data received by the team from the network. Typically, the received data is balanced with the support of a switch that is capable of performing load balancing of data destined for the team.
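The asymmetry of a TLB team, in which transmissions are spread over all members while receptions arrive only through the primary, can be sketched as follows. The balancing rule shown (a CRC-32 hash of the destination address taken modulo the team size) is an assumption chosen for illustration and is not a statement of how any particular teaming driver selects a port; the `TlbTeam` class and its method names are likewise hypothetical.

```python
import zlib


class TlbTeam:
    """Hypothetical transmit load balancing team."""

    def __init__(self, members, primary):
        if primary not in members:
            raise ValueError("the primary must be a member of the team")
        self.members = list(members)
        self.primary = primary

    def transmit_port(self, dest_mac):
        """Pick the member that transmits frames for a given destination.

        Hashing the destination keeps all frames of one conversation on the
        same member, so they are not reordered in transit.
        """
        index = zlib.crc32(dest_mac.encode("ascii")) % len(self.members)
        return self.members[index]

    def receive_port(self):
        """In a TLB team, all received data arrives through the primary."""
        return self.primary


if __name__ == "__main__":
    team = TlbTeam(["nic0", "nic1"], primary="nic0")
    for mac in ("00:11:22:33:44:55", "00:11:22:33:44:56"):
        print(mac, "->", team.transmit_port(mac))
    print("receive via", team.receive_port())
```

An SLB team would differ in that the attached switch, rather than the primary alone, distributes received traffic over the members; that switch-side behavior is not modeled here.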
Load-balancing teams employ various algorithms by which network traffic through the team is balanced between the two or more network adapter cards, with transmit load-balancing algorithms usually residing in the transmitting network device, and the receive data load-balancing algorithm residing in the switch to which the team is coupled. Load-balancing teams inherently provide fault tolerance, but most commonly at a lower aggregate throughput than the fully functional team. Employing multiple network resources in tandem can enable a server to meet increasing demands for throughput where one NIC or NIC port would have become saturated (i.e. reached its maximum throughput) without meeting all of the demand. This can happen at a server NIC or NIC port, for example, as more client computers are added to a growing network or as processing capability of existing clients is upgraded, leading to an increase in the rate of client requests and responses to and from the server.
Certain configurations for network fault tolerant (NFT) and TLB teams are designed to achieve switch redundancy in a network. This means that the NICs of a team are distributed across two or more switches. A NIC team that is attached to a network must still have all members of the team belong to the same broadcast domain (i.e. the same layer 2 network). In other words, all NICs have to be able to see each other's broadcasts. This is required so that the team knows that all team members can communicate with the same set of clients. Thus, these switch-redundant configurations require that the switches ultimately be interconnected in some way, either directly or by way of uplinks to a third switch (e.g. a backbone or core switch).
In a switch redundant configuration as described above, each path of the contiguous layer 2 network segment has at least one switch that serves a different group of clients or other network devices. If one of the switches fails, then the team will fail over to (i.e. assign as a new primary) one of the other NIC members still attached to a functioning switch. It is possible, however, for this type of configuration to suffer a failure in an uplink to the core switch rather than a switch itself. In this case, team members can become isolated on newly created LAN segments that are no longer contiguous with the switch path coupled to the current primary member of the team. If the team becomes split between two or more different network segments as the result of such a failure, the clients on the isolated network segments (the ones to which the primary is not coupled) will no longer be able to communicate with the team. This is because an NFT and a TLB team receive data for the entire team only through the primary member (for the NFT team, the primary transmits data for the entire team as well). Because there is only one primary member per team, only those paths still contiguous with the path coupled to the primary team member will have communication with the team and therefore the server.
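The effect of an uplink failure on a switch-redundant team can be illustrated by treating the layer 2 network as a graph and asking which clients remain on the same contiguous segment as the primary. The topology, the node names and the helper functions below are hypothetical. Each team member is modeled as its own node because the team does not forward frames between its members.

```python
from collections import deque


def reachable(links, start):
    """Return every node on the same contiguous layer 2 segment as start."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over the surviving links
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def clients_reaching_primary(links, primary, clients):
    """Clients that can still communicate with the team through its primary."""
    segment = reachable(links, primary)
    return {client for client in clients if client in segment}


# Hypothetical topology: two team members on two switches, both uplinked to a core.
LINKS = [
    ("nic_a", "switch_a"), ("nic_b", "switch_b"),
    ("switch_a", "core"), ("switch_b", "core"),
    ("client_1", "switch_a"), ("client_2", "switch_b"), ("client_3", "core"),
]
CLIENTS = ["client_1", "client_2", "client_3"]

if __name__ == "__main__":
    # Fail the uplink between switch_b and the core while nic_a is primary.
    broken = [link for link in LINKS if link != ("switch_b", "core")]
    print(sorted(clients_reaching_primary(broken, "nic_a", CLIENTS)))
```

In this sketch, `client_2` is stranded although both switches and both team members are still working, because its segment is no longer contiguous with the path coupled to the primary. A failure of the uplink on the primary's own path is more severe: the core switch and every non-primary path are cut off from the team.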
If no redundant connection is available between switches in the isolated paths by which to bypass the fault in the connection to the core switch, the clients on the isolated path(s) will be isolated and lose connectivity to the computer system and possibly the core network. If the failure occurs in the primary path, the core switch itself becomes isolated from the computer system as well as all of the non-primary paths. In the past the only way connectivity could be restored was through physical intervention by a user to repair the fault in the connection. There was no automated recovery process by which connectivity to the server could be restored until the fault in the uplink was repaired.