1. Field of the Invention
The present disclosure relates generally to packet network devices such as switches and routers, and more particularly to a virtual packet network device architecture that recovers from the failure of any single route processor module without the loss of the network device functionality.
2. Description of Related Art
Packet network devices direct data packets traveling across a network between data sources and destinations. Packet network devices can perform “routing” or “switching” depending on the header information and networking techniques used to direct the data packets, and a single packet network device may be configured to perform both switching and routing. Such devices are referred to herein as a “packet switch” with the understanding that this term encompasses a wide variety of packet forwarding capabilities.
FIG. 1A is a high-level block diagram of an exemplary packet switch 100. The switch comprises some number of line cards (LC), LC1-LCn, one or more switch fabric cards (SF), and one route processor module (RPM) 110. Each line card LC receives ingress data traffic from and transmits egress data traffic over network links to peer devices through bi-directional ports. The ports can be configured for different electrical or optical media via the use of different line card types, different port interface modules, and/or different pluggable optics modules.
Continuing to refer to FIG. 1A, for most ingress packet traffic on each line card LC, a line card packet processor examines a packet, determines one or more switch egress ports for the packet, and queues the packet for transmission through the switch fabric when possible. For most egress packet traffic on each line card LC, the line card queues the packets arriving from the switch fabric, and selects packets from the queues and serves them fairly to the egress ports. Each LC includes memory that is used to store lookup tables that a packet processor accesses to determine what operations to perform on each packet, as well as the next hop destination for each packet. Each LC also includes a line card processor (LCP) which can be a general purpose processor that handles control plane operations for the line card. Control plane operations include programming lookup memory according to instructions from the RPM, programming registers on the packet processor that tailor the line card behavior, receiving control plane packets (packets addressed to switch 100, e.g., for various routing/switching protocols) from the packet processor, and transmitting control plane packets (packets generated by switch 100 for communication to a peer device) to the packet processor for forwarding out an external port. The LCP may implement some control plane functionality for some protocols handled by switch 100.
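By way of illustration only, the division of labor described above, in which the control plane programs lookup memory and the packet processor consults it for each packet, can be sketched in Python. The class name ForwardingTable, its method names, and the example prefixes, next hop addresses, and port numbers are hypothetical. Longest-prefix matching is assumed here as one common lookup rule, and an actual packet processor would typically use dedicated lookup memory rather than the linear scan shown.

```python
import ipaddress


class ForwardingTable:
    """Hypothetical stand-in for the lookup memory on a line card."""

    def __init__(self):
        # Each entry is (network, next_hop, egress_port).
        self._routes = []

    def program(self, prefix, next_hop, egress_port):
        """Control plane side: install an entry, as an LCP might do
        on instruction from the RPM."""
        network = ipaddress.ip_network(prefix)
        self._routes.append((network, next_hop, egress_port))

    def lookup(self, destination):
        """Data plane side: return (next_hop, egress_port) for the
        longest matching prefix, or None if nothing matches."""
        address = ipaddress.ip_address(destination)
        matches = [route for route in self._routes if address in route[0]]
        if not matches:
            return None
        best = max(matches, key=lambda route: route[0].prefixlen)
        return best[1:]


table = ForwardingTable()
table.program("10.0.0.0/8", "192.0.2.1", 1)
table.program("10.1.0.0/16", "192.0.2.2", 2)

# The /16 entry is more specific than the /8 entry, so it wins.
print(table.lookup("10.1.2.3"))
```

The sketch separates program() from lookup() to mirror the control plane and data plane roles described above. It does not model queuing or the switch fabric.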
The LCP in FIG. 1A also connects to the RPM over an inter-process communication (IPC) bus. The RPM uses the IPC bus to communicate with the LCP in order to boot the line card, monitor the health of the line card and its environmental parameters, manage power for the line card and its components, and perform basic hardware configuration for the line card. The switch fabric (SF) can comprise one or more modules, each of which is generally identical in a system. The switch fabric (SF) provides serdes interfaces for each line card and a parallel crossbar switch that can switch any of the inputs to any number of the outputs.
The route processor module (RPM) 110 shown in FIG. 1A controls all aspects of the overall operation of the chassis. FIG. 1B illustrates the functionality of the RPM 110 of FIG. 1A in more detail. The RPM in FIG. 1B can comprise three processors: a control processor CP, which controls the overall operation of the switch; and two route processors RP.0, RP.1, which run different routing/switching protocols, communicate with external peers, and program the line cards to perform correct routing and switching. In this case, the CP can be dedicated to running certain management functions such as user interface management, system chassis management, system configuration management, and management of system security, to name only a few functions. RP.0 can be dedicated to running layer 3 routing protocols such as the border gateway protocol (BGP), the open shortest path first (OSPF) protocol, and the routing information protocol (RIP), to name just a few, and RP.1 can be dedicated to running layer 2 switching protocols such as the Internet group management protocol (IGMP), the address resolution protocol (ARP), the spanning tree protocol (STP), and the virtual router redundancy protocol (VRRP), to name just a few. The routing protocols running on RP.0 generally send messages to and receive messages from the surrounding network devices in order to learn certain information about these devices and their relationship to the network. This information can include their IP addresses, distance information, link attributes, and group membership information, to name only a few. The switching protocols running on RP.1 generally gather information from the packets being processed by the host device, which in this case is the switch 100. This information can include the MAC address and the port ID of another network device.
The information received by the protocols running on RP.0 and RP.1 can be used to derive the shortest path from the host network device to another, neighboring network device, to calculate the distance between two network devices, and to calculate, for instance, a next hop address, spanning trees, and other information used to construct and maintain layer 2 switching tables and layer 3 routing tables. The switching table and routing table information is then made available to the line card processors, which use this information to update the forwarding tables that the packet processors use to process packets or frames of information arriving at the switch 100. The processes that are employed to build and maintain routing and switching tables on the RPMs and to build and maintain lookup tables on each of the LCs will not be described here, as these processes are well known to those skilled in packet network device design. Although the RPM 110 is described above as comprising three processors, CP, RP.0 and RP.1, all of the functionality included in the three processors can be included in one processor, or in two or more processors. The number of processors employed to implement this functionality is not important.
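Purely as an illustration of how a shortest path and a next hop can be derived from learned link information, the following Python sketch applies Dijkstra's algorithm to a small topology. The function name next_hops, the node names A through D, and the link costs are hypothetical, and the sketch is not intended to represent the processes actually used on the RPM, which are not described here.

```python
import heapq


def next_hops(graph, source):
    """Return {destination: (cost, first_hop)} for every node
    reachable from source, using Dijkstra's algorithm.

    graph maps each node to a dict of {neighbor: link_cost}.
    """
    best = {source: (0, None)}
    heap = [(0, source, None)]
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale heap entry; a cheaper path was already found
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            # The first hop is the neighbor itself when leaving the source,
            # otherwise it is inherited from the path taken so far.
            hop = neighbor if node == source else first_hop
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, hop)
                heapq.heappush(heap, (new_cost, neighbor, hop))
    del best[source]
    return best


# Hypothetical topology: link costs between four devices.
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1},
}

for destination, (cost, hop) in sorted(next_hops(topology, "A").items()):
    print(destination, cost, hop)
```

In this example every destination is reached from A through B, because the direct A-to-C link costs more than the two-link path through B.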
In order to provide a higher degree of availability than the switch 100 described with reference to FIG. 1A, some packet network devices are designed to include two route processor modules. FIG. 2 illustrates such a packet network device that includes two route processor modules, RPM.1 and RPM.2. Each of the RPMs in FIG. 2 can include all of the functionality of the RPM described with reference to FIG. 1A and generally operate to control all aspects of the overall operation of the chassis. When two RPMs are present, one is designated as the master and the other remains on standby (warm or cold); only the master operates to control the functionality of the switch. The standby RPM monitors the health of the master and takes over as master should the master fail. As described earlier with reference to FIG. 1B, each RPM comprises three processors: a control processor CP, which controls the overall operation of the switch; and two route processors RP.0, RP.1, which run different routing/switching protocols.
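The master/standby behavior described above can be sketched, under stated assumptions, as a simple health monitor. The class names Rpm and StandbyMonitor, the poll() method, and the max_missed threshold are hypothetical; the text does not specify how the standby RPM detects a failure, so consecutive missed health checks are assumed here for illustration.

```python
class Rpm:
    """Hypothetical model of a route processor module."""

    def __init__(self, name, role):
        self.name = name
        self.role = role      # "master", "standby", or "failed"
        self.healthy = True


class StandbyMonitor:
    """Promotes the standby after max_missed consecutive failed checks
    (an assumed detection rule, not taken from the text)."""

    def __init__(self, master, standby, max_missed=3):
        self.master = master
        self.standby = standby
        self.max_missed = max_missed
        self.missed = 0

    def poll(self):
        """Run one health-check interval and return the current master's name."""
        if self.master.healthy:
            self.missed = 0
            return self.master.name
        self.missed += 1
        if self.missed >= self.max_missed and self.standby is not None:
            # Failover: the standby becomes master; no standby remains.
            self.master.role = "failed"
            self.standby.role = "master"
            self.master, self.standby = self.standby, None
            self.missed = 0
        return self.master.name


rpm1 = Rpm("RPM.1", "master")
rpm2 = Rpm("RPM.2", "standby")
monitor = StandbyMonitor(rpm1, rpm2, max_missed=2)

rpm1.healthy = False
monitor.poll()            # first missed check, RPM.1 is still master
print(monitor.poll())     # second missed check triggers the failover
```

Requiring several consecutive misses before failing over is a design choice in this sketch that avoids promoting the standby on a single transient fault.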
As described above with reference to FIGS. 1A and 2, a single packet switch/router can only support a finite number of line cards and ports. In order to provide a switching platform with a larger number of ports, some vendors have designed special link cards or a “back-end” port that can be used to connect two separate switches together to form a system that in at least some ways acts with peer devices like a single larger chassis. Such an arrangement of stacked switches is described in the background section of U.S. patent publication no. 2009/0268748. FIG. 3 shows such a stacked switch arrangement that includes two switches, S.1 and S.2, connected together to operate as a single, logical device. Typically, stacked switches operate such that one switch is designated to be the master switch and operates to run all of the layer-2 switching and layer-3 routing protocols, and the other switch operates as a slave. The master device also operates to update the forwarding tables included in the slave devices connected in the stacked arrangement. However, in the event that the master switch/device fails, the entire stacked chassis can become inoperable. In order to mitigate this problem, some vendors have designated one switch or device to be a primary master device and the other, slave devices in the stack to be secondary or backup master devices. Using this arrangement, in the event that the primary master device in the stack fails, the secondary master device is able to take over all of the functionality performed by the primary master prior to its failure; however, all of the functionality of the failed master device is lost.
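The limitation noted above can be illustrated with a small Python sketch in which a secondary master takes over control of the stack but the ports of the failed device are no longer available. The names StackMember, fail_primary, and usable_ports, and the figure of 48 ports per switch, are hypothetical and are used only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass
class StackMember:
    """Hypothetical model of one switch in a stacked arrangement."""
    name: str
    ports: int
    role: str  # "primary", "secondary", or "failed"


def fail_primary(stack):
    """Mark the primary master as failed and promote the first
    secondary master found, if any."""
    for member in stack:
        if member.role == "primary":
            member.role = "failed"
            break
    for member in stack:
        if member.role == "secondary":
            member.role = "primary"
            break
    return stack


def usable_ports(stack):
    """Count the ports on members that have not failed."""
    return sum(member.ports for member in stack if member.role != "failed")


# Assumed port counts; the text does not give any.
stack = [
    StackMember("S.1", 48, "primary"),
    StackMember("S.2", 48, "secondary"),
]

print(usable_ports(stack))   # ports available before the failure
fail_primary(stack)
print(usable_ports(stack))   # ports available after S.2 takes over
```

In this sketch the stack keeps a functioning master after the failure, but only the ports of the surviving switch remain usable.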