As known in the art, a “stackable switch” is a network switch that can operate independently as a standalone device or in concert with one or more other stackable switches in a “stack” or “stacking system.” FIG. 1A depicts the front face of an exemplary stackable switch 100. As shown, the front face includes a set of data ports 102 (denoted by the letter “D”), a set of stacking ports 104 (denoted by the letter “S”), and an out-of-band management port 106 (denoted by the letter “M”). Data ports 102 are operable for connecting stackable switch 100 to one or more hosts and/or networks. Stacking ports 104 are operable for connecting stackable switch 100 to other stackable switches in the same stacking system for the purpose of forming a larger, single logical switch comprising physical stackable switch 100 and the other physical switches. Stacking ports 104 can be dedicated ports (i.e., ports designed specifically for stacking) or high bandwidth data uplink ports that operate in a stacking mode. Out-of-band management port 106 is operable for connecting stackable switch 100 to a separate terminal device, such as a laptop or desktop computer. Once connected, an administrator can use the terminal device to access the management console of stackable switch 100 and perform various switch management functions.
FIG. 1B depicts certain internal components of stackable switch 100 of FIG. 1A. These internal components include a CPU complex 152 and a management function 150 for managing the operation of stackable switch 100. CPU complex 152 can include a general purpose processor, such as a PowerPC, Intel, AMD, or ARM-based CPU, that operates under the control of software stored in an associated memory (not shown). CPU complex 152 can also include other control and logic components, such as I/O interfaces (e.g., Ethernet), temperature sensors, a real-time clock, glue logic, memory, and so on. Management function 150, which can correspond to a subset of the components in CPU complex 152 configured to perform out-of-band management, is communicatively coupled with out-of-band management port 106. The internal components of stackable switch 100 further include a packet processing complex 155, which provides both a stacking function 154 and a data port function 156. Stacking function 154 (in conjunction with switch application software 153 running on CPU complex 152) provides the stacking functionality of stackable switch 100, while data port function 156 (in conjunction with switch application software 153) enables switch 100 to send and receive data traffic via data ports 102 and stacking ports 104. For example, data port function 156 and stacking function 154 can make wire-speed decisions on how to handle data packets flowing into or out of ports 102 and 104.
Generally speaking, the physical form factor of stackable switches such as switch 100 of FIGS. 1A and 1B is fixed—in other words, each stackable switch cannot be individually upgraded with, e.g., additional data port functions, additional management functions, or the like in order to increase the switch's capacity or capabilities. However, as mentioned above, such switches can be interconnected externally (via, e.g., cables or optical transceivers) to create a stacking system. To illustrate this, FIG. 2 depicts an exemplary stacking system 200 comprising stackable switches 100(1)-100(N), each of which is substantially similar to stackable switch 100 of FIGS. 1A and 1B. As shown, stackable switches 100(1)-100(N) are linked together via their respective stacking ports 104(1)-104(N), thereby establishing a data path 202 between the switches for communicating data traffic. With this configuration, stackable switches 100(1)-100(N) can act in concert as a single, logical switch having the combined data port capacity of the individual switches.
In the example of FIG. 2, stackable switch 100(2) is designated as the “master” switch of stacking system 200, which means that switch 100(2) serves as the point of decision making for the entirety of stacking system 200. For instance, master switch 100(2) can accept and process user commands directed to the overall configuration of stacking system 200. Master switch 100(2) can also communicate with non-master switches via the stacking ports in order to propagate various types of management commands and data to those switches.
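As a rough illustration of the stacking arrangement of FIG. 2, the following Python sketch models a stack as a list of member switches, one of which is designated the master. The class names, the 48-port count per unit, and the `apply_config` method are hypothetical and chosen only for illustration; they do not describe any actual switch software.

```python
from dataclasses import dataclass, field


@dataclass
class StackableSwitch:
    unit_id: int
    data_ports: int                      # fixed per unit; cannot be upgraded
    config: dict = field(default_factory=dict)


@dataclass
class StackingSystem:
    members: list                        # switches linked via stacking ports
    master_id: int                       # unit that makes stack-wide decisions

    def total_data_ports(self) -> int:
        # The logical switch exposes the combined capacity of all members.
        return sum(sw.data_ports for sw in self.members)

    def apply_config(self, key: str, value: str) -> None:
        # The master accepts the command and propagates it to every member.
        # In a real stack this would travel over the stacking ports.
        for sw in self.members:
            sw.config[key] = value


# Three hypothetical 48-port units, with unit 2 acting as master as in FIG. 2.
stack = StackingSystem(
    members=[StackableSwitch(unit_id=i, data_ports=48) for i in (1, 2, 3)],
    master_id=2,
)
stack.apply_config("vlan10", "enabled")
print(stack.total_data_ports())
```

The sketch treats propagation as a simple loop; it says nothing about how commands are actually carried between units.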
In contrast to stacking system 200 of FIG. 2, FIG. 3 depicts an exemplary modular chassis system 300 (referred to herein as a “chassis system”). Chassis system 300 includes at least one management module (comprising a management processor) and at least one linecard module interconnected via a fabric module. Generally speaking, the stacking function of a stackable switch is similar to the fabric module of a chassis system, and the data port function of a stackable switch is similar to the linecard data port function of a chassis system. However, in chassis system 300, each of these components is modular and can be added to, or removed from, chassis system 300 as needed in order to accommodate customer requirements. For instance, in the specific embodiment of FIG. 3, chassis system 300 includes two management modules 302(1) and 302(2) (for, e.g., redundancy) and three linecard modules 306(1), 306(2), and 306(3) (for, e.g., increased port capacity). Other configurations comprising more or fewer modules are possible, constrained only by the number of available module slots in chassis system 300. Thus, chassis system 300 can be considered an “internally expandable” switch system (via the addition or removal of internal management/fabric/linecard modules) while stacking system 200 of FIG. 2 can be considered an “externally expandable” switch system (via the addition or removal of external stackable switches).
One significant advantage that stacking systems have over modular chassis systems is cost; for instance, to achieve a particular data port capacity, it is usually cheaper to purchase and deploy a stacking system rather than a chassis system. However, the cost savings provided by stacking systems come at the expense of less robust redundancy/high availability (HA) when compared to chassis systems. To understand this, note that in chassis system 300 of FIG. 3, each management module 302(1) and 302(2) has a direct connection to the other modules in system 300 via fabric module 304. Thus, the management processor of each management module knows the status of each linecard module, as well as the other management module, at all times. If one of the linecard modules fails, the management processor of the active management module can immediately isolate the faulty linecard module and re-route data traffic to another, available linecard module. Similarly, if one of the management modules fails, the other management module can take over active management duties to avoid traffic disruption.
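The failover behavior described for chassis system 300 can be sketched in Python as follows. This is a minimal model under stated assumptions: the module identifiers follow FIG. 3, but the health flags, the method names, and the rule of promoting the first healthy standby are hypothetical and are not drawn from any actual chassis implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ChassisSystem:
    # Module IDs follow FIG. 3; the health flags are hypothetical.
    mgmt_healthy: dict = field(
        default_factory=lambda: {"302(1)": True, "302(2)": True}
    )
    linecard_healthy: dict = field(
        default_factory=lambda: {"306(1)": True, "306(2)": True, "306(3)": True}
    )
    active_mgmt: str = "302(1)"

    def linecard_failed(self, card: str) -> list:
        # The active management module sees the failure directly via the
        # fabric, isolates the card, and returns the cards still available
        # to carry re-routed traffic.
        self.linecard_healthy[card] = False
        return [c for c, ok in self.linecard_healthy.items() if ok]

    def mgmt_failed(self, module: str) -> str:
        # If the active module fails, a healthy standby takes over at once.
        self.mgmt_healthy[module] = False
        if module == self.active_mgmt:
            standbys = [m for m, ok in self.mgmt_healthy.items() if ok]
            if not standbys:
                raise RuntimeError("no healthy management module remains")
            self.active_mgmt = standbys[0]
        return self.active_mgmt


chassis = ChassisSystem()
available = chassis.linecard_failed("306(2)")   # remaining linecards
new_active = chassis.mgmt_failed("302(1)")      # standby becomes active
```

In this model no waiting is involved, because every module's status is visible to the management modules at all times.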
On the other hand, in stacking system 200 of FIG. 2, the various CPU complexes of the system are not directly interconnected; instead, these CPU complexes can only communicate with the CPU complexes of their immediately adjacent stackable switches using data path 202, which is created via stacking ports 104(1)-104(N) interconnecting stackable switches 100(1)-100(N). Accordingly, if one of the stackable switches in system 200 fails, the management functions/CPU complexes of the other switches generally need to wait for a timeout on data path 202 before they can know that a failure has occurred. Further, if master switch 100(2) fails, a new master must be elected to re-form the stack. Both of these scenarios significantly increase the time needed to recover from a failure, and in some cases cause traffic disruption, which means that stacking system 200 cannot provide robust HA (i.e., immediate failover with little or no downtime) for mission-critical deployments.
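The detection delay described above can be illustrated with a short Python sketch of timeout-based failure detection. The `NeighborMonitor` class, the 3-second timeout, and the "lowest surviving unit ID" election rule are all assumptions made for illustration; the text does not specify a timeout value or an election policy.

```python
from dataclasses import dataclass, field


@dataclass
class NeighborMonitor:
    """Tracks when each adjacent stack unit was last heard on the data path."""

    timeout_s: float                      # hypothetical keepalive timeout
    last_heard: dict = field(default_factory=dict)

    def record_keepalive(self, unit_id: int, now: float) -> None:
        self.last_heard[unit_id] = now

    def failed_units(self, now: float) -> list:
        # A neighbor is declared failed only after the timeout has elapsed,
        # so detection always lags the actual failure.
        return sorted(
            u for u, t in self.last_heard.items() if now - t > self.timeout_s
        )


def elect_new_master(member_ids, failed):
    # Hypothetical policy: the lowest surviving unit ID becomes master.
    survivors = [u for u in member_ids if u not in failed]
    return min(survivors) if survivors else None


monitor = NeighborMonitor(timeout_s=3.0)
monitor.record_keepalive(1, now=0.0)
monitor.record_keepalive(3, now=0.0)
monitor.record_keepalive(3, now=2.5)    # unit 3 stays alive; unit 1 goes silent
early = monitor.failed_units(now=2.9)   # timeout not yet reached for unit 1
late = monitor.failed_units(now=3.5)    # timeout has elapsed for unit 1
new_master = elect_new_master([1, 2, 3], failed={2})
```

Timestamps are passed in explicitly so the example is deterministic. Between the moment unit 1 goes silent and the moment the timeout expires, the monitoring switch has no way of knowing that a failure has occurred.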