A communication router typically comprises two parts: a modular electronic circuit such as a line card(s) and an interconnecting architecture such as a switch fabric, where the switch fabric provides the interconnecting function for a plurality of line cards. Further, to facilitate data communication across a router, a variable-length data packet(s) is often divided into fixed-length cells in a line card prior to forwarding to a switch, and ultimately on to a device associated with the switch, e.g., a memory device or other component.
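By way of non-limiting illustration, the division of a variable-length packet into fixed-length cells can be sketched as follows. The function name segment_packet, the 64-byte cell size, and the zero padding of the final cell are assumptions made for illustration only; a practical line card would typically also attach a cell header (e.g., length and sequence fields), which is omitted here.

```python
def segment_packet(packet: bytes, cell_size: int = 64) -> list[bytes]:
    """Split a variable-length packet into fixed-length cells.

    The last cell is zero-padded to cell_size (an illustrative choice).
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    cells = []
    for start in range(0, len(packet), cell_size):
        cell = packet[start:start + cell_size]
        # Pad a short final cell so every cell has the same length.
        cells.append(cell.ljust(cell_size, b"\x00"))
    return cells


if __name__ == "__main__":
    example_cells = segment_packet(b"\x01" * 130, cell_size=64)
    print(len(example_cells), [len(c) for c in example_cells])
```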
FIGS. 13 and 14 illustrate conventional single-switch architectures, with FIG. 13 illustrating system 1300 comprising a shared memory switch and FIG. 14 illustrating system 1400 comprising a crossbar-based switch. As illustrated, a plurality of line cards (e.g., FIG. 13, line cards 1310A-Y, and FIG. 14, line cards 1410A-Y, where A-Y are positive integers) are associated with a respective port processor on each line card (e.g., FIG. 13, port processors 1330A-Y, and FIG. 14, port processors 1430A-Y).
As illustrated in FIG. 13, in a shared memory switch, the port processor 1330 on line card 1310 writes data packet(s) 1355 into, and reads data packet(s) 1355 from, memory 1350, where memory 1350 is shared by all the port processors 1330 via lines 1340 (e.g., a serial link) comprising the switch architecture 1300. A problem with a switch architecture of this nature is that memory 1350 must operate at a speed M times the link rate to satisfy the demand(s) placed on memory 1350 by any or all of line cards 1310 and/or port processors 1330, where M is the number of input (output) ports of the switch. As the required M increases (e.g., more line cards 1310 added), constructing memory 1350 becomes costly. Further, the power consumption of moving data packets in and out of a shared memory switch is also high.
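The scaling of the memory speed with the port count M can be illustrated with a short calculation. The sketch below simply applies the M-times-link-rate relationship stated above; the function name and the 10-Gbps link rate are hypothetical values chosen for illustration.

```python
def shared_memory_rate_gbps(num_ports: int, link_rate_gbps: float) -> float:
    """Memory speed needed by a shared memory switch: M times the link rate."""
    if num_ports < 1:
        raise ValueError("num_ports must be at least 1")
    return num_ports * link_rate_gbps


if __name__ == "__main__":
    link_rate = 10.0  # hypothetical link rate in Gbps
    for m in (4, 16, 64):
        rate = shared_memory_rate_gbps(m, link_rate)
        print(f"M={m}: memory must run at {rate} Gbps")
```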
A crossbar-based switch system 1400, as illustrated in FIG. 14, comprises a plurality of line cards 1410A-Y and respective port processors 1430A-Y connected to a crossbar switch 1450 and a scheduler 1460, with multiple m×m (m inputs and m outputs) data lines 1470 (heavy line) running in parallel, and control lines 1480 (narrow line) running in parallel. Data packets 1455 are transmitted between port processors 1430 and crossbar switch 1450 via data lines 1470 (paths 1470 are typically called the ‘data path’), and control packets are exchanged between port processors 1430 and scheduler 1460 (paths 1480 are typically called the ‘control path’), where request(s) 1490 received at, and grant(s) 1495 generated by, scheduler 1460 are utilized to control transmission of respective data packets 1455 across any of data paths 1470. Typically, the bandwidth requirement for a data path 1470 will be much greater than for a control path 1480. With a crossbar-based switch, when a data packet arrives at a line card 1410 for transport by crossbar switch 1450, a request token 1490 is sent to scheduler 1460 by a port processor 1430. A request 1490 by a particular output port processor (e.g., any of 1430) is recorded in a counter inside scheduler 1460. Once scheduling is determined, scheduler 1460 returns a grant token 1495 to the requesting port processor(s) via the control path. Upon receipt of a grant token 1495, a port processor 1430 transmits a data packet(s) 1455 corresponding to a destination in crossbar switch 1450 defined in the received grant token 1495. In general, with a crossbar switch system, data packet(s) 1455 will be moved in and out of the port processor 1430 at a speed comparable to the link rate supportable on data lines 1470. In contrast, a shared memory switch (as illustrated in FIG. 13) has to move data in and out of port processor 1330 at a speed M times the link rate. Typically, crossbar switches 1450 do not buffer data packets and comprise minimal logic. Hence, a crossbar switch consumes significantly less power than a shared memory switch.
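The request/grant exchange over the control path can be sketched, in simplified form, as follows. The class name Scheduler, the per-(input, output) counter layout, and the greedy first-fit matching rule are all assumptions introduced for illustration; the foregoing description does not specify how scheduler 1460 selects which requests to grant.

```python
class Scheduler:
    """Toy request/grant scheduler for an N-port crossbar (illustrative only)."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports
        # pending[src][dst] counts request tokens not yet granted.
        self.pending = [[0] * num_ports for _ in range(num_ports)]

    def request(self, src: int, dst: int) -> None:
        """Record a request token from input port src for output port dst."""
        self.pending[src][dst] += 1

    def schedule(self) -> list[tuple[int, int]]:
        """Return grants for one time slot.

        Greedy first-fit (an assumed policy): each input and each output
        appears in at most one grant, so the crossbar is never oversubscribed.
        """
        grants = []
        taken_outputs = set()
        for src in range(self.num_ports):
            for dst in range(self.num_ports):
                if self.pending[src][dst] > 0 and dst not in taken_outputs:
                    self.pending[src][dst] -= 1
                    taken_outputs.add(dst)
                    grants.append((src, dst))
                    break  # at most one grant per input per slot
        return grants


if __name__ == "__main__":
    sched = Scheduler(3)
    sched.request(0, 1)
    sched.request(1, 1)  # contends with input 0 for output 1
    sched.request(2, 0)
    print(sched.schedule())
    print(sched.schedule())
```

In this sketch, a port processor that receives a grant (src, dst) would then send one cell toward output dst over the data path, while a request that loses contention stays in its counter for a later slot.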
However, single-stage switches have a scalability problem. Every line card in a single-stage switch architecture requires at least one high-speed link terminating on a shared memory chip or a crossbar chip (or the scheduler chip), but the number of high-speed serial links is limited by available technology.
In response to the scalability problem, 3-stage switches have been proposed as a possible solution. FIG. 3 illustrates a switch using a three-stage Benes-Clos topology. Adoption of a Benes topology in 3-stage switches enables the construction of an m²×m² (m² input ports and m² output ports) switch out of single-stage switch modules of size m×m. Conventionally, crossbar-based architectures have limited, if any, application in 3-stage switches because there is no simple way to design a scheduler able to control traffic over the respective crossbar switches. Commercial 3-stage products, such as routers provided by JUNIPER and CISCO, are all based on a shared memory architecture (e.g., a buffered network). However, a buffered approach can lead to out-of-sequence transmissions over a 3-stage switching fabric because there are m (m=N^(1/2), where N=m² is the number of input (output) ports) paths in the switch and data packets are randomly routed through these paths. Attempting to re-sequence packets at 40 or even 100 Gbps can be a substantial task. To compensate for the random routing of data packets, a large amount of memory is required for data packet re-sequencing. Furthermore, a buffered architecture also suffers from high power consumption and is not compatible with optical switching technologies, which are un-buffered in nature.
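The dimensions of such a three-stage fabric follow directly from the module size m. The sketch below assumes the usual symmetric arrangement of m modules of size m×m in each of the three stages, with every first-stage module linked to every middle-stage module; the function name and the returned field names are hypothetical.

```python
def three_stage_dimensions(m: int) -> dict[str, int]:
    """Sizes of a symmetric three-stage fabric built from m x m modules."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return {
        "ports": m * m,          # m first-stage modules, m inputs each
        "modules_per_stage": m,
        "modules": 3 * m,        # three stages of m modules
        "paths_per_pair": m,     # one path through each middle-stage module
    }


if __name__ == "__main__":
    for m in (4, 16, 32):
        print(m, three_stage_dimensions(m))
```

Because each input/output pair has m alternative paths, cells of the same flow that are routed over different paths in a buffered fabric can experience different queueing delays, which is the source of the out-of-sequence arrivals noted above.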
Some multiple-stage crossbar switches have been proposed to address the foregoing issues. In one instance, an optical banyan network has been proposed as a packet switch for local area networks (LANs). Since a banyan network is non-blocking for a round-robin (RR) connection pattern, a time division multiplexing (TDM) banyan network can be utilized, where each input is connected to all outputs in a round-robin manner. While a scheduler component is not required for such a TDM banyan network, a problem with this approach is that a TDM crossbar has poor performance unless traffic is uniformly distributed among the outputs, which is generally not the case in a packet network. Further, a cascade approach comprising two TDM crossbars, with virtual output queue (VOQ) buffers inserted therebetween, has been proposed in the load-balanced switch. The first TDM crossbar evenly distributes packets to its output ports and creates a uniform traffic pattern for the second TDM crossbar. The cascade approach thereby addresses the problem that the uniform-traffic assumption is generally invalid. However, the cascade approach creates out-of-sequence transmissions in a similar manner to that of a buffered multi-stage switch. Hence, packet re-sequencing at a speed of 100 Gbps may be as challenging as designing the scheduler for a large switch.
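A round-robin TDM connection pattern of the kind described above can be sketched as follows. The specific rule used here, connecting input i to output (i + t) mod N in time slot t, is one common way of realizing a round-robin pattern and is an assumption for illustration; the function names are hypothetical.

```python
def tdm_output(input_port: int, slot: int, num_ports: int) -> int:
    """Output connected to input_port in the given time slot (assumed rule)."""
    return (input_port + slot) % num_ports


def tdm_frame(num_ports: int) -> list[list[int]]:
    """Connection pattern for one frame: frame[t][i] is the output of input i."""
    return [
        [tdm_output(i, t, num_ports) for i in range(num_ports)]
        for t in range(num_ports)
    ]


if __name__ == "__main__":
    for t, pattern in enumerate(tdm_frame(4)):
        print(f"slot {t}: {pattern}")
```

In every slot the pattern is a permutation, so no two inputs contend for the same output and no scheduler is needed; over one frame of N slots each input reaches each output exactly once, which is why throughput suffers when traffic is not uniformly spread over the outputs.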