The present invention relates to communications networks and, more particularly, to the switching nodes of those networks implemented from very high-speed fixed-size packet switch fabrics.
In recent years, the explosive demand for bandwidth over communications networks has driven the development of very high-speed switching fabric devices, some resulting in commercial offerings. The practical implementation of network switching nodes, capable of handling aggregate data traffic in the range of hundreds of gigabits per second, and soon terabits per second, is thus becoming feasible. While many different approaches are theoretically possible to carry out switching at network nodes, today's standard solution is to employ fixed-size packet (also referred to as cell) switching devices, irrespective of the higher communications protocols actually in use to link the end-users. They are simpler and more easily tuned for performance than other solutions, especially those handling variable-length packets. Thus, N×N switches, which can be viewed as black boxes with N inputs and N outputs, are made capable of moving fixed-size packets from any incoming link to any outgoing link.
An incoming link is connected to a switch fabric, indirectly, through an input port. In practice, there is a port adapter between the physical incoming link, e.g., a fiber optical connection, and the actual switch fabric input port, in order to adapt the generally complex physical protocol (and sometimes higher communications protocols as well) in use between switching nodes, to the particular switch fabric input port. Conversely, the interface between the switch fabric and the outgoing link is referred to as the output port and there is also an output adapter.
Irrespective of how the switching fabric core is actually devised and implemented, this approach is characterized in that the switching fabric itself does not interface directly to any link external to the switching node. Therefore the interface between adapters and switch fabric, along with the corresponding part of the adapter, becomes an integral part of the switching node and a key parameter to consider for its architecture. In particular, the connections between the adapters and the switch fabric are an area that requires careful design. Although, in general, it is preferable to use parallel connections as much as possible to keep cost down (since this allows the use of slower or current, i.e., inexpensive, chip technologies, e.g., CMOS versus GaAs, for the same throughput), there are a number of rapidly limiting factors in this direction.
Building a very fast switch produces a large number of I/O connections since there is a multiplying factor i.e., the number of ports. A switching fabric is commonly a 16×16 or 32×32 switch which therefore has 16 or 32 fully bi-directional ports. In addition, parallel connections create a large number of wires to be handled, both on the backplane and for attaching to the switch fabric, forcing the use of expensive module and packaging solutions. Hence, to push switch performance, the other alternative is to increase speed within the limit of the chip technology in use. However, as both basic clock speed and the number of wires in each parallel connection increase, one soon starts to get problems with skew. That is, the signal on some paths arrives at a different time from the parallel signal on a different path.
Skew is a very serious limitation on the effective use of parallel connections and its control is a key design issue. Exacerbating the problem, the drivers located at the periphery of the chip modules have to be made slower than those of the interior of the switch fabric because they have to drive higher-value parasitic capacitors. This requires switching more current through the parasitic inductance of the packaging and creates a problem known as simultaneous switching (ground is disturbed while drivers are toggling in synch), another drastic limitation to the use of many signal I/Os.
As a result of the above considerations, the number of wires allowed in each port, and the number of ports itself, of commercially available switch fabrics are a careful tradeoff between the performance and limitations of the various components involved, i.e., chip technology, chip packaging (module) technology and board technology, along with their respective costs, in an attempt to reach the overall best cost/performance ratio for a switching node. As a consequence, a state-of-the-art switch is a device having a maximum of a few tens of ports, e.g., 16 or 32, each having a few data I/Os per port, e.g., 4 or 8 for input and the same for output (so that practical solutions exist to control the skew). Some also implement a so-called 2-way data link bundling (two cells are moved IN and OUT simultaneously). And, since each port is toggled at the maximum frequency allowed by the current chip and packaging technologies, this allows one to match the speed of an OC-192 line, i.e., level 192 of the synchronous optical network (SONET) US hierarchy, at 10 gigabits/s (equivalent to the European 64th level of the Synchronous Digital Hierarchy or SDH, called STM-64), over each in and out port, yielding a 128 gigabits/s aggregate throughput switch.
On the other hand, another very important item that shapes the design of switch fabric devices is flow control. A very simple illustration of the need for a flow control mechanism in a switch is to observe that when more than one data packet attempts to access an output port simultaneously (all input ports may want to access the same output port at any given instant), then a conflict occurs. When this happens, only one of the contending packets can be read out. Other data packets either have to be stored in a buffer or queue, until they can actually be read out, or must be dropped. Although various buffering types are encountered, many of the recent switches have adopted output-queuing, that is, when a packet is arriving and handled in a switch, it is immediately placed in a queue that is dedicated to its outgoing port, where it waits until departing from the switch.
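The output contention described above can be illustrated with a minimal sketch. It is not taken from any particular switch fabric: the function name `switch_one_cell_time`, the per-output `capacity` limit and the tuple representation of a cell are assumptions made only for this example, and the order in which arrivals are served within one cell time is arbitrary.

```python
from collections import deque

def switch_one_cell_time(arrivals, queues, capacity):
    """Handle one cell time of a hypothetical output-queued switch.

    arrivals: list of (input_port, output_port) cells arriving together.
    queues:   one deque per output port (modified in place).
    capacity: assumed maximum number of cells held per output queue.
    Returns (departed, dropped).
    """
    dropped = []
    for cell in arrivals:
        _, out = cell
        if len(queues[out]) < capacity:
            queues[out].append(cell)   # wait in the queue of the outgoing port
        else:
            dropped.append(cell)       # no room left: the cell is lost
    # Each output link can read out at most one cell per cell time.
    departed = [q.popleft() for q in queues if q]
    return departed, dropped

# All four inputs of a 4x4 switch address output 2 in the same cell time.
queues = [deque() for _ in range(4)]
departed, dropped = switch_one_cell_time(
    [(i, 2) for i in range(4)], queues, capacity=2
)
```

In this run only one of the four contending cells is read out, one remains queued for the next cell time, and the other two are dropped because the assumed queue capacity is exhausted.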
This approach will maximize the switch throughput provided that no input or output is oversubscribed. In this case, the switch is able to support the traffic and the queue occupancies remain bounded. In practice however, output-buffered switches are not free of complications. In particular, an N×N switch requires that the internal bandwidth be N times the input bandwidth. In addition, the internal memory space available in the switch fabric is limited by what the chip technology can reasonably permit (die size, which is by far the primary contributor to the cost of a chip, limits the amount of internal memory that can be implemented). Under unfavorable traffic conditions, e.g., with a high degree of congestion, the limited on-chip memory has traditionally led to poor throughput, especially when FIFO (First In First Out) queues are used at the input side of the switch fabric, i.e., in the input adapter, to store cells that temporarily could not be accepted by the switch fabric. This is bound to create a memory full status. Because simply deploying more on-chip memory to solve the problem is not economically feasible (even though memory cost has dropped dramatically over the years), end-to-end traffic management of the switch fabric has thus become an essential aspect of a switch design to ensure that no packets are lost due to congestion and high utilization, while guaranteeing fairness regardless of the traffic patterns received through the input ports.
To this end, replacing the FIFO queues by VOQs (Virtual Output Queues) in the input adapter has contributed to eliminating the well-known HOL (head-of-line) input blocking problems encountered in switches that also use input-queuing, because VOQ provides that any packet in a queue, irrespective of its order of arrival, can be processed provided that the individual port output buffer, to which the packet is destined, is not full. However, the VOQ mechanism can only work if it has knowledge of the status of the output buffers, i.e., it must know which ones are full and which ones can still receive cells. This has necessitated the implementation, in the output adapter, of an output queue grant-based flow-control mechanism. This mechanism is aimed at passing a grant vector of N bits, one per output, over which the classes of priority handled by the switch can be time-multiplexed. This is accomplished at the expense of having to add more signal I/Os to the switch fabric.
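The difference between a single FIFO and VOQs in the presence of a grant vector can be sketched as follows. This is only an illustration under stated assumptions: the grant vector is modeled as an integer whose bit i is set when output i can still receive cells, priority classes are ignored, and the names `is_granted`, `fifo_select` and `voq_select` are hypothetical. The VOQ scan simply starts from the lowest output index, which a real scheduler would likely replace with a fairer policy.

```python
from collections import deque

def is_granted(grant, out):
    """True if bit `out` of the N-bit grant vector is set (output not full)."""
    return (grant >> out) & 1 == 1

def fifo_select(fifo, grant):
    """Single input FIFO: only the head cell may be sent.

    If the output of the head cell is not granted, every cell behind it
    waits as well (head-of-line blocking).
    """
    if fifo and is_granted(grant, fifo[0][1]):
        return fifo.popleft()
    return None

def voq_select(voqs, grant):
    """One queue per output: send from the first non-empty granted queue."""
    for out, q in enumerate(voqs):
        if q and is_granted(grant, out):
            return q.popleft()
    return None

N = 4
grant = 0b1101                       # outputs 0, 2, 3 granted; output 1 is full
cells = [(0, 1), (1, 3), (2, 0)]     # (arrival order, destination output)

fifo = deque(cells)
voqs = [deque() for _ in range(N)]
for cell in cells:
    voqs[cell[1]].append(cell)

blocked = fifo_select(fifo, grant)   # head is destined to the full output 1
sent = voq_select(voqs, grant)       # a later cell for output 0 can proceed
```

With the FIFO, the cell destined to the full output 1 holds back the two cells behind it, although their outputs are granted. With VOQs, those cells can be sent regardless of their order of arrival.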
Much more on switching and switches can be found in the abundant literature that exists on the subject of switch architecture, their design and limitations and packet switching networks in general. For example, a good review of switches can be found in chapter 5 of “Asynchronous Transfer Mode Networks Performance Issues” by Raif O. Onvural, Artech House, 1995 and also in a publication by the International Technical Support Organization of IBM, Research Triangle Park, N.C. 27709, under the title “Asynchronous Transfer Mode (ATM) Technical Overview”, no. SG24-4625, October 1995.
Therefore, commercially available fixed-size packet switch fabrics are carefully crafted to best take advantage of all the capacities of current chip technologies, especially their intrinsic internal speed, while successfully avoiding the limitations imposed by the packaging, characterized by a scarcity of I/O resources and a drastic limitation in the number of inter-connections that would otherwise be necessary. The result is hardware having a maximum of a few tens of ports (e.g., 16 or 32), running at very high speed (e.g., OC-192 at 10 gigabits/second) and capable of handling the full traffic of all ports without any loss thanks to a sophisticated flow control put in place to manage the congestion.
In practice however, it remains very difficult to take advantage of the full performance of every port. Not all applications require all ports to be of that speed. On the contrary, many applications of switch fabrics, even though they are attempting to utilize the full throughput capacity of the switch, require that a much larger number of lower-speed ports be accommodated in a switching box instead. Switch fabrics are expensive hardware. When building boxes, it is desirable to combine a number of lower-speed lines in the switch fabric port adapters to reduce costs.
For example, a port adapter, instead of being connected to a single OC-192 line, may have to be connected to four (independent) OC-48 lines each at 2.4 gigabits per second, or to sixteen OC-12 lines at 622 megabits per second, so as to implement a switching node comprised of a much larger number of ports, hereafter denominated sub-ports (since they are derived from a native switch fabric port). For example, a 256×256 switch box concentrating OC-12 lines, or any other combination, may be implemented from a 16×16 switch fabric. Unfortunately, switch fabric ports do not scale down well because the sophisticated flow control mechanism, put in place (in an I/O constrained environment) to accommodate a single high-speed line, is unable to work well if many independent lower-speed lines are connected to a port instead. To illustrate this, a port adapter handling, e.g., four OC-48 lines, has no means to report a congestion occurring on one particular path while the others are not congested. The only solution is to report a global congestion for that port even though 3 lines out of 4 in this case could continue to receive traffic. This triggers a gross under-utilization of the capacity of the port and defeats the objective of trying to take advantage of the full switch capacity.
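The arithmetic of the sub-port example, and the cost of a single global congestion indication per port, can be sketched as follows. The function names are hypothetical, all sub-ports of a port are assumed to carry lines of equal rate, and the per-sub-port variant is shown only as a point of comparison, not as a description of any particular mechanism.

```python
def subport_count(fabric_ports, lines_per_port):
    """Number of sub-ports obtained when each native port concentrates
    `lines_per_port` lower-speed lines."""
    return fabric_ports * lines_per_port

def usable_fraction_global_report(congested):
    """Fraction of the port capacity still usable when the adapter can only
    report one congestion status for the whole port."""
    return 0.0 if any(congested) else 1.0

def usable_fraction_per_subport_report(congested):
    """Fraction still usable if each sub-port could be reported separately
    (equal-rate sub-ports assumed)."""
    return congested.count(False) / len(congested)

# A 16x16 fabric with sixteen OC-12 lines per port gives a 256x256 box.
box_size = subport_count(16, 16)

# Four OC-48 lines on one port, only the first one congested.
congested = [True, False, False, False]
global_fraction = usable_fraction_global_report(congested)
per_subport_fraction = usable_fraction_per_subport_report(congested)
```

Under these assumptions the global report leaves none of the port capacity usable, whereas three quarters of it, i.e., the 3 lines out of 4 that are not congested, could still have received traffic.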
Therefore, it is a purpose of the invention to remedy the shortcomings of the prior art, as noted above, while fully taking advantage of the intrinsic performance of an N-port switch fabric used to build an M-port switching function concentrating, through port and sub-port adapters, the traffic of more than N lines.
It is another purpose of the invention to take into account the individual traffic of all sub-ports, indirectly connected to the switch fabric ports, thus enabling an overall flow control of a switching function irrespective of the physical organization of the core switch fabric in use.
It will be apparent to those skilled in the art having regard to this invention that other modifications of this invention beyond those specifically described here may be made without departing from the spirit of the invention. Accordingly, such modifications are considered within the scope of the invention as limited solely by the appended claims.