Network routers and multi-layer switches generally include a plurality of ports operatively coupled to communications links over which the switch or router exchanges packets with other network nodes. A router or switch is adapted to transmit an inbound packet, received on a port, toward a destination reachable through any of the other ports. Inside the router or switch, the packets destined for an outbound port can be classified into a plurality of traffic flows in order to apply the appropriate processing and quality of service, for example. A priority and/or weighted queue structure and output scheduler are typically employed to coordinate the processing and transmission of these flows in a manner that fairly allocates output bandwidth to the competing flows.
An example hierarchical queue structure 100 illustrated in FIG. 1 includes a plurality of packet buffers or queues organized into three levels of hierarchical data packet flows. At the first level of queues 101-103, each individual queue 104 is associated with a single flow 110 identified by a classification mechanism in the router or switch. The packets of the flows 110 are enqueued by an enqueue function, temporarily buffered in a queue 104, dequeued by a dequeue function associated with the queue 104, processed in accordance with one or more processing operations 112, and then mapped 114-116 into a second level of queues. Each of the flows 120 at the second level of queues is buffered in an individual queue 105-107 while it awaits the processing associated with that level. After being dequeued, the flows at the second level may undergo one or more processing operations 113 before being mapped 118 into a single flow at the third level of queues, where the packets are buffered in the “final” queue 108 immediately prior to transmission from the egress port.
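By way of illustration only, the three-level arrangement described above may be sketched as follows. The class name, the method names, and the particular mapping of first-level queues onto second-level queues are assumptions made for the sketch and do not appear in FIG. 1; the processing operations 112, 113 are omitted for brevity.

```python
from collections import deque


class HierarchicalQueues:
    """Three levels of FIFO queues: per-flow, aggregated, and one final queue."""

    def __init__(self, first_to_second):
        # first_to_second[i] is the second-level queue fed by first-level
        # queue i (a hypothetical mapping standing in for mappings 114-116).
        self.first_to_second = first_to_second
        self.first = [deque() for _ in first_to_second]
        self.second = [deque() for _ in range(max(first_to_second) + 1)]
        self.final = deque()

    def enqueue(self, flow_id, packet):
        """Buffer a classified packet in the first-level queue for its flow."""
        self.first[flow_id].append(packet)

    def service_first_level(self):
        """Dequeue one packet per non-empty first-level queue and map it down."""
        for i, queue in enumerate(self.first):
            if queue:
                self.second[self.first_to_second[i]].append(queue.popleft())

    def service_second_level(self):
        """Dequeue one packet per non-empty second-level queue into the final queue."""
        for queue in self.second:
            if queue:
                self.final.append(queue.popleft())

    def transmit(self):
        """Dequeue the next packet for the egress port, or None if idle."""
        return self.final.popleft() if self.final else None
```

In this sketch, every packet is written to and read from three separate queues before it reaches the egress port, which is the behavior whose cost is discussed below.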
In the hierarchical queue structure 100, each individual packet is enqueued into and dequeued out of multiple queues as it propagates from the first level of queues to the last. At any given enqueue or dequeue point, the queue structure 100 may make a decision to continue to process the packet or to discard it. A packet may be discarded prior to being enqueued or after being dequeued where, for example, (a) the queue is full or nearly full; (b) the queue descriptors used to manage packets within the queue are exhausted or low; (c) the amount of data in a given time has exceeded the maximum allotted storage for the flow being enqueued; (d) the time spent in the queue exceeds the maximum allowable time; (e) a buffer pool is empty or low; or (f) the data destination line is down or otherwise inoperable.
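The pass-or-discard decision made at each enqueue or dequeue point may be sketched, under stated assumptions, as a single check over conditions (a) through (f) above. The field names, the default thresholds, and the use of one shared `low_mark` for "nearly full" and "low" are hypothetical and are chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    depth: int                   # packets currently in the queue
    capacity: int                # maximum packets the queue can hold
    free_descriptors: int        # queue descriptors still available
    bytes_this_interval: int     # data already accepted for the flow this interval
    max_bytes_per_interval: int  # allotted storage for the flow per interval
    free_buffers: int            # buffers left in the shared pool
    link_up: bool                # whether the destination line is operable


def discard_reason(state, packet_len, time_in_queue=0.0, max_time=1.0, low_mark=1):
    """Return the letter of the first condition met, or None to keep the packet."""
    if state.depth >= state.capacity - low_mark:
        return "a"  # queue full or nearly full
    if state.free_descriptors <= low_mark:
        return "b"  # descriptors exhausted or low
    if state.bytes_this_interval + packet_len > state.max_bytes_per_interval:
        return "c"  # flow exceeded its allotted storage for the interval
    if time_in_queue > max_time:
        return "d"  # packet waited too long in the queue
    if state.free_buffers <= low_mark:
        return "e"  # buffer pool empty or low
    if not state.link_up:
        return "f"  # destination line down
    return None
```

In a hierarchical structure, a check of this kind may be repeated at every level through which the packet passes.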
A queue structure with multiple levels of hierarchical queues has a number of drawbacks. Such queue structures require (a) valuable system resources including system memory, (b) additional flow processing needed to decide whether to pass or discard a packet, and (c) significant processing time that increases latency, i.e., the time necessary for the packet to propagate through the data communications system. With respect to system memory, queues consume significant memory to (a) store queue descriptors; (b) support basic queue management, including tracking head/tail pointers, the queue itself, and timers; and (c) retain queue state information including links relating hierarchical queues.
With respect to processing resources, queue structures require a significant number of clock cycles to process the enqueue function and dequeue function for each of the plurality of queues. Adding to this burden, the presence of hierarchical layers of queues in the prior art introduces further processing inefficiencies. In particular, the processing operations and enqueue/dequeue functions performed on behalf of a packet from the first to the last queue level are associated with separate software code executed as separate tasks. For example, the initial task A 160 associated with a packet of flow 101 involves enqueuing the packet in first level queue 104. The processor then turns to some other task. At a later point in time, a second task B 161 checks the queue 104 and dequeues the packet for further processing. As represented by third task C 162 and fourth task D 163, the process of separately enqueuing and dequeuing a packet is repeated for each packet at each level of the hierarchical queue structure. For software-based systems, performing consecutive tasks in a non-continuous fashion consumes significant processing time, thereby increasing the latency (i.e., signal delay) through the data communications system and reducing the time available to the CPU to perform other operations. For silicon-based systems, these queues take up significant space on the chip, and a similar increase in latency applies.
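The repetition of separate enqueue and dequeue tasks at each level may be illustrated by the following sketch, which merely counts the task invocations incurred by one packet. It is not a measurement of latency; the function name and the log format are assumptions made for the example.

```python
from collections import deque


def traverse(num_levels, packet="pkt"):
    """Pass one packet through num_levels queues, logging each separate task."""
    queues = [deque() for _ in range(num_levels)]
    task_log = []
    for level, queue in enumerate(queues):
        queue.append(packet)          # enqueue task (compare tasks A and C)
        task_log.append(("enqueue", level))
        # In the arrangement described above, the processor would perform
        # unrelated work here before a later task revisits the queue.
        packet = queue.popleft()      # dequeue task (compare tasks B and D)
        task_log.append(("dequeue", level))
    return task_log
```

For the three-level structure of FIG. 1, `traverse(3)` logs six task invocations for a single packet, i.e., one enqueue and one dequeue per level.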
There is therefore a need for a technique to map multiple flows to a single output while reducing the computational and data storage burdens imposed on system resources associated with enqueuing and dequeuing functions.