The quality of service (QoS) experienced by packets on a network is characterized by delay, jitter, and loss. Many network applications are sensitive to one or more of these parameters. An interactive application, for example, becomes unusable if the total delay of the network exceeds a threshold. Streaming applications, such as video, can tolerate a large fixed delay but require excessive buffer space if the jitter or variation in delay experienced by different packets exceeds a threshold. Other applications cannot tolerate packet loss, for example packets being dropped due to excessive congestion.
Most Internet traffic today is handled as a single class of traffic that is delivered on a best-efforts basis. That is, the network makes an effort to deliver all packets in a reasonable amount of time but makes no guarantees about delay or loss. Because network traffic is bursty, congestion may cause some packets to be delayed or dropped, and the delay experienced may vary considerably from packet to packet. While best-efforts delivery is adequate for many types of traffic, such as file transfers and web page access, it is not suitable for other types of traffic, such as real-time audio and video streams.
To support traffic with real-time constraints, as well as to provide a premium class of service to certain customers, some networks separate traffic into classes and provide service guarantees for each class of traffic. For example, traffic may be divided into one class that is guaranteed a constant bit rate and a constant delay, a second class that is guaranteed a minimum bit rate, and one or more classes of best-efforts traffic that are allocated different fractions of the total network bandwidth.
The class of service that a unit of transport (packet, cell, or frame) in a network receives may be encoded in the unit in a number of different ways. In an Internet packet, for example, the service class may be determined using the 8-bit type-of-service (ToS) field in the packet header. Alternatively, the service class may be derived from the flow to which the packet belongs. In an ATM network, the service class is associated with the virtual circuit over which a cell is traveling.
Service guarantees are implemented through a combination of input policing and output scheduling. Input policing ensures that all traffic arriving at a router is in compliance with the appropriate service contract. When a packet arrives at a router, it is checked to determine if its arrival time is in compliance with its service contract. If the packet is compliant, it is processed normally. If a packet is out of compliance, for example if a flow that is guaranteed 10 Mbits/s of bandwidth is consuming 15 Mbits/s, it is marked. The packet may then be processed normally, have its service degraded, or be dropped, depending on network policy.
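The compliance check described above may be sketched, purely for illustration, as a token-bucket policer. The sketch is one common way to test whether an arrival time conforms to a rate contract and is not taken from any particular router. The class name TokenBucketPolicer, the burst allowance burst_bits, and the True/False return convention are assumptions introduced here.

```python
class TokenBucketPolicer:
    """Hypothetical policer that flags packets exceeding a contracted rate."""

    def __init__(self, rate_bps, burst_bits):
        self.rate_bps = rate_bps      # contracted bandwidth, bits per second
        self.burst_bits = burst_bits  # assumed tolerance for short bursts, in bits
        self.tokens = burst_bits      # the bucket starts full
        self.last_time = 0.0          # arrival time of the previous packet, seconds

    def is_compliant(self, arrival_time, length_bits):
        # Credit the bucket for the time elapsed since the previous arrival,
        # never exceeding the burst allowance.
        elapsed = arrival_time - self.last_time
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        self.last_time = arrival_time
        if length_bits <= self.tokens:
            self.tokens -= length_bits
            return True    # in compliance: process normally
        return False       # out of compliance: caller marks the packet
```

A caller would invoke is_compliant once per arriving packet and, on a False result, mark the packet so that network policy can decide whether to forward it, degrade it, or drop it.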
Output scheduling determines the order in which packets leave the router over a particular output channel. Scheduling is a key factor in guaranteeing a particular quality of service. For example, to guarantee a constant bit rate to a particular flow, the packets of this flow must be scheduled ahead of the packets from a bursty best-efforts flow that may itself exceed the available capacity. To ensure low jitter on a streaming flow, each packet must be scheduled to depart the router in a narrow window of time.
Output scheduling for guaranteed bit rate (GBR) traffic is often performed using a multiple queue structure as illustrated in FIG. 6. A separate output queue 501-503 is provided for each class of traffic (class-based queuing (CBQ)) or for each network flow (per-flow queuing). Three queues are shown in the figure, but other numbers are possible. Each queue contains some number of packets (e.g., packet 521) awaiting transmission over the output line 531. Associated with each queue is a counter 511-513 that indicates when the next packet should depart the queue. For queues associated with constant bit-rate traffic, for example, the counter holds the earliest time at which the next packet may depart. After a packet departs the queue, the counter is set to the present time plus an increment equal to the length of the packet divided by the bandwidth allocation, that is, the time needed to send the packet at the allocated rate. This method of update corresponds to the ATM ‘leaky-bucket’ algorithm for constant-bit-rate QoS.
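By way of illustration only, the counter update for a constant bit-rate queue may be sketched as follows. The class name RateQueue, the method try_depart, and the representation of each packet by its length in bits are assumptions made for this sketch and do not appear in FIG. 6.

```python
from collections import deque


class RateQueue:
    """Hypothetical output queue with a 'next departure time' counter."""

    def __init__(self, rate_bps):
        self.rate_bps = rate_bps  # bandwidth allocated to this queue, bits per second
        self.packets = deque()    # lengths (in bits) of packets awaiting transmission
        self.next_time = 0.0      # counter: earliest time the next packet may depart

    def try_depart(self, now):
        # A packet may leave only if one is waiting and the counter has expired.
        if not self.packets or now < self.next_time:
            return None
        length_bits = self.packets.popleft()
        # Update rule from the text: present time plus length / bandwidth.
        self.next_time = now + length_bits / self.rate_bps
        return length_bits
```

Under these assumptions, a queue allocated 1000 bits/s that sends a 500-bit packet at time 0 may not send again until time 0.5 s, so its long-run output cannot exceed the allocated rate.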
Output scheduling for best-efforts traffic is often performed using weighted-fair queuing (WFQ). Such traffic is also scheduled using the apparatus illustrated in FIG. 6, with a separate queue for each class of traffic or for each flow. For best-efforts traffic, however, the packets in the queues have not been allocated a guaranteed bit rate on the output port. Rather, each queue has been allocated a fraction of the available bandwidth, and packets are scheduled on a best-efforts basis according to that fraction. For queues carrying this best-efforts traffic, the counters 511-513 are maintained according to a weighted-fair queuing algorithm: when output bandwidth is available, the queue with the smallest count is selected to transmit, and its counter is incremented by the reciprocal of its bandwidth ‘share’.
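The selection and update steps just described may be sketched as follows. This is a simplified illustration that assumes equal-length packets, so that each transmission advances the counter by exactly the reciprocal of the share; practical weighted-fair queuing schemes typically also scale the increment by packet length. The function name wfq_select and its list-based arguments are hypothetical.

```python
def wfq_select(counters, shares, backlog):
    """Pick the non-empty queue with the smallest counter and advance it.

    counters: per-queue counts (mutated in place)
    shares:   per-queue bandwidth shares, assumed positive
    backlog:  per-queue number of waiting packets (mutated in place)
    Returns the index of the selected queue, or None if every queue is empty.
    """
    ready = [i for i, n in enumerate(backlog) if n > 0]
    if not ready:
        return None
    # Linear search for the smallest counter among the non-empty queues;
    # ties go to the lowest-numbered queue.
    chosen = min(ready, key=lambda i: counters[i])
    backlog[chosen] -= 1
    # Increment the counter by the reciprocal of the queue's bandwidth share.
    counters[chosen] += 1.0 / shares[chosen]
    return chosen
```

With two continuously backlogged queues whose shares are 2 and 1, repeated calls select the first queue twice as often as the second. The linear search over all counters is performed once per transmitted packet, so its cost grows with the number of queues.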
In prior-art routers, this task of searching for the lowest counter, selecting the appropriate queue, and updating the counter associated with this queue is typically performed in software running on a microprocessor. Because of the processing overhead required for transmitting each packet with class-based queuing or per-flow queuing, these mechanisms are usually restricted to use on low-speed links and with a moderate number of queues, in the tens rather than the hundreds (cf. Ferguson and Huston, Quality of Service, Wiley, 1998, p. 61).