This invention relates to packet switched communication networks and, more particularly, to traffic shaping for causing the time multiplexed packet flows at queuing points within such networks or network elements to conform to specified traffic descriptors.
For other concurrent filings on traffic shaping see application Ser. No. 08/872,327 now U.S. Pat. No. 6,064,677 by Christopher J. Kappler et al., entitled “Multiple Rate Sensitive Priority Queues for Reducing Relative Data Transport Unit Delay Variations in Time Multiplexed Outputs from Output Queued Routing Mechanisms,” application Ser. No. 08/872,756 now U.S. Pat. No. 6,064,651 by Landis C. Rogers et al., entitled “Rate Shaping in Per-Flow Output Queued Routing Mechanisms for Statistical Bit Rate Service,” U.S. Pat. No. 5,926,459 entitled “Rate Shaping in Per-Flow Queued Routing Mechanisms for Available Bit Rate Service,” by Joseph B. Lyles et al., application Ser. No. 08/868,287 now U.S. Pat. No. 6,038,217 entitled “Rate Shaping in Per-Flow Output Queued Routing Mechanisms for Available Bit Rate (ABR) Service in Networks Having Segmented ABR Control Loops,” by Joseph B. Lyles, and application Ser. No. 08/873,064 now U.S. Pat. No. 6,064,650 entitled “Rate Shaping in Per-Flow Output Queued Routing Mechanisms Having Output Links Servicing Multiple Physical Layers” by Christopher J. Kappler et al.
A. Traffic Contracts/Definitions
Most applications that are currently running on packet switched communication networks can function acceptably with whatever bandwidth they happen to obtain from the network because they have “elastic” bandwidth requirements. The service class that supports these applications is known as “best efforts” service in the Internet community and as “Available Bit Rate” (ABR) in the Broadband ISDN/ATM community.
There is, however, a growing demand for network services that provide bounded jitter or, in other words, bounded packet delay variation (commonly referred to as cell delay variation in an ATM context). For example, this type of service is required for real time applications, such as circuit emulation and video. It is not clear whether and how the Internet community will respond to this demand, but the Broadband ISDN/ATM community has responded by introducing the notion of a user-network negotiated traffic contract.
As is known, a user-network ATM contract is defined by a traffic descriptor which includes traffic parameters, tolerances and quality of service requirements. A conformance definition is specified for each of the relevant traffic parameters. Accordingly, ATM services may make use of these traffic parameters and their corresponding conformance specifications to support different combinations of Quality of Service (QoS) objectives and multiplexing schemes.
Partially overlapping sets of ATM traffic classes have been defined by the Telecommunications Standardization Sector of the International Telecommunications Union (ITU-T) and the ATM Forum. In some instances, traffic classes which have essentially identical attributes have been given different names by these two groups, so the following name translation table identifies the existing equivalent counterparts:
An ATM service contract for a virtual circuit (VC) connection or a virtual path (VP) connection may include multiple parameters describing the service rate of the connection. These include the Peak Cell Rate (PCR), the Sustainable Cell Rate (SCR), the Intrinsic Burst Tolerance (IBT), and the Minimum Cell Rate (MCR). Not all of these parameters are relevant for every connection or every service class, but when they are implied or explicitly specified elements of the service contract, they must be respected. VC connections are the primary focus of the following discussion, but it will be understood that VP connections can also be so specified. The data transport unit for an ATM connection usually is referred to as a “cell.” In this disclosure, however, the term “packet” is sometimes used to refer to the data transport unit because this more general terminology is consistent with some of the broader aspects of the innovations.
The Generic Cell Rate Algorithm (GCRA), which is specified in ITU-T Recommendation I.371, is well suited for testing a packet or cell flow for conformance with a traffic descriptor. To perform such testing, the GCRA requires the specification of an emission interval (i.e., the reciprocal of a flow rate) and a tolerance, τ. In practice, this tolerance may depend on a variety of factors, including the connection, the connection setup parameters, or the class of service. As will be seen, the GCRA can be employed as a Boolean function, where for a flow of fixed size packets or cells on a connection, the GCRA (emission interval, tolerance) is false if the flow is conforming to a peak rate or true if the flow is conforming to a minimum rate. For example, a source of cells conforms to a PCR if GCRA (1/PCR, τPCR) is false. Likewise, a connection or flow conforms to an MCR if GCRA (1/MCR, τMCR) is true. As will be appreciated, the “emission interval” is the reciprocal of the “cell rate.”
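The conformance test described above can be illustrated with a minimal sketch of the virtual scheduling form of the GCRA. This is an assumption-laden illustration, not the text of Recommendation I.371: the function names `gcra_per_cell` and `flow_conforms` are hypothetical, times are in arbitrary integer units, and the functions simply report True for a conforming cell or flow rather than following the peak-rate/minimum-rate Boolean convention used in this disclosure.

```python
def gcra_per_cell(arrival_times, emission_interval, tolerance):
    """Return one boolean per cell: True if that cell conforms.

    Virtual scheduling sketch: a cell conforms unless it arrives more
    than `tolerance` before its theoretical arrival time (TAT).
    """
    tat = None  # theoretical arrival time of the next cell
    results = []
    for t in arrival_times:
        if tat is None:
            tat = t  # the first cell defines the starting point
        if t < tat - tolerance:
            # Cell is too early: non-conforming, TAT is left unchanged.
            results.append(False)
        else:
            # Conforming cell: push the TAT out by one emission interval.
            results.append(True)
            tat = max(t, tat) + emission_interval
    return results


def flow_conforms(arrival_times, emission_interval, tolerance):
    """True if every cell of the flow passes the per-cell test."""
    return all(gcra_per_cell(arrival_times, emission_interval, tolerance))


# Emission interval 10 (i.e., rate 1/10), cells at t = 0, 5, 20.
print(gcra_per_cell([0, 5, 20], 10, 0))   # second cell is early
print(flow_conforms([0, 5, 20], 10, 5))   # a tolerance of 5 absorbs it
```

A larger tolerance admits burstier flows at the same nominal rate, which is why reshaping to a smaller τ is described later as a more stringent test.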
A DBR traffic contract is appropriate for a source which establishes a connection in the expectation that a static amount of bandwidth will be continuously available to the connection throughout its lifetime. Thus, the bandwidth the network commits to a DBR connection is characterized by a PCR value. Further, the cell or packet flow on such a connection complies with the traffic contract if it conforms to GCRA (1/PCR, τPCR). On the other hand, an SBR traffic contract is suitable for an application which has known traffic characteristics that allow for an informed selection of an SCR and τIBT, as well as a PCR and τPCR. An SBR or rt-SBR flow complies with its traffic contract if the flow not only conforms to GCRA (1/PCR, τPCR), but also to GCRA (1/SCR, τIBT).
As previously indicated, an ABR traffic contract is appropriate for applications that can tolerate the dynamic variations in the information transfer rate that result from the use of unreserved bandwidth. A PCR and an MCR are specified by the source establishing such a connection, and these parameters may be subject to negotiation with the network. Thus, the bandwidth that is available on an ABR connection is the sum of the MCR (which can be 0) and a variable cell rate that results from a sharing of unreserved bandwidth among ABR connections via a defined allocation policy (i.e., the bandwidth a source receives above its specified MCR depends not only on the negotiated PCR, but also on network policy). Feedback from the network enables the source application to dynamically adjust the rate it feeds cells or packets into an ABR connection. An ABR flow always complies with its traffic contract if it conforms to GCRA (1/MCR, τMCR), and is always non-compliant if it does not conform to GCRA (1/PCR, τPCR). Conformance in the region between MCR and PCR is dependent on the ABR feedback and is thus dynamically determined.
A UBR traffic contract is similar to the ABR contract, except that the UBR contract does not accommodate the specification of an MCR and has no dynamic conformance definition. Therefore, a UBR flow complies with its traffic contract if it conforms to GCRA (1/PCR, τPCR).
B. Traffic Shaping
ITU-T Recommendation I.371 addresses the possibility of reshaping traffic at a network element for the purpose of bringing the traffic into conformance with a traffic descriptor in the following terms:
“Traffic shaping is a mechanism that alters the traffic characteristics of a stream of cells on a VCC or a VPC to achieve a desired modification of those traffic characteristics, in order to achieve better network efficiency whilst meeting the QoS objectives or to ensure conformance at a subsequent interface. Traffic shaping must maintain cell sequence integrity on an ATM connection. Shaping modifies traffic characteristics of a cell flow with the consequence of increasing the mean cell transfer delay.
Examples of traffic shaping are peak cell rate reduction, burst length limiting, reduction of CDV by suitably spacing cells in time and queue service schemes.
It is a network operator's choice to determine whether and where traffic shaping is performed. As an example, a network operator may choose to perform traffic shaping in conjunction with suitable UPC/NPC functions.
It is an operator's option to perform traffic shaping on separate or aggregate cell flows.
As a consequence, any ATM connection may be subject to traffic shaping.
The options available to the network operator/service provider are the following:
a. No shaping
Dimension the network in order to accommodate any flow of conforming cells at the network ingress whilst ensuring conformance at the network egress without any shaping function.
b. Shaping
Dimension and operate the network so that any flow of conforming cells at the ingress is conveyed by the network or network segment whilst meeting QoS objectives and apply output shaping the traffic in order to meet conformance tests at the egress.
Shape the traffic at the ingress of the network or network segment and allocate resources according to the traffic characteristics achieved by shaping, whilst meeting QoS objectives and subsequent conformance tests at the network or network segment egress.
Traffic shaping may also be used within the customer equipment or at the source in order to ensure that the cells generated by the source or at the UNI are conforming to the negotiated traffic contract relevant to the ATC that is used (see Section 5.5).” ITU-T Recommendation I.371, Section 6.2.5.
C. Scheduling for Real Time and Non-Real Time Connections/Existing Tools and Techniques
As is known, if bandwidth is not divided “fairly” between applications employing “best efforts” Internet service or ABR ATM service, a variety of undesirable phenomena may occur. See Lefelhocz, Lyles, Shenker and Zhang, “Congestion Control for Best-Effort Service: Why we need a new paradigm,” IEEE Network, January/February 1996, for further details on mechanisms for best effort/ABR traffic.
Most ATM switches currently are implemented with FIFO queuing. FIFO queuing exhibits pathological behaviors when used for ABR traffic (see “On Traffic Phase Effects in Packet-Switched Gateways,” Sally Floyd and Van Jacobson, Internetworking: Research and Experience, Vol. 3, pp. 115-156 (1992), and “Observations on the Dynamics of a Congestion Control Algorithm: The Effects of Two-Way Traffic,” Lixia Zhang, Scott Shenker, and David Clark, ACM Sigcomm 91 Conference, Sep. 3-6, 1991, Zurich, Switzerland, pp. 133-148). FIFO also is unable to protect correctly behaving users against misbehaving users (it does not provide isolation). As a result of these deficiencies, non-FIFO queuing mechanisms such as weighted fair queuing (see, for example, A. Demers, S. Keshav, and S. Shenker, “Analysis and Simulation of a Fair Queuing Algorithm,” Proceedings of ACM SigComm, pages 1-12, September 1989; and A. K. Parekh, “A Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks,” Ph.D. Thesis, Department of Electrical Engineering and Computer Science, MIT, 1992) or approximations to fair queuing such as round-robin (Ellen L. Hahne, “Round-robin Scheduling for Max-Min Fairness in Data Networks,” IEEE Journal on Selected Areas in Communications, Vol. 9, pp. 1024-1039, September 1991) are often suggested.
Service classes which have inelastic bandwidth requirements often require that data be transmitted through the network with bounded jitter (i.e., bounded cell or packet delay variation). As shown by the above referenced Parekh paper, weighted fair queuing can be used to provide bounded jitter for real time streams. Moreover, Parekh's results have recently been extended (Pawan Goyal, Simon S. Lam and Harrick M. Vin, “Determining End-to-End Delay Bounds in Heterogeneous Networks,” Proceedings of The 5th International Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV), Durham, N.H., Apr. 18-22, 1995) to prove delay bounds for systems using the closely related mechanisms of Virtual Clock (Lixia Zhang, “Virtual Clock: A New Traffic Control Algorithm for Packet Switching Networks,” Proceedings of ACM SigComm, pages 19-29, August 1990) and Self-clocked Fair Queuing (S. J. Golestani, “A Self-Clocked Fair Queuing Scheme for High Speed Applications,” Proceedings of INFOCOM, pp. 636-646, 1994).
Thus, it is known that both elastic (Best effort/ABR) and inelastic (or real-time) services can benefit from the use of fair queuing and related algorithms.
1. Weighted Fair Queuing and Virtual Clock
Fair queuing and related algorithms (e.g., frame-based fair queuing, deficit round robin, etc.) operate on sequences of packets or other data transport units (e.g., an ATM cell is a packet). For ATM these sequences are identified by either the VCI or the VPI, while in the Internet protocol suite the identification is on the basis of <IP address, protocol, port> triples (IPv4) or flow identifiers (IPv6). In both self-clocked weighted fair queuing and virtual clock, packets are ordered (sorted) by timestamps (schemes such as round-robin provide approximations to ordering of packets by timestamps). These timestamps represent the virtual finishing time (or, equivalently, the virtual starting time) for the packet and are computed by taking a starting time value and adding an offset obtained by multiplying the length of the packet by a weight which represents the particular packet sequence's share of the bandwidth.
More particularly, for virtual clock the virtual finishing time is computed as:
VT(f, 0)=0
VT(f, j+1)=max{Arrival(f, j+1), VT(f, j)}+Length(f, j+1)/Rate(f)    (1)
where:
VT(f, j) is the virtual finishing time associated with packet j of flow (virtual circuit) f;
Arrival(f, j) is the arrival time of packet j of flow f; and
Length(f, j) is the length of packet j of flow f.
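As a minimal sketch of Expression (1), the following computes virtual clock finishing times for the packets of a single flow. The function name `virtual_clock_stamps` and the representation of packets as (arrival time, length) pairs are assumptions made for illustration; units are arbitrary but must be consistent between length and rate.

```python
def virtual_clock_stamps(packets, rate):
    """Compute VT(f, j) for each packet of one flow, per Expression (1).

    packets: list of (arrival_time, length) pairs in arrival order.
    rate:    Rate(f), in length units per time unit.
    """
    vt = 0.0  # VT(f, 0) = 0
    stamps = []
    for arrival, length in packets:
        # VT(f, j+1) = max{Arrival(f, j+1), VT(f, j)} + Length(f, j+1)/Rate(f)
        vt = max(arrival, vt) + length / rate
        stamps.append(vt)
    return stamps


# Two back-to-back packets followed by one that arrives after an idle gap.
print(virtual_clock_stamps([(0, 100), (1, 100), (10, 200)], rate=50))
```

The max term is what prevents a flow from accumulating credit while idle: after the gap, the third packet is stamped from its arrival time rather than from the stale virtual time of the flow.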
Self-clocked weighted fair queuing assigns virtual finishing times according to the formula:
VT(f, 0)=0
VT(f, j+1)=max{SystemVirtualTime, VT(f, j)}+Length(f, j+1)*weight(f)    (2)
where:
SystemVirtualTime is the virtual time associated with the packet being served (being output) at the time packet(f, j+1) arrives.
For ATM the packet length is constant because the cells are of fixed size (i.e., 53 bytes long). Consequently, the rightmost term in both Expression (1) and Expression (2) becomes a per flow constant. For virtual clock the simplified expression is:
VT(f, j+1)=max{Arrival(f, j+1), VT(f, j)}+constant(f)    (3)
For self-clocked weighted fair queuing, on the other hand, the simplified expression is:
VT(f, j+1)=max{SystemVirtualTime, VT(f, j)}+constant(f)    (4)
In other words, an ATM queuing point which implements either virtual clock or self-clocked weighted fair queuing performs the following steps:
1) compute the maximum of (a) the current virtual time for the VC, and (b) either of i) the arrival time of the cell or ii) the system virtual time.
2) add to the results of step 1 above a per-VC constant representing that VC's share of the bandwidth.
3) service cells (transmit them) in order of increasing values of the virtual time stamps assigned by steps 1 and 2.
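The three steps above can be sketched as a small scheduler for fixed size cells. This is an illustrative sketch under stated assumptions, not the mechanism claimed in this disclosure: the class name `CellScheduler`, its methods, and the use of a binary heap for ordering are all hypothetical, and the system virtual time is approximated as the timestamp of the cell most recently selected for service.

```python
import heapq


class CellScheduler:
    """Timestamp-ordered service for fixed size cells (Expressions 3 and 4)."""

    def __init__(self, per_vc_constant, self_clocked=False):
        self.constant = dict(per_vc_constant)    # per-VC constant(f)
        self.self_clocked = self_clocked         # False: virtual clock
        self.vt = {vc: 0 for vc in self.constant}  # VT(f, 0) = 0
        self.system_virtual_time = 0
        self._heap = []
        self._seq = 0  # tie-breaker: equal stamps are served in arrival order

    def enqueue(self, vc, arrival_time, cell):
        # Step 1: maximum of the VC's virtual time and the reference time.
        if self.self_clocked:
            reference = self.system_virtual_time
        else:
            reference = arrival_time
        # Step 2: add the per-VC constant.
        stamp = max(self.vt[vc], reference) + self.constant[vc]
        self.vt[vc] = stamp
        heapq.heappush(self._heap, (stamp, self._seq, vc, cell))
        self._seq += 1
        return stamp

    def dequeue(self):
        # Step 3: serve cells in order of increasing timestamp.
        if not self._heap:
            return None
        stamp, _, vc, cell = heapq.heappop(self._heap)
        self.system_virtual_time = stamp
        return vc, cell, stamp


sched = CellScheduler({"a": 2, "b": 4})  # VC "a" has twice the share of "b"
for vc, cell in [("a", "a1"), ("a", "a2"), ("b", "b1"), ("a", "a3")]:
    sched.enqueue(vc, arrival_time=0, cell=cell)
order = []
while (served := sched.dequeue()) is not None:
    order.append(served[1])
print(order)
```

A smaller per-VC constant corresponds to a larger share of the bandwidth, since the VC's timestamps advance more slowly and its cells sort earlier.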
2. Priority
Giving priority to one traffic class over another means that if the higher priority traffic class has cells ready for transmission, those cells are always transmitted in preference to the cells of the lower priority traffic class.
Priority mechanisms can be either preemptive or non-preemptive. This terminology comes from the operating system literature. A non-preemptive priority mechanism assigns a priority to an object (a process in the operating system world, a VC in the ATM world) at a scheduling time, and the object then retains this priority until it is served. Preemptive priority mechanisms, on the other hand, can change the priority of objects while they are waiting to be served. For example, in a preemptive system one could say “schedule this VC with priority 3 but if it is not served within 200 microseconds then increase its priority up to 2.”
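The example in the preceding paragraph can be expressed as a tiny sketch of a priority that changes while an object waits. The function name `effective_priority` and its parameters are hypothetical, and the sketch assumes that a lower number denotes a higher priority, as the example implies.

```python
def effective_priority(initial, waited_us, promote_after_us=200, promoted=2):
    """Priority of a waiting VC; a lower number means a higher priority.

    A VC that has waited at least `promote_after_us` microseconds is
    raised to `promoted`, unless its priority is already higher.
    """
    if waited_us >= promote_after_us:
        return min(initial, promoted)
    return initial


print(effective_priority(3, waited_us=50))    # still within 200 microseconds
print(effective_priority(3, waited_us=250))   # promoted after waiting
```

In the non-preemptive case described above, the function would simply return `initial` regardless of the waiting time.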
3. Work Conserving and Non-work Conserving Queuing
Kleinrock, Queuing Systems, Vol. 2: Computer Applications, John Wiley and Sons, N.Y., N.Y. 1996, p. 113 uses the terminology “work conserving” to denote any queuing system in which work is neither created nor destroyed. In keeping with this terminology a switch which, when given queued cells, always transmits cells on the outgoing link is a “work conserving switch”. Switches employing a pure FIFO, weighted fair queuing or virtual clock scheduling algorithm are all work conserving. In contrast, a non-work conserving switch may choose not to send cells, even when cells are queued for transmission. As will be seen, a method of doing this is to program the switch to wait until the current time is equal to or greater than the timestamp associated with a particular cell before transmitting that cell.
Work conserving switches attempt to fully utilize the transmission link, but do not necessarily remove or prevent bursts. In contrast, non-work conserving switches can strategically delay cells so as to re-shape traffic to meet a more stringent conformance test (i.e., a GCRA with a smaller τ). Additionally, a non-work conserving switch in which a given connection is only allocated a specified amount of buffering can perform a policing function (in ITU terms a UPC/NPC) by discarding or tagging cells which overflow the allotted buffer space. An example of a non-work conserving queuing system is the Stalled Virtual Clock (Sugih Jamin, “Stalled Virtual Clock” working note, Department of Computer Science, UCLA, Mar. 21, 1994), which is an adaptation of Lixia Zhang's Virtual Clock algorithm where virtual time is not allowed to run faster than real time (it stalls, or goes non-work conserving). See also work by Scott Shenker that is available by FTP at FTP.PARC.XEROX.com.
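A minimal sketch of the non-work conserving behavior described above follows: a cell is held until the current time reaches its timestamp, and a bounded buffer gives a simple policing effect by refusing cells that overflow it. The class name `HoldingShaper`, its methods, and the choice to discard (rather than tag) overflowing cells are assumptions made for illustration.

```python
import heapq


class HoldingShaper:
    """Releases a cell only when now >= the cell's timestamp."""

    def __init__(self, capacity):
        self.capacity = capacity  # buffer space allotted to the connection
        self._heap = []
        self._seq = 0  # keeps equal-timestamp cells in arrival order

    def enqueue(self, timestamp, cell):
        """Queue a cell; return False if it overflows the buffer (discarded)."""
        if len(self._heap) >= self.capacity:
            return False
        heapq.heappush(self._heap, (timestamp, self._seq, cell))
        self._seq += 1
        return True

    def poll(self, now):
        """Return the earliest eligible cell, or None even if cells are queued."""
        if self._heap and self._heap[0][0] <= now:
            return heapq.heappop(self._heap)[2]
        return None


shaper = HoldingShaper(capacity=2)
shaper.enqueue(10, "x")
shaper.enqueue(5, "y")
print(shaper.enqueue(7, "z"))  # buffer full: the cell is refused
print(shaper.poll(4))          # nothing eligible yet, so the link idles
print(shaper.poll(5))          # "y" becomes eligible at time 5
```

A work conserving scheduler would differ only in `poll`, which would return the head of the heap whenever the heap is non-empty.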
4. Calendar Queues
A calendar queue is a time ordered list of actions, each of which is dequeued and executed when real-time is equal to or greater than the time associated with the action. Calendar queues with bounded time intervals can be represented as a linear array which is known as a “time-wheel” or “time-line.” Time-wheels assign events to buckets relative to a pointer, where the bucket index is calculated using arithmetic modulo the wheel size. These data structures are well known in the literature as a queuing mechanism. In a time-wheel, absolute time is represented as an offset relative to the current time (“real time”), and each element in the array is a bucket which contains one or more actions (typically in linked-list form) which are to be executed at the time assigned to the bucket in which they reside. Any of the buckets of such a time-wheel can be empty, i.e., have no events associated with it.
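The time-wheel described above can be sketched as a fixed size array of buckets indexed modulo the wheel size. The class name `TimeWheel` and its methods are hypothetical, Python lists stand in for the linked lists mentioned in the text, and time is assumed to advance in unit slots.

```python
class TimeWheel:
    """Calendar queue over a bounded interval, stored as a circular array."""

    def __init__(self, size):
        self.size = size
        self.buckets = [[] for _ in range(size)]
        self.now = 0  # current time slot ("real time")

    def schedule(self, offset, action):
        """Schedule `action` to run `offset` slots after the current time."""
        if not 0 <= offset < self.size:
            raise ValueError("offset must be between 0 and size - 1")
        # Bucket index is computed relative to the pointer, modulo the size.
        self.buckets[(self.now + offset) % self.size].append(action)

    def tick(self):
        """Return the actions due in the current slot, then advance one slot."""
        index = self.now % self.size
        due = self.buckets[index]
        self.buckets[index] = []  # the slot is reused on the next revolution
        self.now += 1
        return due


wheel = TimeWheel(4)
wheel.schedule(0, "a")
wheel.schedule(3, "b")
wheel.schedule(1, "c")
print([wheel.tick() for _ in range(4)])  # one full revolution
```

Bounding the offset at size minus 1 is what keeps a newly scheduled action from landing in a bucket that still holds actions from the current revolution.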
For every time-wheel, there are two times of interest: t_earliest and t_latest, which correspond to the head and tail pointers for the active entries in the array; t_earliest is the time of the next entry (e.g., packet or cell) to be serviced, and t_latest is the time associated with the latest (most distant in time) bucket containing a scheduled event. The difference between t_earliest and t_latest cannot be greater than the length of the time-wheel, b, minus 1. This can be ensured by viewing the time as being kept modulo b, and by then ensuring that no offset (the packet length divided by the rate in virtual clock, or multiplied by the weight in weighted fair queuing) is greater than b−1. For an ATM link running at OC-3 speeds (149.76 Mbps, the SONET payload rate) there are approximately 353208 cells/sec on the link. Accordingly, if 64 Kbps (voice telephony rates) flows (approximately 174 cells/sec when AAL type 1 is used) are the lowest speed connections that need to be supported, then the ratio of the highest supported rate to the lowest rate is approximately 2029, which rounds up to 2^11. This ratio is the maximum offset that will get added during the calculation of virtual times. Therefore, a time-wheel of length 2030 (2048 to allow for rounding up to a power of two) is sufficient to encode the virtual times associated with circuits ranging in rates from 64 Kbps to full OC-3 link rate.
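The sizing arithmetic in the preceding paragraph can be reproduced with a short worked calculation. The variable names are hypothetical; the link rate, cell size, and the figure of 174 cells/sec for the slowest flow are taken from the text as given.

```python
import math

LINK_BITS_PER_SEC = 149_760_000  # OC-3 SONET payload rate, 149.76 Mbps
CELL_BITS = 53 * 8               # one 53-byte ATM cell
SLOWEST_CELLS_PER_SEC = 174      # 64 Kbps flow over AAL type 1 (from the text)

link_cells_per_sec = LINK_BITS_PER_SEC / CELL_BITS
rate_ratio = link_cells_per_sec / SLOWEST_CELLS_PER_SEC

# The wheel must hold the largest offset; round up to a whole slot count,
# then to the next power of two for cheap modulo arithmetic.
min_wheel_length = math.ceil(rate_ratio)
wheel_length = 1 << (min_wheel_length - 1).bit_length()

print(round(link_cells_per_sec))  # cells per second on the link
print(min_wheel_length)           # smallest sufficient wheel length
print(wheel_length)               # next power of two
```

With a power-of-two length, the modulo operation used to index the wheel reduces to masking the low-order bits of the time value.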
The length of a time-wheel array can be decreased by permitting an array element to contain more than one time offset. For example, if the above-described time-wheel is reduced to 256 elements from 2048, then each bucket would have eight time offsets mapped into it. Actions within a single bucket that spans multiple offsets may be performed out of order, but between buckets actions will stay in order. This reduces the amount of memory that needs to be allocated to such a time-wheel at the cost of reducing the precision of the ordering of actions in the calendar queue.
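The trade-off described above can be illustrated with a sketch that maps 2048 time offsets onto 256 buckets. The names `bucket_index` and `drain_order` are hypothetical, and the sketch assumes all scheduled slots fall within a single revolution of the wheel and that each bucket is served in insertion order.

```python
OFFSETS = 2048
BUCKETS = 256
OFFSETS_PER_BUCKET = OFFSETS // BUCKETS  # eight offsets share each bucket


def bucket_index(time_slot):
    """Map a fine-grained time slot onto the reduced wheel."""
    return (time_slot // OFFSETS_PER_BUCKET) % BUCKETS


def drain_order(time_slots):
    """Return the slots in the order a bucket-by-bucket scan serves them."""
    buckets = [[] for _ in range(BUCKETS)]
    for slot in time_slots:
        buckets[bucket_index(slot)].append(slot)
    return [slot for bucket in buckets for slot in bucket]


# Slots 13 and 9 share bucket 1, so they are served in insertion order
# rather than time order; ordering between buckets is preserved.
print(drain_order([13, 9, 3, 20]))
```

The reordering is confined to actions whose times differ by less than eight offsets, which bounds the loss of precision.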
D. Traffic Shaping for Time Multiplexed Flows on Multiple Output Channels
Preferably, any traffic shaping that is needed to bring time multiplexed packet or cell flows into conformance with their traffic contracts is performed after the completion of all switching or routing operations that are required to separate flows for different output channels from each other. This permits the throughput efficiency of the multiplexer to be optimized.
However, prior output queued ATM switches generally have employed FIFO (First In-First Out) output buffers. These buffers are not capable of participating in a controlled reshaping of any of the flows that pass through them. Instead, the per-VC time multiplexed flows that are output by these buffers essentially are time multiplexed composites of the input flows that are loaded into them. Of course, these output flows are time delayed relative to the input flows because of the inherent latency of the buffers. Moreover, the cell delay variation (CDV) of one or more of these output flows may be increased if scheduling conflicts occur among the data transport units of the different flows because these conflicts cause so-called “transmit collisions.”
As will be appreciated, increased CDV is especially troublesome for traffic, such as DBR traffic, which generally has a relatively tight tolerance. Thus, if each hop between a source and a destination includes a simple FIFO output queue of the foregoing type, it may be necessary to limit the number of hops this CDV sensitive traffic is permitted to make in order to ensure compliance within its specified tolerance.
Accordingly, there is a need for more efficient and more effective traffic shaping mechanisms and processes for ATM switches and other routers that route traffic from multiple inputs to multiple outputs for time multiplexed output emission.
This invention provides rate shaping in per-flow output queued routing mechanisms for unspecified bit rate service.