Demand for telecommunication services is growing at an ever-quickening pace. The majority of this demand is driven by the explosion in the use of the Internet and by a steady stream of new applications that further increase the need for bandwidth. Currently, a large portion of Internet traffic is still carried by circuit-switched transport facilities. In the case of Metropolitan Area Networks (MANs), most traffic is transported over SONET/SDH based networks, most of which were originally designed for voice traffic. Over time, however, more and more customers are using these networks to transport data rather than voice.
The requirements for networked communications within the user community have changed dramatically over the past two decades. Several notable trends in the user community include (1) the overwhelming dominance of Ethernet as the core networking medium around the world; (2) the steady shift towards data-oriented communications and applications; and (3) the rapid growth of mixed-media applications. Such applications range from integrated voice/data/video communications to the now commonplace exchange of MP3 music files; in addition, existing voice communications have begun to migrate towards IP/packet-oriented transport.
Ethernet has become the de facto standard for data-oriented networking within the user community. This is true not only within the corporate market, but in many other market segments as well. In the corporate market, Ethernet has long dominated at all levels, especially with the advent of high-performance Ethernet switching; this includes workgroup, departmental, server and backbone/campus networks. Even though many of the Internet Service Providers (ISPs) in the market today still base their WAN-side communications on legacy circuit-oriented connections (i.e. supporting Frame Relay, xDSL, ATM, SONET), their back-office communications are almost exclusively Ethernet. In the residential market, most individual users deploy 10 or 100 Mbps Ethernet within their homes to connect PCs to printers and to other PCs (in fact, most PCs today ship with internal Ethernet cards), even though the residential community still utilizes a wide range of relatively low-speed, circuit-oriented network access technologies.
The use of Ethernet, both optical and electrical, is increasing in carrier networks due to the advantages of Ethernet, and particularly Optical Ethernet, namely its ability to scale from low speeds to very high rates and its commodity-oriented nature. With the rapid increase in demand for user bandwidth, and the equally impressive increase in the performance of Ethernet within the LAN environment, the demand for Metropolitan network performance is rapidly increasing. In response, there has been a massive increase in the amount of fiber being installed in both new and existing facilities. This is true for both the corporate and residential markets.
In metro Ethernet markets, one of the parameters that can be selected is the Quality of Service (QoS). Quality of service is a term which refers to the set of performance parameters that characterize the traffic over a given connection. Several different classes or levels of QoS are defined, two of which are committed traffic and best effort traffic. To enable many services in the metro Ethernet market, a critical QoS parameter is committed information rate (CIR) versus excess information rate (EIR). Committed traffic is guaranteed to make it through the network with a very high probability of success with a very low probability of being dropped. This is a higher class of service and the customer pays a premium for it.
Excess traffic, however, is not guaranteed to make it through the network and may be serviced on a best-effort basis. This means that committed traffic is serviced first and excess traffic is serviced using whatever bandwidth remains in each section of the system. Note that EIR is usually not a service of its own but rather the EIR portion of the same service. For example, a policer may be used at the ingress of the provider network to decide which part of the traffic of a service is excess traffic, and therefore should be marked as discard eligible, and which is committed traffic, and therefore should not be so marked. Committed and excess traffic of a single service (having the same priority) should use the same queue so that there is no misordering between packets (or frames) belonging to the same service. As described below, different frames of the same service may be marked as committed or excess traffic according to the bandwidth profile defined in the Service Level Specification (SLS) of that service. From an overall network point of view, the expectation of both the service provider and the customer is that if the customer pays a premium for the committed bandwidth of a service, then committed customer traffic will not be dropped. The expectation of the service provider is that excess traffic will always be dropped before any committed traffic is dropped. Note also that excess traffic is not the same as best-effort traffic; for example, a high-priority service may generate excess traffic that falls outside its SLS profile. Diff-Serv provides another example: among its per-hop behavior (PHB) families are (1) the Assured Forwarding service (RFC 2597) and (2) best effort, wherein discard-eligible traffic is part of the Assured Forwarding family rather than of best effort.
To distinguish between committed traffic and excess traffic, at the edge of metro networks the traffic is classified and policed according to the Service Level Agreement (SLA). Traffic identified from the SLA, or from the results of a traffic policing mechanism, as excess traffic is marked as discard eligible, while traffic identified as committed traffic is marked as non-discard eligible. There are many methods of marking packets as discard eligible. In the case of ATM cells, the Cell Loss Priority (CLP) bit in the cell header may be used to indicate that the cell is discard eligible. In the case of IP packets carried over Ethernet, the Differentiated Services Code Point (DSCP) bits in the IP header can carry the discard-eligibility information as defined in RFC 2597. For a detailed discussion of the specifications of SLA, CIR and EIR in Metro Ethernet Networks, see MEF 1: Ethernet Services Model—Phase I and MEF 5: Traffic Management Specification—Phase 1.
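The ingress policing described above can be sketched as a simple token bucket that refills at the committed rate: frames within the committed profile are marked committed, the rest are marked discard eligible. This is a simplified, single-rate illustration loosely in the spirit of the bandwidth-profile policers referenced in the MEF specifications; the class and parameter names are illustrative, not taken from the source.

```python
class Policer:
    """Illustrative single-rate token-bucket policer (hypothetical names).

    Tokens refill at the committed information rate (CIR); frames that fit
    within the available tokens are committed, all others are marked as
    excess (discard eligible).
    """

    def __init__(self, cir_bps, cbs_bytes):
        self.cir = cir_bps / 8.0        # committed rate in bytes/second
        self.cbs = cbs_bytes            # committed burst size (bucket depth)
        self.tokens = cbs_bytes         # bucket starts full
        self.last = 0.0                 # timestamp of the previous frame

    def mark(self, frame_len, now):
        # Refill tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.cbs, self.tokens + (now - self.last) * self.cir)
        self.last = now
        if frame_len <= self.tokens:
            self.tokens -= frame_len
            return "committed"          # within CIR: not discard eligible
        return "excess"                 # beyond CIR: mark discard eligible
```

A real deployment would use a two-rate, two-bucket profile (CIR/CBS plus EIR/EBS) and also discard frames beyond the excess profile; the single bucket here only illustrates the committed-versus-excess split.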
To meet the committed traffic requirements in a single-queue queuing system, excess traffic should always be dropped before committed traffic and, if possible, committed traffic should never be dropped. Note that this is typically a requirement even if there are multiple queues (e.g., one queue per priority), or if the committed and excess traffic belong to the same service, in which case it is forbidden to place them in different queues since this would cause misordering of packets belonging to the same service. As long as the total bandwidth of incoming traffic to a specific link is less than the available link bandwidth, all excess and committed traffic is passed. Due to the bursty nature of data traffic (e.g., file transfer, Internet, etc.), however, the total incoming bandwidth destined to a specific link may at times exceed the total available link bandwidth. It is for this reason that queues are implemented to store data traffic until such time as it can be sent over the link. Queues, however, have limited size, and once a queue is full, incoming traffic begins to be dropped. If incoming traffic is dropped based solely on the queue-full status, then whether committed or excess traffic is dropped cannot be controlled, because every packet that arrives while the queue is full must be dropped regardless of its marking.
A solution to this problem is to set a threshold ‘T’ in the queue and to implement the following dropping algorithm. If the level of the queue is below threshold T, then accept all traffic including both committed traffic and excess traffic. If the queue is full, drop all traffic. If the queue is not full, but above threshold T, then accept only committed traffic and drop all excess traffic.
A diagram illustrating an example queue having a threshold above which excess traffic is dropped is shown in FIG. 1. The queue, generally referenced 20, has a threshold T 26 below which both committed and excess traffic are accepted 22 and above which only committed traffic is accepted 24. The upper portion 24 of the queue should be large enough to absorb a burst constituting the largest expected difference between incoming committed traffic and queue output bandwidth. If the upper portion is sufficiently large, committed traffic will never be dropped. It is possible that a very large burst may occur wherein some committed traffic is dropped; this, however, can be considered overbooking.
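The single-queue dropping algorithm above can be sketched as follows; the class and method names are illustrative, not from the source. Below threshold T everything is accepted, between T and the queue capacity only committed traffic is accepted, and at capacity everything is dropped.

```python
from collections import deque

class ThresholdQueue:
    """Illustrative single queue with an excess-traffic drop threshold T."""

    def __init__(self, capacity, threshold):
        assert threshold < capacity
        self.capacity = capacity    # total queue size
        self.threshold = threshold  # level T above which excess is dropped
        self.q = deque()

    def enqueue(self, pkt, committed):
        depth = len(self.q)
        if depth >= self.capacity:
            return False            # queue full: drop all traffic
        if depth >= self.threshold and not committed:
            return False            # above T: drop excess, accept committed
        self.q.append(pkt)
        return True                 # packet accepted
```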
Considering a distributed queuing system, a problem arises when attempting to enforce the policy of precedence of committed over excess traffic. A block diagram illustrating an example prior art distributed queuing system including a plurality of core switches interconnected by network communication links is shown in FIG. 2. The example network (which can be a MAN), generally referenced 10, comprises a plurality of core switches 12 connected via communication links 18. Each core switch comprises a switching fabric 16 and one or more line cards 14. The line cards provide the transmit and receive interfaces for the network communication links on one end and provide connections to the switching fabric on the other end.
In a typical implementation of core switches such as that shown in FIG. 2, a plurality of interface cards (or line cards) is combined with one or more switch cards located inside a chassis. The switch cards are responsible for implementing the switching fabric and forwarding traffic between the line cards. The majority of off-the-shelf integrated circuits (ICs) available for implementing the high-speed switching fabric have no capability to mark or process packet discard eligibility or excess traffic. Further, the scheduling algorithm used by these integrated circuits to switch traffic of a single priority, generated by multiple line cards and going to the same destination line card, is some variation of a round robin algorithm.
A block diagram illustrating an example prior art scheme whereby the output of several input queues is forwarded by a scheduler to an output queue is shown in FIG. 3. In this example of the scheduling scheme, a plurality of input queues 32 has data destined to the same output queue 36. In most switches, the scheduler 34 is operative to schedule the transfer of packets from the several input queues to the output queue using a round robin type algorithm.
Assuming that there is virtual output queuing (i.e. one queue per destination interface or one queue per interface card per priority) in the ingress path (i.e. traffic coming from a network link into a line card and going into the switch fabric) of each line card and that each individual queue implements the committed over excess traffic single queue algorithm, then committed traffic should never be dropped due to excess traffic coming into a line card. It cannot be guaranteed, however, that committed traffic from one line card will not be dropped due to excess traffic coming from another line card. This is the problem in a distributed queuing system.
Consider the following example to illustrate the problem. A switch chassis comprises three line cards each having 10 Gbps capacity. Two line cards attempt to send a total of 10 Gbps traffic to the third line card. The switch fabric, using a round robin algorithm, divides the 10 Gbps bandwidth of the third card evenly between the two line cards allotting each 5 Gbps. Now, if each of the line cards transmits less than 5 Gbps committed traffic, there is no problem. However, if the committed traffic comprises 7 Gbps from the first line card and only 3 Gbps from the second line card, then the destination line card will receive only 5 Gbps of committed traffic and no excess traffic from the first line card and 3 Gbps committed traffic and 2 Gbps of excess traffic from the second line card. Thus, 2 Gbps of excess traffic from the second line card was forwarded at the expense of 2 Gbps of committed traffic from the first interface card.
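The example above can be checked numerically under a simplified model of the fabric's round robin scheduler: each source line card receives an equal share of the destination's bandwidth, independent of how much of its load is committed. The function name is illustrative, not from the source.

```python
def round_robin_committed_loss(link_gbps, committed_gbps):
    """Equal round-robin split among sources (simplified model): any
    committed traffic beyond a source's equal share is dropped."""
    share = link_gbps / len(committed_gbps)
    delivered = [min(share, c) for c in committed_gbps]
    dropped = [c - d for c, d in zip(committed_gbps, delivered)]
    return delivered, dropped

# Two source cards with 7 and 3 Gbps of committed traffic, 10 Gbps link:
delivered, dropped = round_robin_committed_loss(10, [7, 3])
# delivered == [5, 3] and dropped == [2, 0]: the first card's committed
# traffic is capped at its 5 Gbps share, while the second card's unused
# 2 Gbps of share carries its excess traffic instead.
```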
Note that in the general case, there may be more than a single destination per line card due to several reasons including (1) each line card may contain multiple ports, links and interfaces wherein each constitutes a separate destination, and (2) in a system supporting priorities (or any other mechanism for supporting multiple classes of service) multiple queues are required for each output port in order that packets (or frames) of different priorities can be processed differently, wherein each queue can logically be viewed as a different destination for the purposes of this invention and related disclosure.
A solution to the above problem is to use a weighted round robin algorithm in the switching fabric and to configure the weights for each line card according to the provisioned committed bandwidth coming from each line card to the destination line card. Note, however, that this solution works as long as there is no overbooking of committed traffic. Using the example illustrated above, the weight of the first line card is configured to 7 and the weight of the second line card is configured to 3. Thus, the first line card is allotted 7 Gbps out of the available 10 Gbps and the second line card is allotted 3 Gbps out of the available 10 Gbps. In each line card, the single queue algorithm described above handles the dropping of the excess traffic and prevents the committed traffic from being dropped.
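The weighted round robin configuration described above can be sketched as follows, assuming (as the text does) that committed traffic is not overbooked; the helper name is hypothetical.

```python
def wrr_grants(link_gbps, committed_gbps):
    """Bandwidth granted to each source when WRR weights are configured
    in proportion to the provisioned committed rates (simplified model)."""
    total = sum(committed_gbps)
    # The scheme only works without overbooking of committed traffic.
    assert total <= link_gbps, "committed traffic is overbooked"
    return [link_gbps * c / total for c in committed_gbps]

# Weights 7 and 3 on a 10 Gbps link grant 7 and 3 Gbps respectively,
# so each card's single-queue algorithm can protect its own committed load.
```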
Although this solution works, it has several disadvantages. Firstly, the switch fabric itself may not support weighted round robin between different input sources in which case it cannot be implemented. Secondly, the switch fabric weights must be reconfigured each time a new service is provisioned or an existing service is changed so that the weights reflect the new committed rate distribution between the line cards. Thirdly, this solution does not work if the committed traffic is oversubscribed during the provisioning stage.
An illustrative example of this third disadvantage is provided. Consider a switch chassis with three line cards, each line card having a capacity of 10 Gbps. Two line cards attempt to send a total of 10 Gbps of traffic to the third line card. We provision 9 Gbps of committed traffic from the first line card and 6 Gbps from the second line card. Note that the total traffic provisioned is 15 Gbps which is more than the 10 Gbps available.
If the fabric weights are configured according to the committed traffic distribution, the first line card will receive 60% of the link (9/15, i.e. 6 Gbps) and the second line card will receive 40% (6/15, i.e. 4 Gbps). Now consider that the first line card attempts to forward an actual committed bandwidth of 4 Gbps and the second line card attempts to forward an actual committed bandwidth of 6 Gbps. A problem arises since the first line card will forward 6 Gbps (4 Gbps committed and 2 Gbps excess) while the second line card will forward only 4 Gbps, resulting in the dropping of 2 Gbps of committed traffic.
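The overbooking failure above can be restated numerically under the same simplified model: the fabric grants follow the provisioned committed rates, but the actual committed demand differs, so committed traffic is still dropped. Variable names are illustrative.

```python
provisioned = [9, 6]         # provisioned committed rate per source card (Gbps)
total = sum(provisioned)     # 15 Gbps provisioned on a 10 Gbps link: overbooked
grants = [10 * p / total for p in provisioned]

actual_committed = [4, 6]    # committed traffic actually offered (Gbps)
dropped = [max(0, a - g) for a, g in zip(actual_committed, grants)]
# grants == [6.0, 4.0] and dropped == [0, 2]: the second card loses 2 Gbps
# of committed traffic while the first card forwards 2 Gbps of excess
# within its oversized grant.
```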
Thus, there is a need for a mechanism for enforcing the precedence of committed over excess traffic in a distributed queuing system that overcomes the disadvantages of the prior art. The mechanism should be able to support SLA policies of committed over excess traffic without dropping committed traffic (as long as the total amount of committed traffic to be forwarded on each link does not exceed the bandwidth of that link).