The desire to integrate data, voice, image, video and other traffic over high-speed digital trunks has created a need for faster networks capable of routing more information from one node to another. A switch performs this routing of information. Generally, a switch consists of three logical portions: ports, a switch fabric and a scheduler.
Routing and buffering are two major functions performed by a switch fabric. New packets arriving at an ingress are transferred by the scheduler across the switch fabric to an egress. The ingress refers to the side of the switch which receives arriving packets (or incoming traffic). The egress refers to the side of the switch which sends the packets out from the switch.
Most switches today are implemented using a centralized crossbar approach. FIG. 1 is an exemplary illustration of a centralized crossbar switch. Packets arrive at the centralized crossbar switch 100 at multiple ingress ports 105 on the ingress 102. They are transferred across the switch fabric 110 to multiple egress ports 115 on the egress 104 and then sent out to an output link (not shown). The centralized crossbar switch 100 can transfer packets over multiple ingress port-to-egress port connections simultaneously.
A centralized scheduler controls the transfer of the packets from the ingress ports 105 to the egress ports 115. Every packet that arrives at the ingress ports 105 has to be registered in the centralized scheduler. Each packet then waits for a decision by the centralized scheduler directing it to be transferred through the switch fabric 110. With fixed size packets, all the transmissions through the switch fabric 110 are synchronized.
Each packet belongs to a flow, which carries data belonging to an application. A flow may have multiple packets. There may be multiple flows arriving at the ingress ports 105 at the same time. Since the packets in these multiple flows may be transferred to the same egress port, each of these packets waits for its turn in ingress buffers (not shown) in the ingress 102.
The centralized scheduler examines the packets in the ingress buffers and chooses a set of conflict-free connections among the appropriate ingress ports 105 and egress ports 115 based upon the configuration of the switch fabric 110. One of the egress ports 115 may receive packets from one or more ingress ports 105. However, at any one time, the centralized scheduler ensures that each ingress port is connected to at most one egress port, and that each egress port is connected to at most one ingress port.
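The constraint above — each ingress port connected to at most one egress port, and vice versa — can be sketched, purely as an illustration, as a greedy matching over pending transfer requests. The function name and the greedy strategy are illustrative assumptions; the centralized scheduler described here is not limited to any particular matching algorithm:

```python
def greedy_match(requests):
    """Select a conflict-free set of (ingress, egress) connections.

    Each ingress port is connected to at most one egress port, and
    each egress port to at most one ingress port, per scheduling cycle.
    """
    used_ingress, used_egress = set(), set()
    connections = []
    for ingress, egress in requests:
        # Skip any request that would reuse an already-connected port.
        if ingress not in used_ingress and egress not in used_egress:
            used_ingress.add(ingress)
            used_egress.add(egress)
            connections.append((ingress, egress))
    return connections
```

For example, given requests [(0, 1), (1, 1), (1, 2), (0, 2)], the sketch selects (0, 1) and (1, 2); the other two requests conflict on an already-used port and wait for a later cycle.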
Each packet transferred across the switch fabric 110 by the centralized scheduler waits in egress buffers (not shown) in the egress 104 to be selected by the centralized scheduler for transmission out of the switch. The centralized scheduler places the selected packets in the appropriate egress ports 115 to have the packets transmitted out to an output link.
The centralized scheduler may not be able to transfer packets from the ingress 102 across the switch fabric 110 at the same pace that new packets arrive at the ingress ports 105. Ingress buffers, which are part of an input queue, store the new packets while space is available. When the ingress buffers overflow, congestion occurs at the ingress ports 105. Without a packet dropping policy, all arriving packets are then dropped regardless of properties of the packets (e.g., packet size, etc.).
Generally, packet dropping policies are designed to provide fairness to network applications, among other factors (e.g., increased network utilization, etc.). Depending on the type of technology, fairness may be implemented differently. For example, ATM (asynchronous transfer mode) networks can support multiple traffic types (e.g., voice, data, video traffic, etc.), and applications associated with these traffic types may behave differently (e.g., burst data, etc.).
There are different packet dropping policies available, and each may implement a different fairness criterion (e.g., based on packet size, traffic type, etc.). For example, in a “drop tail” (DT) packet dropping policy in ATM technology, all arriving cells in a packet are dropped when the ingress buffers are full. That is, packets from applications having high priority (e.g., video applications) are dropped in the same manner as packets from applications having low priority (e.g., electronic mail applications). The DT packet dropping policy is not practical because it treats packets from different traffic types identically.
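The drop-tail behavior can be sketched, as an illustration only, by a minimal queue that admits cells until its capacity is reached and then drops every arriving cell unconditionally (the class and method names are hypothetical):

```python
import collections

class DropTailQueue:
    """Minimal drop-tail ingress queue: admit until full, then drop all."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cells = collections.deque()

    def enqueue(self, cell):
        # Drop tail: once the buffer is full, every arriving cell is
        # dropped regardless of its priority or traffic type.
        if len(self.cells) >= self.capacity:
            return False  # cell dropped
        self.cells.append(cell)
        return True  # cell admitted
```

Note that the admission test never inspects the cell itself, which is precisely why DT cannot distinguish high-priority from low-priority traffic.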
FIG. 2 is an exemplary graph illustrating a random early detection (RED) packet dropping policy. RED is one approach to solving the queue overflow problem by randomly dropping cells using thresholds. Using RED, an occupancy level of the input queue is monitored. The occupancy level indicates how much space of the input queue is occupied. This occupancy level is compared to pre-set thresholds such that a dropping decision can be made. RED is random because cells are dropped with a probability. Using RED, cells are dropped early, before the input queue overflows.
Referring to FIG. 2, the vertical axis 200 represents a probability for an arriving cell to be dropped. When the probability is 1, all arriving cells are dropped. The horizontal axis 202 represents an average occupancy level or an average queue length. A minimum buffer occupancy threshold 205 represents a value below which all cells arriving at the input queue are admitted into the buffer. A maximum buffer occupancy threshold 210 represents a value above which all cells arriving at the input queue are dropped. The area between the minimum occupancy threshold 205 and the maximum occupancy threshold 210 represents an increasing level of drop probability. For example, at an average queue length 207, there is a 30% probability that cells arriving at the input queue are discarded. However, at an average queue length 209, there is a 90% probability that cells arriving at the input queue are discarded. As the average queue length increases (i.e., the input queue becomes more filled), the probability of dropping arriving cells increases. When the maximum threshold 210 is reached (i.e., at a high occupancy level), the probability of dropping arriving cells is 1. The average queue length in the RED packet dropping policy is based on an aggregate occupancy of all flows, and as such, only one minimum threshold and one maximum threshold are necessary. Thus RED provides some level of fairness by using random dropping based on occupancy thresholds and a drop probability.
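The curve of FIG. 2 can be sketched, as an illustrative simplification, by a function that returns 0 below the minimum threshold, 1 at or above the maximum threshold, and a linearly increasing probability in between (function names and the purely linear ramp to 1 are assumptions for illustration):

```python
import random

def red_drop_probability(avg_queue_len, min_th, max_th):
    """Drop probability for an arriving cell, per the FIG. 2 curve."""
    if avg_queue_len < min_th:
        return 0.0  # below minimum threshold 205: admit all cells
    if avg_queue_len >= max_th:
        return 1.0  # at/above maximum threshold 210: drop all cells
    # Between the thresholds, the drop probability rises linearly.
    return (avg_queue_len - min_th) / (max_th - min_th)

def red_admit(avg_queue_len, min_th, max_th):
    """Randomly admit (True) or drop (False) an arriving cell."""
    return random.random() >= red_drop_probability(avg_queue_len, min_th, max_th)
```

With hypothetical thresholds of 10 and 20 cells, an average queue length of 13 yields a 30% drop probability, matching the kind of operating point labeled 207 in FIG. 2.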
FIG. 3 is an exemplary chart illustrating a RIO (RED with in and out profiles) packet dropping policy. The RIO packet dropping policy is a variation of the RED packet dropping policy having multiple profiles 305 and 310. The RIO packet dropping policy also operates with a packet dropping probability and thresholds. The profile 305 causes cells to be dropped at a lower minimum threshold 315 than the profile 310 having a minimum threshold 320. The profile 305 has a lower maximum threshold 325 than the profile 310 having a maximum threshold 330.
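The two-profile behavior of FIG. 3 can be sketched, again only as an illustration, by applying the same threshold-based curve with a per-profile threshold pair; the specific threshold values below are hypothetical:

```python
def rio_drop_probability(avg_queue_len, profile):
    """Per-profile drop probability; each profile has its own thresholds."""
    min_th, max_th = profile
    if avg_queue_len < min_th:
        return 0.0
    if avg_queue_len >= max_th:
        return 1.0
    return (avg_queue_len - min_th) / (max_th - min_th)

# Hypothetical threshold pairs: profile 305 begins dropping at a lower
# minimum threshold (315) and saturates at a lower maximum threshold (325)
# than profile 310 (thresholds 320 and 330).
PROFILE_305 = (10, 30)
PROFILE_310 = (20, 40)
```

At an average queue length of 15, cells in profile 305 already face a nonzero drop probability while cells in profile 310 are still all admitted, which is the distinction between the two curves in FIG. 3.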
Packet dropping decisions using the DT policy, the RED policy and the RIO policy are based on buffer occupancy. There is no flow consideration, since these policies provide no mechanism to identify or distinguish flows. This does not provide adequate fairness because packet dropping decisions made based on occupancy level may allow some flows to obtain more than their fair share of the input queue. For example, a high-bandwidth application may exhaust space in the input queue, preventing a low-bandwidth application from occupying any space in the input queue. The DT, RED and RIO packet dropping policies cannot be used to support quality of service (QoS) guarantees because there is no mechanism to isolate different classes of traffic. QoS specifies a guaranteed throughput level such that the time it takes for a packet to travel from a source location to a destination location will not exceed a specified level.
What is needed is a packet dropping policy that provides fairness based on flow isolation and traffic type or class isolation.