1. Field of the Invention
This invention generally relates to prioritizing information for transfer across a switch interface and, more particularly, to a system and method for using a simultaneous deficit round robin (DRR) process of prioritizing the transfer of variable length information packets having different classes of service (COS).
2. Description of the Related Art
As noted in U.S. Pat. No. 6,285,679 (Dally et al.), data communication between computer systems for applications such as web browsing, electronic mail, file transfer, and electronic commerce is often performed using a family of protocols known as IP (internet protocol) or sometimes TCP/IP. As applications that use extensive data communication become more popular, the traffic demands on the backbone IP network are increasing exponentially. It is expected that IP routers with several hundred ports operating with aggregate bandwidth of Terabits per second will be needed over the next few years to sustain growth in backbone demand.
The network is made up of links and routers. In the network backbone, the links are usually fiber optic communication channels operating using the SONET (synchronous optical network) protocol. SONET links operate at a variety of data rates ranging from OC-3 (155 Mb/s) to OC-192 (9.9 Gb/s). These links, sometimes called trunks, move data from one point to another, often over considerable distances.
Routers connect a group of links together and perform two functions: forwarding and routing. A data packet arriving on one link of a router is forwarded by sending it out on a different link depending on its eventual destination and the state of the output links. To compute the output link for a given packet, the router participates in a routing protocol where all of the routers on the Internet exchange information about the connectivity of the network and compute routing tables based on this information.
Most prior art Internet routers are based on a common bus or a crossbar switch. In the bus-based switch of a SONET link, a line-interface module extracts the packets from the incoming SONET stream. For each incoming packet, the line interface reads the packet header, and using this information, determines the output port (or ports) to which the packet is to be forwarded. To forward the packet, the line interface module arbitrates for the common bus. When the bus is granted, the packet is transmitted over the bus to the output line interface module. The module subsequently transmits the packet on an outgoing SONET link to the next hop on the route to its destination.
Bus-based routers have limited bandwidth and scalability. The central bus becomes a bottleneck through which all traffic must flow. A very fast bus, for example, operates a 128-bit wide datapath at 50 MHz giving an aggregate bandwidth of 6.4 Gb/s, far short of the Terabits per second needed by a backbone switch. Also, the fan-out limitations of the bus interfaces limit the number of ports on a bus-based switch to typically no more than 32.
The bandwidth limitation of a bus may be overcome by using a crossbar switch. For N line interfaces, the switch contains N(N-1) crosspoints. Each line interface can select any of the other line interfaces as its input by connecting the two lines that meet at the appropriate crosspoint. To forward a packet with this organization, a line interface arbitrates for the required output line interface. When the request is granted, the appropriate crosspoint is closed and data is transmitted from the input module to the output module. Because the crossbar can simultaneously connect many inputs to many outputs, this organization provides many times the bandwidth of a bus-based switch.
Despite their increased bandwidth, crossbar-based routers still lack the scalability and bandwidth needed for an IP backbone router. The fan-out and fan-in required by the crossbar connection, where every input is connected to every output, limits the number of ports to typically no more than 32. This limited scalability also results in limited bandwidth. For example, a state-of-the-art crossbar might operate 32 different 32-bit channels simultaneously at 200 MHz giving a peak bandwidth of 200 Gb/s. This is still short of the bandwidth demanded by a backbone IP router.
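The bus and crossbar bandwidth figures quoted above follow from multiplying the number of simultaneous channels by the channel width and the clock rate. The short sketch below reproduces that arithmetic; the function name `peak_bandwidth_gbps` is an illustrative assumption and does not come from the cited patents.

```python
def peak_bandwidth_gbps(channels: int, width_bits: int, clock_mhz: float) -> float:
    """Peak bandwidth in Gb/s = channels * width (bits) * clock (MHz) / 1000."""
    return channels * width_bits * clock_mhz / 1000.0


# Single shared bus: one 128-bit datapath clocked at 50 MHz.
bus_gbps = peak_bandwidth_gbps(channels=1, width_bits=128, clock_mhz=50)

# Crossbar: 32 channels, each 32 bits wide, clocked at 200 MHz.
crossbar_gbps = peak_bandwidth_gbps(channels=32, width_bits=32, clock_mhz=200)

print(f"bus: {bus_gbps:.1f} Gb/s, crossbar: {crossbar_gbps:.1f} Gb/s")
```

The crossbar product works out to 204.8 Gb/s, which the text rounds to 200 Gb/s. Both values remain well below one Terabit per second.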
FIG. 1 is a schematic block diagram illustrating a conventional packet switch (prior art). As noted in U.S. Pat. No. 6,275,491 (Prasad et al.), the architecture of conventional fast packet switches may be considered, at a high level, as a number of inter-communicating processing blocks. In this switch, ports P0 through Pn are in communication with various nodes, which may be computers or other switches (not shown). Each of the ports receives data over an incoming link, and transmits data over an outgoing link. Each of the ports is coupled to switch fabric F, which effects the routing of a message from one of the input ports to the one of n output ports associated with the downstream node on the path to the destination of the packet. The switch has sufficient capability to divide the packet into slices (when on the input end) and to reconstruct slices into a packet (when on the output end). Arbiter A is provided to control the queuing of packets into and out of switch fabric F, and to control the routing operation of switch fabric F accordingly.
While the high-level architecture of fast packet switches may be substantially common, different architectural approaches are used in the implementation of the fast packet switch. These approaches determine the location (input, output, or both) and depth of cell queues or buffers, and also the type of routing used within the switch fabric. For example, one architecture may operate by the input ports forwarding each received cell immediately to switch fabric F, which transfers cells at its input interfaces to its output interfaces in a time-division multiplexed fashion; on the output side, each cell that is output from switch fabric F is appended to a FIFO queue at its addressed output port. Another architecture may utilize input queues at the input ports, with arbiter A controlling the order in which cells are applied from the input queues to switch fabric F, which operates in a crossbar mode. Another architecture may utilize both input and output queues at the ports, with switch fabric F and arbiter A operating as a multistage interconnection network. These and other various architectures are known in the field of fast packet switching.
Also as is well known in the art, actual communication traffic is neither uniform nor independent; instead, real traffic is relatively bursty, particularly in the communication of data and compressed video. As such, traffic management algorithms are often utilized in fast packet switching to manage the operation of the switch and to optimize switch performance. Examples of well-known traffic management algorithms include traffic shaping, flow control, and scheduling.
As noted in U.S. Pat. No. 6,073,199 (Cohen et al.), arbiters are used in computer systems to control access to a common bus used by multiple devices. Arbiters typically use arbitration schemes such as fixed priority, round robin, or rotating priority. A fixed priority algorithm assigns a priority to each device on the bus and grants usage based upon the relative priority of the devices making the requests. The round robin scheme has a fixed order and grants bus usage based upon the requestor order and the current user of the bus. The rotating priority scheme changes the priority of requestors based on a fixed algorithm.
The goal of all arbitration schemes is to ensure fair access to the shared resource, and to efficiently grant the resource to the correct requestor. The fixed priority scheme is unfair because a high priority requestor can consume all of the shared resource, starving the lower priority requestors. The round robin scheme is inefficient because multiple clocks may be required to determine which requestor should be granted the resource. Also, round robin schemes have a fixed grant pattern that can result in starvation of particular requestors if request patterns match the round robin grant pattern. The efficiency and fairness of rotating priority schemes vary with the algorithm chosen to update device priority.
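The contrast between fixed priority and round robin arbitration can be illustrated with a minimal software sketch. It assumes that index 0 is the highest priority requestor and that every requestor asserts its request on every cycle; the function names are hypothetical and are not taken from the cited patents.

```python
from typing import List, Optional, Sequence


def fixed_priority_grant(requests: Sequence[bool]) -> Optional[int]:
    """Grant the lowest-indexed active requestor (index 0 = highest priority)."""
    for i, requesting in enumerate(requests):
        if requesting:
            return i
    return None


def round_robin_grant(requests: Sequence[bool], last: int) -> Optional[int]:
    """Grant the first active requestor after `last` in a fixed circular order."""
    n = len(requests)
    for offset in range(1, n + 1):
        i = (last + offset) % n
        if requests[i]:
            return i
    return None


# Assumed workload: all three requestors ask for the resource on every cycle.
requests = [True, True, True]

fixed_grants: List[Optional[int]] = [fixed_priority_grant(requests) for _ in range(6)]

rr_grants: List[Optional[int]] = []
last = len(requests) - 1  # start so that requestor 0 is examined first
for _ in range(6):
    last = round_robin_grant(requests, last)
    rr_grants.append(last)

print("fixed priority:", fixed_grants)
print("round robin:   ", rr_grants)
```

Under this workload the fixed priority arbiter grants requestor 0 on every cycle, so requestors 1 and 2 are starved, while the round robin arbiter cycles through all three. The sequential scan in `round_robin_grant` loosely corresponds to the multiple clocks that a round robin decision may require.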
As noted in U.S. Pat. No. 6,101,193 (Ohba), deficit round robin (DRR) uses packet queues provided in correspondence to flows in conjunction with an active list which holds flow IDs of (active) packet queues in which packets are currently queued and a counter which indicates a number of bytes that are currently transmittable by each flow.
Namely, in the DRR, when a flow could not output its next packet in the previous round because the length of that packet was greater than the counter value, the number of bytes that were allowed to be output but were not actually output in the previous round is added to the counter value for the next round, so that more bytes than the weight can be output in the next round.
According to this DRR, the time required for the packet output unit to select the next output packet is constant regardless of the number of flows. In addition, it is possible to guarantee the maximum value of the delay when the flow input traffic obeys the reported traffic parameter, as well as fairness on a time scale longer than one round. However, in the DRR, once a flow is selected, packets continue to be selected from that same flow until it is no longer possible to decrement the counter or the packet queue becomes empty, so that the burstiness of each flow becomes large in a time period shorter than one round, and the fairness characteristic is degraded. This tendency becomes more prominent when the maximum packet length becomes longer, that is, when the minimum value of the weight becomes larger.
Thus the DRR, which is one example of the conventional weighted fair queuing algorithms, is associated with the problem that the fairness characteristic is degraded because the burstiness for each flow becomes large in a time period shorter than one round.
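The conventional DRR behavior described above can be sketched as follows. This is a minimal illustration, not code from the cited patent: the function name `drr_schedule`, the flow names, the packet lengths, and the weights are all assumed values. Each active flow's counter grows by its weight once per round, and packets are output from that flow until the head packet no longer fits in the counter or the queue is empty.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def drr_schedule(queues: Dict[str, Deque[int]],
                 weights: Dict[str, int]) -> List[Tuple[str, int]]:
    """Drain per-flow queues of packet lengths (bytes) in DRR order.

    Returns the (flow, packet_length) output sequence. The queues are consumed.
    Weights must be positive so that every queued packet is eventually sent.
    """
    counters = {flow: 0 for flow in queues}               # transmittable bytes
    active = deque(flow for flow, q in queues.items() if q)  # active list
    sent: List[Tuple[str, int]] = []
    while active:
        flow = active.popleft()
        counters[flow] += weights[flow]                   # one weight per round
        q = queues[flow]
        # Keep serving the same flow while its head packet fits in the counter.
        while q and q[0] <= counters[flow]:
            length = q.popleft()
            counters[flow] -= length
            sent.append((flow, length))
        if q:
            active.append(flow)                           # unused bytes carry over
        else:
            counters[flow] = 0                            # an empty flow keeps no credit
    return sent


# Assumed example: three flows with equal weights of 500 bytes per round.
queues = {
    "A": deque([300, 300]),
    "B": deque([700]),
    "C": deque([200, 200, 200]),
}
weights = {"A": 500, "B": 500, "C": 500}
order = drr_schedule(queues, weights)
print(order)
```

In this example flow B cannot send its 700-byte packet in the first round, so its unused 500 bytes carry over and the packet is sent in the second round. Flow C outputs two packets back to back in the first round, which is the within-round burstiness noted above.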
It would be advantageous if a scheduling algorithm could be devised for the efficient transfer of information packets having a variable length, or variable number of cells.
It would be advantageous if variable length information packets could be scheduled for transfer across a switch with a minimum of overhead devoted to the scheduling decision process.
It would be advantageous if a scheduling algorithm could be applied to switch scheduling that was both fair and efficient.
It would be advantageous if the DRR scheduling algorithm could be implemented in switch scheduling with a minimum of sequential processing.
It would be advantageous if information packet DRR priority decisions could be made simultaneously.