1. Field of the Invention
The present invention is related to providing bandwidth and deterministic delay guarantees for data flows in a communications network. More particularly, the bandwidth and deterministic delay guarantees are provided in a crossbar switch.
2. Description of the Related Art
Digital Communications Systems
In digital communications systems, data is routinely transmitted between many processing devices over some sort of network. For example, in computer networks, data is typically sent from one computer to another computer through network communications devices such as hubs, routers, bridges and/or switches interconnected by transmission media or data links. Viewed from the outside, the network communications devices have input and output ports that send and receive data to and from the data links. Within a single network device, data is accepted at input ports, transferred across a switching fabric internal to the network device, and received at output ports for transmission onto the next data link.
Network communication devices generally implement one of four classes of data switching architecture, distinguished by the location of the buffers: output-buffered (OB), shared memory, input-buffered (IB), and combined input-output buffered (CIOB).
In output-buffered and shared memory network devices, packets arriving at an input port are placed into an output buffer corresponding to an output port determined by an address of the packet. In the output-buffered switch, the buffers are allocated at the output port proper, whereas in the shared memory switch, the buffers are allocated in the switch fabric.
Advantageously, output-buffered and shared memory network devices can use up to the full bandwidth of outbound data links because of the immediate forwarding of packets into output buffers. The packets are fed to the output data links as fast as the links can accept the packets. Also, output-buffered and shared memory network devices are typically considered very well suited for providing near-optimal throughput and delay performance.
A disadvantage of output-buffered and shared memory network devices is that when the switch size and link speeds increase, the switch fabric speed must increase proportionally in order to handle the combined data rates of all input ports being switched to a single output port. Also, memories used as output buffers to store packets must be very fast due to increased switch fabric speeds. Specifically, in an output-buffered or shared memory network device having N input ports, each receiving data at M bits per second, a data transmission rate of N*M is needed for the switch fabric to ensure that data is not lost. Similarly, the memory speed of the buffer system in both devices should also be as fast as N*M, since a buffer corresponding to an output port must be capable of accepting data from all inputs simultaneously. As the switch size and the link speeds increase, the cost of output-buffered and shared memory network devices also increases due to the costs inherent in the high speed memory requirements. Thus, current output-buffered and shared memory network devices are limited in size by memory, speed, technology and cost.
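The N*M requirement can be made concrete with a small calculation. The following is a minimal sketch; the function name and the example figures (32 ports at 10 Gb/s each) are illustrative assumptions, not values taken from this description.

```python
def required_rates(num_inputs: int, link_rate_bps: int) -> dict:
    """Worst-case rates for an output-buffered or shared memory device.

    Assumes every input port may send to the same output port at once,
    so both the fabric and the output memory must absorb N*M bits/s.
    """
    aggregate_bps = num_inputs * link_rate_bps
    return {"fabric_bps": aggregate_bps, "output_memory_bps": aggregate_bps}


# Hypothetical example: N = 32 input ports, M = 10 Gb/s per port.
rates = required_rates(num_inputs=32, link_rate_bps=10_000_000_000)
print(rates["fabric_bps"])  # 320000000000, i.e. 320 Gb/s
```

Doubling either the port count or the link rate doubles the memory speed that each output buffer must sustain, which is the scaling problem described above.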
These issues have generated renewed interest in switches with lower cost, such as input-buffered switches. One of the most popular interconnection networks for building non-blocking input-buffered switches is the crossbar. An input-buffered crossbar with speedup of one has the crossbar fabric running at a speed equal to the link rate. This implies that in a crossbar switch with speedup of one, at most one packet can leave a given input port at a given time, and at most one packet can enter any output at any given time. All buffering in such a crossbar is located at the input ports of the switch. If each input port maintains a single FIFO queue, however, packets suffer from head-of-line (HOL) blocking, which limits the maximum achievable throughput. To eliminate HOL blocking, virtual output queues (VOQs) have been proposed. Input ports with VOQs have a bank of queues, with one queue per output port. Packets are stored in random access buffers at the input ports. In practice, however, only pointers to the data need to be stored in the respective VOQs.
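The difference between a single FIFO queue and a bank of VOQs can be sketched as follows. This is an illustration only: the class and function names are hypothetical, and packet references are modeled as plain strings standing in for pointers into the input buffer.

```python
from collections import deque


class InputPort:
    """Input port holding one virtual output queue (VOQ) per output port."""

    def __init__(self, num_outputs: int):
        self.voqs = [deque() for _ in range(num_outputs)]

    def enqueue(self, packet_ref: str, output: int) -> None:
        # Only a reference to the buffered packet is queued.
        self.voqs[output].append(packet_ref)

    def eligible_outputs(self) -> list:
        # Every output with a non-empty VOQ can be requested.
        return [o for o, q in enumerate(self.voqs) if q]


def fifo_eligible_outputs(fifo: deque) -> list:
    """With a single FIFO, only the head packet's output can be requested."""
    return [fifo[0][1]] if fifo else []


# Three packets arrive in order, destined to outputs 2, 0 and 1.
arrivals = [("p0", 2), ("p1", 0), ("p2", 1)]
fifo = deque(arrivals)
port = InputPort(num_outputs=3)
for ref, out in arrivals:
    port.enqueue(ref, out)

print(fifo_eligible_outputs(fifo))  # [2]
print(port.eligible_outputs())      # [0, 1, 2]
```

If output 2 is busy, the FIFO port can send nothing even though outputs 0 and 1 may be free, whereas the VOQ port can still offer packets for those outputs.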
Since there could be contention at the input and output ports if more than one input port has data for the same output port, an arbitration algorithm is needed to schedule packets between the various input and output ports. A paper by N. McKeown, V. Anantharam and J. Walrand, entitled “Achieving 100% Throughput in an Input-Queued Switch,” Proc. INFOCOM, March 1996, pp. 296-302, showed that an input-buffered network device with VOQs supposedly can provide 100% throughput using a weighted maximum bipartite matching algorithm (defined therein). However, the complexity of the best known weighted maximum matching algorithm is too high for a high speed implementation.
Over the years, a number of maximal matching algorithms have been proposed. Details of these algorithms and the definition of maximal matching may be had with reference to the following papers: T. Anderson, S. Owicki, J. Saxe, C. Thacker, “High Speed Switch Scheduling for Local Area Networks,” Proc. Fifth Intl. Conf. On Architectural Support for Programming Languages and Operating Systems, October 1992, pp. 98-110; and N. McKeown, “Scheduling Algorithms for Input-Queued Cell Switches,” Ph.D. Thesis, Univ. of California, Berkeley, May 1995. However, none of the disclosed algorithms matches the performance of an output-buffered network device.
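For orientation, a maximal matching is one to which no further input-output pair can be added, which is a weaker condition than a maximum matching. The sketch below uses a plain greedy pass to build one. It is not any of the algorithms in the cited papers, and the function names and request sets are hypothetical.

```python
def greedy_maximal_matching(requests: list) -> dict:
    """Build a maximal matching by a single greedy pass over the inputs.

    requests[i] is the set of outputs for which input i has queued packets.
    Returns a dict mapping each matched input to its output.
    """
    matched_outputs = set()
    matching = {}
    for i, outs in enumerate(requests):
        for o in sorted(outs):
            if o not in matched_outputs:
                matching[i] = o
                matched_outputs.add(o)
                break
    return matching


def is_maximal(requests: list, matching: dict) -> bool:
    """True if no requested (input, output) pair has both ends unmatched."""
    used = set(matching.values())
    return all(
        i in matching or o in used
        for i, outs in enumerate(requests)
        for o in outs
    )


# Input 0 wants outputs 0 and 1, input 1 wants output 0, input 2 wants output 2.
requests = [{0, 1}, {0}, {2}]
matching = greedy_maximal_matching(requests)
print(matching)                        # {0: 0, 2: 2}
print(is_maximal(requests, matching))  # True
```

Here the greedy result matches two pairs and cannot be extended, yet a maximum matching (input 0 to output 1, input 1 to output 0, input 2 to output 2) would match three. This gap is one intuition for why maximal matching schedulers can fall short of the performance discussed above.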
Increasing the speedup of the switch fabric has also been proposed as one of the ways to improve the performance of an input-buffered switch. However, when the switch fabric has a higher bandwidth than the links, buffering is also required at the output ports. Thus, a combined input-buffered and output-buffered network device, i.e., a CIOB (Combined Input and Output Buffered) network device, is required. It has been shown that a CIOB switch is more suitable for providing throughput and delay guarantees than an input-buffered crossbar switch without speedup.
Integrated Services Networks
In the field of Integrated Services Networks, the importance of maintaining Quality of Service (QoS) guarantees for individual traffic streams (or flows) is generally recognized. Thus, such capability continues to be the subject of much research and development. Of particular interest for a system providing guaranteed flows are the guarantees associated with bandwidth and delay properties. These guarantees must be provided to all flows abiding by their service contract terms negotiated at connection setup, even in the presence of other misbehaving flows, i.e., those flows not abiding by their service contract terms.
Different methods have been developed to provide such guarantees in non-blocking switch architectures such as output-buffered or shared memory switches. Several algorithms providing a wide range of delay guarantees for non-blocking architectures have been disclosed in the literature. See, for example, A. Parekh, “A Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks”, MIT, Ph.D. dissertation, June 1994; J. Bennett and H. Zhang, “WF2Q—Worst-case Fair Weighted Fair Queuing”, Proc. IEEE INFOCOM'96; D. Stiliadis and A. Varma, “Frame-Based Fair Queuing: A New Traffic Scheduling Algorithm for Packet Switch Networks”, Proc. IEEE INFOCOM '96; L. Zhang, “A New Architecture for Packet Switched Network Protocols,” Massachusetts Institute of Technology, Ph.D. Dissertation, July 1989; and A. Charny, “Hierarchical Relative Error Scheduler: An Efficient Traffic Shaper for Packet Switching Networks,” Proc. NOSSDAV '97, May 1997, pp. 283-294.
Schedulers capable of providing bandwidth and delay guarantees in non-blocking architectures are commonly referred to as “QoS-capable schedulers”.
Typically, as described above, output-buffered or shared memory non-blocking architectures require the existence of high-speed memory. For example, an output-buffered switch requires that the speed of the memory at each output must be equal to the total speed of all inputs. Unfortunately, the memory speed available with current technology has not kept pace with the rapid growth in demand for providing large-scale integrated services networks. Because there is a growing demand for large switches with total input capacity on the order of tens and hundreds of Gb/s, building an output-buffered switch at this speed has become a daunting task given the present state of memory technology. Similar issues arise with shared memory switches as well.
However, even given the work already done, providing bandwidth and delay guarantees in an input-queued crossbar switch remains a significant challenge.
N. McKeown, V. Anantharam and J. Walrand, in “Achieving 100% Throughput in an Input-Queued Switch,” cited above, describe several algorithms which are based on weighted maximum bipartite matching (defined therein) and which are supposedly capable of providing 100% throughput in an input-buffered switch. Unfortunately, the algorithms described there are too complex for real-time implementations and the nature of the delay guarantees provided by these algorithms remains largely unknown.
D. Stiliadis and A. Varma, “Providing Bandwidth Guarantees in an Input-Buffered Crossbar Switch,” Proc. IEEE INFOCOM '95, April 1995, pp. 960-968, suggest that bandwidth guarantees in an input-buffered crossbar switch may be realized using an algorithm referred to as Weighted Probabilistic Iterative Matching (WPIM), which is essentially a weighted version of the algorithm described in Anderson et al. Although the WPIM algorithm is more suitable for hardware implementations than that described by McKeown et al., it does not appear to provide bandwidth guarantees.
One known method of providing bandwidth and delay guarantees in an input-buffered crossbar architecture uses statically computed schedule tables, an example of which is described in Anderson et al. There are, however, several significant limitations associated with this approach. First, the computation of schedule tables is extremely complex and time-consuming. Therefore, it can only be performed at connection-setup time. Adding a new flow or changing the rates of the existing flows is quite difficult and time-consuming, since such modifications can require re-computation of the whole table. Without such re-computation, it is frequently impossible to provide delay and even bandwidth guarantees even for a feasible rate assignment. Consequently, these table updates tend to be performed less frequently than may be desired. Second, per-packet delay guarantees of the existing flows can be temporarily violated due to such re-computation. Third, there exists the necessity to constrain the supported rates to a rather coarse rate granularity and to restrict the smallest supported rate in order to limit the size of the schedule table. All of these limitations serve to substantially reduce the flexibility of providing QoS in this approach.
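The third limitation, the link between table size and rate granularity, can be illustrated with a short calculation. The sketch assumes a simple model that is not stated in this description: the table is a repeating frame of equal slots, each slot carrying an equal share of the link rate. The function names and figures are hypothetical.

```python
def rate_granularity_bps(link_rate_bps: int, table_slots: int) -> int:
    """Rate represented by one slot of the schedule table (assumed model)."""
    return link_rate_bps // table_slots


def slots_needed(rate_bps: int, link_rate_bps: int, table_slots: int) -> int:
    """Smallest whole number of slots whose combined rate covers rate_bps."""
    # Integer ceiling of rate_bps * table_slots / link_rate_bps.
    return -(-rate_bps * table_slots // link_rate_bps)


# Hypothetical example: a 1 Gb/s link and a 1000-slot table.
link = 1_000_000_000
slots = 1000
granularity = rate_granularity_bps(link, slots)
needed = slots_needed(1_500_000, link, slots)
print(granularity)           # 1000000, i.e. 1 Mb/s per slot
print(needed)                # 2
print(needed * granularity)  # 2000000, i.e. 2 Mb/s reserved for a 1.5 Mb/s flow
```

Under this model a finer granularity or a smaller minimum rate requires a proportionally larger table, which in turn makes each re-computation more expensive.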
Therefore, at the current time, no satisfactory method for providing flexible bandwidth and delay guarantees in a crossbar switch with speedup of one is known.
As mentioned above, a number of recent studies have demonstrated that increasing the speedup factor in a crossbar switch (thus making it a CIOB switch) may make it possible to provide better throughput and delay guarantees.
In one approach, several algorithms have been developed for emulating a non-blocking output-buffered switch using an input-buffered crossbar with speedup independent of the size of the switch. Emulation of an output-buffered switch with a CIOB switch means that given identical input traffic patterns, the two switches produce identical output traffic patterns. The first such algorithm, called MUCFA for “Most Urgent Cell First Algorithm”, emulates an output-buffered switch with a single FIFO queue at the output using a CIOB switch with speedup of four. It was described in B. Prabhakar and N. McKeown, “On the Speedup Required for Combined Input and Output Queued Switching,” Computer Systems Lab. Technical Report CSL-TR-97-738, Stanford University. The MUCFA arbitration algorithm requires the assignment of priorities to cells as they enter the virtual output queues of input buffers at each input port. Generally, MUCFA selects the cells with the highest urgency, typically the oldest, for connections to output ports first, hence the name “most urgent cell first”. The MUCFA algorithm is difficult to implement in practice due to the maintenance required in assigning and updating the priorities of each cell queued at the input ports.
However, none of the algorithms discussed above that emulate an output-buffered switch with a FIFO at the output is capable of providing bandwidth and delay guarantees for flows abiding by their contracted bandwidth in the presence of misbehaving flows. This is because an output-buffered switch with a FIFO at the output is not by itself capable of providing such guarantees. Hence, additional mechanisms are required to provide such guarantees.
One approach to achieving such guarantees is to use a CIOB switch to emulate an output-buffered switch that has some QoS-capable queuing and scheduling mechanism at the output ports. For example, an output-buffered switch with a Weighted Fair Queuing (WFQ) scheduler at the output (and per-flow queues implied by the WFQ scheduler) is known to provide high-quality bandwidth and delay guarantees.
It was shown recently that it is theoretically possible to emulate an output-buffered switch with a broad class of schedulers at the output, including WFQ. This means that, in principle, it is possible to provide the same bandwidth and delay guarantees in a crossbar switch as in the output-buffered switches with a WFQ scheduler. Unfortunately, the algorithm described in this work is very complex and therefore is very difficult to implement in practice. No implementable algorithms achieving such emulation are currently known.
Another approach is to provide bandwidth and delay guarantees in a CIOB switch without emulating any output-buffered switch at all. Several implementable methods for providing bandwidth and delay guarantees in crossbar switches with speedup have been described. While these algorithms ensure bandwidth and delay guarantees, they do not have the work-conserving property, defined as follows: the switch is work-conserving if the output port is never idle when there is at least one packet in the switch destined to this output port. The work-conserving property is useful because it ensures that each output operates at its full capacity, and therefore no bandwidth is wasted. In particular, an output-buffered switch with a FIFO queue is work-conserving. This implies that any CIOB switch emulating an output-buffered switch is also work-conserving. It follows that MUCFA provides the work-conserving property.
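The work-conserving property defined above can be expressed as a simple check over a snapshot of the switch. The sketch below is illustrative only; the function name and the per-output snapshot representation are assumptions made for the example.

```python
def work_conservation_violations(output_busy: list, packets_destined: list) -> list:
    """Return the outputs that are idle while packets destined to them wait.

    output_busy[o] is True if output o is transmitting in this time slot.
    packets_destined[o] counts packets anywhere in the switch (input or
    output buffers) destined to output o. An empty result means this
    snapshot is consistent with the work-conserving property.
    """
    return [
        o
        for o, busy in enumerate(output_busy)
        if not busy and packets_destined[o] > 0
    ]


# Output 1 is idle with nothing to send (allowed); output 2 is idle
# although two packets for it are still in the switch (a violation).
violations = work_conservation_violations(
    output_busy=[True, False, False],
    packets_destined=[3, 0, 2],
)
print(violations)  # [2]
```

A switch is work-conserving only if this check returns an empty list in every time slot, not merely in a single snapshot.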
Thus there is a need for simple work-conserving algorithms that will provide bandwidth and delay guarantees as well.