1. Field of the Invention
The invention relates to switching fabrics, and more particularly to a flow control method and apparatus for fairly allocating the bandwidth capacity of the fabric to multiple input flows.
2. References
The following U.S. patents are incorporated herein by reference in their entirety: U.S. Pat. Nos. 6,388,992; 5,903,735; 5,280,470; 5,930,234; and 5,455,826.
3. Description of Related Art
A switch fabric for a data network is a device that allows data from any of several input ports to be communicated switchably to any of several output ports. Early data networks were based on circuit switching, in which fixed routes were established through the fabric for each session. The peak bandwidth demand of each session was allocated to the route for the entire duration of the session. When session traffic was bursty, however, circuit switching resulted in under-utilization of network resources during the time between bursts. Packet switching was developed to overcome this disadvantage, thus improving the network utilization for bursty traffic.
Packet switched networks dynamically allocate bandwidth according to demand. By segmenting the input flow of information into units called “packets,” and processing each packet as a self-contained unit, packet switched networks allow scheduling of network resources on a per-packet basis. This enables multiple sessions to share the fabric resources dynamically by allowing their packets to be interleaved across the fabric. Typically each packet includes a header indicating its destination port, and the fabric includes a routing mechanism for determining a route through the fabric, on a per-packet basis.
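As an illustration only, per-packet handling of this kind might be sketched as follows. The names `Packet`, `interleave`, and `route` are hypothetical and do not come from any referenced patent; the sketch assumes the header carries nothing more than a destination port number and that sessions are interleaved in simple round-robin order.

```python
from collections import deque
from dataclasses import dataclass
from itertools import zip_longest


@dataclass
class Packet:
    session: str    # which session produced the packet (illustrative only)
    dest_port: int  # header field: destination output port
    payload: bytes


def interleave(*sessions):
    """Yield packets from several sessions in round-robin order (assumed policy)."""
    for group in zip_longest(*sessions):
        for pkt in group:
            if pkt is not None:
                yield pkt


def route(packets, num_ports):
    """Deliver each packet to the output queue named in its own header."""
    outputs = [deque() for _ in range(num_ports)]
    for pkt in packets:
        outputs[pkt.dest_port].append(pkt)
    return outputs


session_a = [Packet("A", 0, b"a1"), Packet("A", 0, b"a2")]
session_b = [Packet("B", 0, b"b1")]
outputs = route(interleave(session_a, session_b), num_ports=2)
```

Here the packets of sessions A and B share output port 0, with the packet of session B carried between the two packets of session A rather than waiting for session A to finish.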
Small switching fabrics can be constructed from crossbar switches, in which a scheduler configures the crossbar in each time slot to connect a set of input ports to a set of output ports in a one-to-one mapping (or one-to-many in the multicast case). The crossbar scheduler decides what the mapping between inputs and outputs should be in a particular time slot. Both input port constraints and output port constraints are taken into account by the scheduler: input ports contending for the same output port share the bandwidth of that output port in some manner that the crossbar scheduling algorithm considers fair, and output port usage is optimized by attempting to have every output port carry a cell in each time slot for which any input port has a cell destined for that output port.
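The one-to-one constraint on a unicast crossbar configuration can be sketched as below. This is a minimal illustration under stated assumptions, not any particular scheduler: the function names are hypothetical, a configuration is assumed to be a dictionary from input port to output port, and the greedy rule (scan outputs in order, take the lowest-numbered free input) is chosen only for brevity and makes no attempt at fairness.

```python
def is_valid_unicast_config(mapping):
    """Return True if no output port is driven by more than one input port.

    mapping: dict of input_port -> output_port for a single time slot.
    Each input appears at most once because it is a dict key.
    """
    outputs = list(mapping.values())
    return len(outputs) == len(set(outputs))


def greedy_config(has_cell):
    """Build one time slot's configuration with a simple greedy pass.

    has_cell[i][j] is True when input i holds a cell destined for output j.
    For each output, the first still-free input with a cell for it is chosen.
    """
    mapping = {}
    used_inputs = set()
    for j in range(len(has_cell[0])):
        for i in range(len(has_cell)):
            if has_cell[i][j] and i not in used_inputs:
                mapping[i] = j
                used_inputs.add(i)
                break
    return mapping


config_a = greedy_config([[False, True], [True, False]])
config_b = greedy_config([[True, True], [True, False]])
```

In the second call, output 1 stays idle even though input 0 holds a cell for it, because input 0 has already been paired with output 0. Pairing input 1 with output 0 instead would have kept both outputs busy, which is the kind of outcome a better scheduling algorithm tries to reach.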
Crossbar scheduling algorithms are often designed to mimic maximally weighted matching, a discipline that tries to maximize a particular objective function in each time slot by making more of an effort to grant requests from inputs which have longer queues. Computing a strict maximally weighted matching in every time slot is a very hard problem, though, so crossbar scheduling algorithms have been devised that only approximate maximally weighted matching. In one such crossbar scheduling algorithm, known as ISLIP, inputs keep virtual output queues (VOQs) in which cells bound for different output ports wait at the inputs for transfer. Before each time slot begins, each input submits to a central scheduler or arbiter a request to connect with a certain output. Part of the ISLIP algorithm is a procedure by which the input decides which output to request a connection to, among all of the outputs for which it has cells to send. The central scheduler collects all of the requests and chooses which to grant. There can be several rounds of requests and grants, so that an input that did not get its first request granted can submit a request for connection to a different output to which it has data to send.
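The request and grant rounds described above might be sketched as follows. This is not the ISLIP procedure itself: the selection rules here are assumptions made for illustration (each input requests the free output for which its VOQ is longest, and each output grants the requester with the longest VOQ, with ties going to the lower port number), and the name `schedule_slot` and its parameters are hypothetical.

```python
def schedule_slot(voq, max_rounds=3):
    """Approximate a weighted matching with rounds of requests and grants.

    voq[i][j] is the number of cells queued at input i for output j.
    Returns a dict of input -> output for one time slot.
    """
    n_in, n_out = len(voq), len(voq[0])
    match = {}     # input -> output granted so far
    taken = set()  # outputs already granted
    for _ in range(max_rounds):
        requests = {}  # output -> list of requesting inputs
        for i in range(n_in):
            if i in match:
                continue
            candidates = [j for j in range(n_out)
                          if voq[i][j] > 0 and j not in taken]
            if not candidates:
                continue
            # Assumed rule: request the output with the longest VOQ.
            j = max(candidates, key=lambda j: (voq[i][j], -j))
            requests.setdefault(j, []).append(i)
        if not requests:
            break
        for j, inputs in requests.items():
            # Assumed rule: grant the requester with the longest VOQ.
            winner = max(inputs, key=lambda i: (voq[i][j], -i))
            match[winner] = j
            taken.add(j)
    return match


voq = [[5, 2],
       [4, 1]]
one_round = schedule_slot(voq, max_rounds=1)
two_rounds = schedule_slot(voq, max_rounds=2)
```

With a single round, both inputs request output 0 and only input 0 is granted. A second round lets input 1 request output 1 instead, so both outputs carry a cell in the time slot.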
It can be seen that a crossbar scheduler needs to be centralized, both in the sense that it needs to coordinate the functioning of all input ports and in the sense that it needs to coordinate the functioning of all output ports. It needs to coordinate all the input ports because it needs to ensure that no more than one input port is transmitting a cell to the same output port in the same time slot; crossbar fabrics (at least pure ones) cannot tolerate such a situation. It needs to coordinate all output ports in order to maximize utilization of the bandwidth capacity of each one. In short, a crossbar scheduler needs to be aware of the global state of the overall system in each time slot in order to do its job well.
The centralized nature of crossbar schedulers tends to prevent the scaling of crossbar switching fabrics to large numbers of inputs and outputs. At least two factors limit the scalability of crossbar switching fabrics. First, as the number of input and output ports grows, and channel data rates increase, it becomes increasingly difficult to design logic circuitry that is fast enough to make all the required calculations in time for each time slot. Since the crossbar scheduler must be centrally implemented, its complexity usually increases more than linearly with the number of input and output ports that the algorithm is required to handle. Second, as the number of input and output ports grows, per-chip or per-module pin count limitations begin to require the spreading of the fabric over more than one chip or module. In this case it also becomes increasingly difficult to provide sufficient control signal capacity to transmit the requests and grants between the centralized scheduler and all the various inputs.
As the need for larger and larger switching fabrics increases, therefore, it becomes more and more important to find a mechanism for controlling the transmission of data packets from switch inputs to switch outputs, in a manner that scales much more easily to larger fabrics.