Networks are widely used to allow computers to communicate with one another. That is, a source computer uses a network to pass data to a destination computer. The data is typically divided into cells of information.
Networks originally evolved by relying upon a single link to communicate between computers. Examples of the single link architecture include Ethernet networks and Token Ring networks. The problem with a single link architecture is that it has limited data throughput.
In view of the bandwidth problems of a single link architecture, there is an increasing interest in arbitrary topology cell-based local area networks, such as Asynchronous Transfer Mode (ATM) networks. In these networks, computers are connected together by an arbitrary graph of communication links and switches. Arbitrary topology networks offer a number of potential advantages, including: (1) aggregate throughput that can be much larger than that of a single link, (2) the ability to add throughput incrementally as the workload changes by simply adding extra links and switches, (3) improved fault tolerance by allowing redundant paths between hosts, and (4) reduced latency because control over the entire network is not needed to insert data.
To realize the potential advantages of arbitrary topology networks, a high performance switch is needed to take a cell arriving on an input link and quickly deliver it to the appropriate output link. A switch has three components: a physical switch, a scheduling mechanism to arbitrate when cells arrive on different inputs destined for the same output, and a queuing mechanism at inputs or outputs to hold those cells that lose the arbitration.
The queuing mechanism is used to store a set of incoming cells. Each incoming cell contains data and an identifier that indicates which output it is destined for. In the case of an ATM network, each cell has 48 bytes of data and 5 bytes of identification information. The physical switch can be implemented as a crossbar switch which allows any input to be routed to any output. As known in the art, a crossbar switch may be implemented as a matrix of transistors.
The construction of the physical switch and queuing mechanism is straightforward. However, the construction of a scheduling mechanism is rather complex and represents the performance bottleneck in most high throughput switches.
It is the job of the scheduling mechanism to identify a conflict-free match between inputs and the outputs of the switch. That is, the input cells at the input of the switch must be matched to particular outputs of the switch. Each input is connected to at most one output and each output is connected to at most one input.
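The conflict-free condition above can be stated as a short check. The following is a minimal sketch, not part of the described switch: the function name `is_conflict_free` and the representation of a match as a list of (input, output) pairs are assumptions made for illustration.

```python
def is_conflict_free(connections):
    """Return True if no input and no output appears in more than one connection.

    `connections` is a list of (input, output) pairs; this representation
    is a hypothetical one chosen for the example.
    """
    inputs = [inp for inp, _ in connections]
    outputs = [out for _, out in connections]
    # Each input may be connected to at most one output, and vice versa.
    return len(set(inputs)) == len(inputs) and len(set(outputs)) == len(outputs)


# Inputs 0 and 1 use distinct outputs, so this match is conflict-free.
ok = is_conflict_free([(0, 1), (1, 0)])
# Inputs 0 and 2 both claim output 1, so this match has a conflict.
clash = is_conflict_free([(0, 1), (2, 1)])
```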
Ideally, the scheduling mechanism finds the maximum number of matches between inputs and outputs. The problem with a maximum matching approach is that it is computationally expensive. In addition, such an approach can result in a continuously unserviced input-output connection under certain traffic patterns. Thus, techniques have been developed to achieve a maximal match rather than a maximum match, that is, a match to which no further connection can be added without removing an existing one. These algorithms iteratively add connections to fill in the missing connections left by a previous iteration. Because the connections made in previous iterations may not be removed, this technique does not always lead to a maximum match. However, it is possible to achieve a close approximation to the maximum for many traffic patterns.
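The difference between a maximal and a maximum match can be seen in a two-input example. The sketch below is illustrative only: the helper names and the greedy, in-order selection are assumptions, chosen because a greedy pass never removes an earlier connection, which is the property described above.

```python
def greedy_maximal_match(requests):
    """Add connections one at a time and never remove one already made.

    `requests` maps each input to the list of outputs it has cells for
    (a hypothetical representation). Returns a dict of input -> output.
    """
    match = {}
    used_outputs = set()
    for inp, outs in requests.items():
        for out in outs:
            if out not in used_outputs:
                match[inp] = out
                used_outputs.add(out)
                break
    return match


def is_maximal(requests, match):
    """True if no unmatched input still requests an unmatched output."""
    used_outputs = set(match.values())
    return not any(
        inp not in match and out not in used_outputs
        for inp, outs in requests.items()
        for out in outs
    )


# Input 0 has cells for outputs 0 and 1; input 1 has cells only for output 0.
requests = {0: [0, 1], 1: [0]}

# The greedy pass gives output 0 to input 0, which leaves input 1 unserved.
greedy = greedy_maximal_match(requests)

# A maximum match serves both inputs by sending input 0 to output 1 instead.
maximum = {0: 1, 1: 0}
```

Here the greedy result contains one connection and is maximal, since nothing can be added to it, while the maximum match contains two.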
Parallel Iterative Matching (PIM) is a successful scheduling technique that is described in U.S. Pat. No. 5,267,235, which is expressly incorporated by reference herein. PIM uses randomness to avoid starvation (a continuously unserviced input cell), and to reduce the number of iterations needed to converge on a maximal matching. PIM attempts to quickly converge on a conflict-free match in multiple iterations where each iteration consists of three phases. All inputs and outputs are initially unmatched and only those inputs and outputs not matched at the end of one iteration are eligible for matching in the next iteration. The three phases of each iteration operate in parallel on each output and input and are as follows:
1. Request. Each unmatched input sends a request to every output for which it has a queued cell.
2. Grant. If an unmatched output receives any requests, it grants to only one by randomly selecting a request uniformly over all requests.
3. Accept. If an input receives a grant, it accepts one by selecting an output among those that granted to this input.
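The three phases can be sketched in software as follows. This is a minimal, sequential illustration of the steps listed above, not the implementation of the referenced patent: the function name `pim_match`, the list-of-sets request format, and the injected random number generator are assumptions. The accept phase here picks randomly among the grants, which is also an assumption, since the text only says that one grant is selected.

```python
import random


def pim_match(requests, num_iterations, rng):
    """Sketch of Parallel Iterative Matching.

    requests[i] is the set of outputs for which input i has a queued cell.
    Returns a dict mapping each matched input to its output.
    """
    matched = {}            # input -> output
    matched_outputs = set()
    for _ in range(num_iterations):
        # Phase 1, Request: each unmatched input requests its outputs.
        # Requests to already matched outputs are dropped here, which is
        # equivalent to those outputs ignoring them in the grant phase.
        requests_by_output = {}
        for inp, outs in enumerate(requests):
            if inp in matched:
                continue
            for out in sorted(outs):
                if out not in matched_outputs:
                    requests_by_output.setdefault(out, []).append(inp)
        if not requests_by_output:
            break  # nothing left to add, so the match is maximal

        # Phase 2, Grant: each output picks one request uniformly at random.
        grants_by_input = {}
        for out, inps in requests_by_output.items():
            grants_by_input.setdefault(rng.choice(inps), []).append(out)

        # Phase 3, Accept: each input that received grants accepts one.
        for inp, outs in grants_by_input.items():
            out = rng.choice(outs)
            matched[inp] = out
            matched_outputs.add(out)
    return matched


# Four inputs and four outputs; inputs 0 and 1 contend for output 0,
# and inputs 2 and 3 contend for output 3.
requests = [{0, 1}, {0}, {2, 3}, {3}]
match = pim_match(requests, num_iterations=8, rng=random.Random(0))
```

In hardware the request, grant, and accept decisions are made in parallel at every input and output; the loops above only serialize that work for readability. Every iteration that still has requests adds at least one connection, so the loop ends with a maximal match once the iteration budget is at least the number of inputs.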
Note that in phase (2) of the PIM scheduling technique, the independent output schedulers randomly select one request among the contending requests. This has three effects. First, it can be shown that each iteration will match or eliminate, on average, at least 3/4 of the remaining possible connections, and thus the algorithm will converge to a maximal matching in O(log N) iterations. Second, it ensures that all requests will eventually be granted, so that no input queue remains continuously unserved. Third, it means that no memory or state is used to keep track of how recently a connection was made in the past. At the beginning of each cell time, the matching starts over, independently of the matches that were made in previous cell times. Not only does this simplify understanding of the algorithm, it also makes analysis of the performance straightforward, since there is no time-varying state to consider except for the occupancy of the input queues.
Unfortunately, there are a number of shortcomings associated with the PIM technique. First, it is difficult and expensive to implement at high speed because each scheduler must make a random selection among the members of a varying input set. The varying input set must be ordered in such a manner that the random selection process is successful. For example, suppose an input vector can accept ten requests and there are requests at positions 3, 7, and 10. A random selection process that draws from all ten positions may specify positions 1, 2, 4, 5, 6, 8, and 9 before a hit is made at a requesting position. In view of this situation, the random selection process must be tailored to the varying selection set so that the random value selected always corresponds to a filled position. This ordering of data results in substantial computational overhead. Thus, it would be highly desirable to provide a scheduling technique with reduced computational overhead.
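The selection problem in the ten-position example can be illustrated with two hypothetical strategies. Both function names are assumptions introduced for this sketch, and positions are numbered from 0 in the code, so the requests at positions 3, 7, and 10 appear at indices 2, 6, and 9.

```python
import random


def select_by_rejection(request_bits, rng):
    """Draw positions uniformly until one holds a request.

    Returns (position, number_of_draws). The number of draws varies
    from call to call, which is the difficulty described above.
    """
    if not any(request_bits):
        raise ValueError("no requests to select from")
    draws = 0
    while True:
        draws += 1
        pos = rng.randrange(len(request_bits))
        if request_bits[pos]:
            return pos, draws


def select_by_compaction(request_bits, rng):
    """Order the requesting positions first, then draw exactly once.

    Building the compacted list is the per-selection overhead that the
    text refers to as ordering the varying input set.
    """
    candidates = [pos for pos, bit in enumerate(request_bits) if bit]
    if not candidates:
        raise ValueError("no requests to select from")
    return candidates[rng.randrange(len(candidates))]


# Ten positions with requests at indices 2, 6, and 9.
request_bits = [pos in (2, 6, 9) for pos in range(10)]
rng = random.Random(1)
pos_a, draws = select_by_rejection(request_bits, rng)
pos_b = select_by_compaction(request_bits, rng)
```

Rejection sampling needs no ordering but takes an unpredictable number of draws, while compaction always succeeds in one draw at the cost of rebuilding the candidate list whenever the set of requests changes.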
Another problem with the PIM technique is that it does not perform well for a single iteration. Although the PIM technique will often converge to a good match after several iterations, this required convergence time affects the rate at which the switch can operate. It would be preferable to provide a technique that converges in a single iteration.