Networks are widely used to allow computers to communicate with one another. That is, a source computer uses a network to pass data to a destination computer. The data is typically divided into cells of information.
Early networks relied upon a single shared link to communicate between computers. Examples of the single link architecture include Ethernet networks and Token Ring networks. The problem with a single link architecture is that every computer shares the one link, which limits data throughput.
In view of the bandwidth problems of a single link architecture, there is an increasing interest in arbitrary topology cell-based local area networks, such as Asynchronous Transfer Mode (ATM) networks. In these networks, computers are connected together by an arbitrary graph of communication links and switches. These networks provide improved bandwidth and fault tolerance.
To realize the potential advantages of arbitrary topology networks, a high performance switch is needed to take a cell arriving on an input link and quickly deliver it to the appropriate output link. A switch has three components: a physical switch, a scheduling mechanism to arbitrate when cells arrive on different inputs destined for the same output, and a queuing mechanism at inputs or outputs to hold those cells that lose the arbitration.
The queuing mechanism is used to store a set of incoming cells. Each incoming cell contains data and an identifier indicating the output for which it is destined. In the case of an ATM network, each cell has 48 bytes of data and 5 bytes of identification information. The physical switch can be implemented as a crossbar switch, which allows any input to be routed to any output. As is known in the art, a crossbar switch may be implemented as a matrix of transistors.
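As a rough illustration of the cell format and crossbar described above, the following Python sketch models a cell as a 48-byte payload plus a destination identifier, and models the crossbar as a function that moves the head cell of each matched input queue to its output. The names `Cell`, `crossbar_transfer`, and the `output` field are hypothetical, and the 5-byte identification field is reduced to a single integer for brevity.

```python
from collections import deque
from dataclasses import dataclass

HEADER_BYTES = 5    # identification information per ATM cell
PAYLOAD_BYTES = 48  # data per ATM cell


@dataclass
class Cell:
    output: int     # hypothetical field: index of the destination output
    payload: bytes  # must hold exactly PAYLOAD_BYTES of data

    def __post_init__(self):
        if len(self.payload) != PAYLOAD_BYTES:
            raise ValueError("payload must be 48 bytes")


def crossbar_transfer(input_queues, match):
    """Move the head cell of each matched input queue to its output.

    `match` maps input index -> output index and must be conflict free,
    i.e. no output may appear more than once.
    """
    if len(set(match.values())) != len(match):
        raise ValueError("two inputs are matched to the same output")
    delivered = {}
    for i, o in match.items():
        head = input_queues[i][0]
        if head.output != o:
            raise ValueError("head cell is not destined for the matched output")
        delivered[o] = input_queues[i].popleft()
    return delivered


# Two inputs, each holding one cell for the other's index.
queues = [deque([Cell(1, bytes(PAYLOAD_BYTES))]),
          deque([Cell(0, bytes(PAYLOAD_BYTES))])]
delivered = crossbar_transfer(queues, {0: 1, 1: 0})
```

In this sketch the conflict-free `match` is supplied by the caller; producing it is the job of the scheduling mechanism.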
The construction of the physical switch and queuing mechanism is straightforward. However, the construction of a scheduling mechanism is rather complex and represents the performance bottleneck in most high throughput switches.
It is the job of the scheduling mechanism to identify a conflict-free match between the inputs and the outputs of the switch. That is, the cells waiting at the inputs of the switch must be matched to particular outputs of the switch. In a "unicast" system, each input is connected to at most one output and each output is connected to at most one input. Parallel Iterative Matching (PIM) is a successful unicast scheduling technique that is described in U.S. Pat. No. 5,267,235, which is expressly incorporated by reference herein. PIM uses randomness to avoid starvation (an input cell that continuously goes unserviced) and to reduce the number of iterations needed to converge on a maximal matching. PIM attempts to converge quickly on a conflict-free match over multiple iterations, where each iteration consists of three phases (request, grant, and accept). All inputs and outputs are initially unmatched, and only those inputs and outputs not matched at the end of one iteration are eligible for matching in the next iteration.
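The iterative matching described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of U.S. Pat. No. 5,267,235: the three phases are taken to be request, grant, and accept, as PIM is commonly described, and the function name `pim_match`, its parameters, and the representation of requests as one set of outputs per input are assumptions made for the example.

```python
import random


def pim_match(requests, iterations=4, rng=None):
    """Return a conflict-free match {input: output}.

    requests[i] is the set of outputs for which input i holds a queued cell.
    """
    rng = rng or random.Random()
    match = {}               # input -> output
    matched_outputs = set()
    for _ in range(iterations):
        # Phase 1 (request): each unmatched input requests every
        # unmatched output for which it has a cell.
        requests_by_output = {}
        for i, outs in enumerate(requests):
            if i in match:
                continue
            for o in outs:
                if o not in matched_outputs:
                    requests_by_output.setdefault(o, []).append(i)
        if not requests_by_output:
            break  # nothing left to match
        # Phase 2 (grant): each requested output picks one request at random.
        grants_by_input = {}
        for o, ins in requests_by_output.items():
            grants_by_input.setdefault(rng.choice(ins), []).append(o)
        # Phase 3 (accept): each input accepts one of its grants at random.
        for i, outs in grants_by_input.items():
            o = rng.choice(outs)
            match[i] = o
            matched_outputs.add(o)
    return match


requests = [{0, 1}, {0}, {1, 2}]
match = pim_match(requests, iterations=10, rng=random.Random(0))
```

Every iteration that sees at least one request adds at least one matched pair, so running as many iterations as there are inputs is enough for this sketch to reach a maximal match. The random choices in the grant and accept phases are what keep any one input from being passed over indefinitely.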
A number of problems associated with the PIM technique were identified and solved in U.S. Pat. No. 5,500,858, which is expressly incorporated by reference herein. U.S. Pat. No. 5,500,858 describes a rotating priority iterative matching desynchronizing scheduler which improves the performance of the PIM system described in U.S. Pat. No. 5,267,235.
Both of these patents are directed toward unicast systems. Multicast systems potentially have higher bandwidth, but a fundamental design problem exists in relation to the scheduling of cells. In a multicast system, each input cell specifies one or more outputs, instead of a single output, as is the case in a unicast system. During a cell processing iteration, a multicast input cell may be sent to a number of its specified outputs, while other specified outputs remain unserved. The copies of the cell still owed to those unserved outputs are referred to as residue cells. The present invention is directed toward the positioning and processing of residue cells in an input-queued multicast switch.
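To make the notion of a residue concrete, the following Python sketch splits the set of outputs specified by one multicast cell into those served in the current iteration and those left over. The function name `serve_multicast` and the use of sets of output indices are assumptions made for illustration; the sketch says nothing about how the invention positions or processes the residue.

```python
def serve_multicast(specified, available):
    """Split a multicast cell's specified outputs into served and residue.

    specified: set of outputs named by the cell.
    available: set of outputs the scheduler gives this input in this iteration.
    """
    served = specified & available  # copies delivered in this iteration
    residue = specified - served    # copies still owed to unserved outputs
    return served, residue


# A cell destined for outputs 1, 3 and 4 wins outputs 1 and 4 only.
served, residue = serve_multicast({1, 3, 4}, {1, 4, 5})
# served == {1, 4}, residue == {3}
```

The cell can leave its input queue only once the residue is empty, so how the residue is distributed among inputs affects how soon the cells queued behind it can be served.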