In a network transfer apparatus such as a router, in a server, and in a storage unit that connects a plurality of disk arrays, a switch fabric is used to switch data among the functional blocks within the apparatus. The switching capacity of such a switch fabric is expressed as the product of the number of ports and the port capacity (line speed). To realize a large switching capacity, therefore, the number of ports, the port capacity, or both must be increased.
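The relation between port count, port capacity, and switching capacity can be sketched as follows. The function name and all numeric figures are hypothetical and chosen only for illustration; they do not come from any particular apparatus.

```python
def switching_capacity_gbps(num_ports: int, port_capacity_gbps: float) -> float:
    """Switching capacity as the product of port count and per-port line speed."""
    return num_ports * port_capacity_gbps


# Hypothetical baseline: 16 ports at 10 Gbps each.
baseline = switching_capacity_gbps(16, 10.0)

# Two ways to enlarge the capacity: more ports, or faster ports.
more_ports = switching_capacity_gbps(32, 10.0)
faster_ports = switching_capacity_gbps(16, 40.0)

print(baseline, more_ports, faster_ports)
```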
To increase the number of ports, element switches are connected in multiple stages to form an omega network, a cross network, a fat tree network, or the like. The port capacity can also be increased by raising the capacity of each port of a switch LSI (Large Scale Integration). In this case, however, the number of pins that can be connected to such an LSI is limited by the CMOS (Complementary MOS) mounting technology available at the time. Consequently, if large capacity ports are realized, the number of ports per switch LSI decreases.
Although the total switching capacity of a switch fabric can be improved by connecting, in multiple stages, a plurality of switch LSIs each having a few large capacity ports, the number of connecting stages increases with the number of ports. As a result, the latency of passing through the switch fabric increases, and the throughput of the switch fabric is lowered by conflicts that occur between cells inside the fabric even when their destinations are different. These have been problems. A multi-plane switch (parallel packet switch) is well known as one method for avoiding such problems and realizing a large capacity as described above.
In such a multi-plane cell switch, a plurality (“M”) of comparatively low speed switch LSIs, each provided with ports having 1/M of the required capacity, are prepared. Each input data item is divided at the distribution unit, which functions as an input of the switch fabric, distributed among the switches, and passed through those switches in a dispersed manner to realize the desired large switching capacity. Generally, in a switch used in a network apparatus, the input data are variable length packets, and each packet is divided into fixed length cells.
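The division of a packet into fixed length cells and their distribution over M switches can be sketched as below. This is a minimal illustration, not the method of any cited document: the names `Cell`, `segment`, and `distribute`, the zero padding of the last cell, and the round-robin distribution policy are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    seq: int        # position of the cell within its packet (assumed field)
    payload: bytes  # fixed-length payload; the last cell is zero-padded


def segment(packet: bytes, cell_size: int) -> List[Cell]:
    """Divide a variable length packet into fixed length cells."""
    cells = []
    for seq, offset in enumerate(range(0, len(packet), cell_size)):
        chunk = packet[offset:offset + cell_size]
        cells.append(Cell(seq, chunk.ljust(cell_size, b"\x00")))
    return cells


def distribute(cells: List[Cell], num_planes: int) -> List[List[Cell]]:
    """Spread cells over M switch planes (round-robin is an assumed policy)."""
    planes: List[List[Cell]] = [[] for _ in range(num_planes)]
    for i, cell in enumerate(cells):
        planes[i % num_planes].append(cell)
    return planes


cells = segment(b"abcdefghij", cell_size=4)
planes = distribute(cells, num_planes=2)
for index, plane in enumerate(planes):
    print(index, [c.seq for c in plane])
```

Each plane carries only a fraction of the cells, which is why each switch needs only 1/M of the required port capacity.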
In the most simply configured multi-plane cell switch, the plurality of switch LSIs used as its switching units must be synchronized, and arbitration among cells addressed to the same destination must be performed in perfect coordination. Consequently, cells arrive at the destination in the preset order and at predictable timings. For this reason, packets can be restored easily, and the order of the packets in each flow can also be restored easily.
In recent years, however, the port capacity and the switching capacity required of switches have expanded significantly, and the speed of each switch LSI used in such multi-plane switches has also improved. For example, high speed serial transmission referred to as SerDes (SERialization/DE-Serialization) is employed for communications between LSIs, and the cell switching interval has also been shortened. It is thus practically impossible to synchronize those switches with one another completely. For this reason, there has been demand for a multi-plane cell switch in which each switch functions asynchronously with the others, that is, performs destination arbitration independently.
In such a multi-plane cell switch in which each switching unit functions asynchronously with the others, there is no assurance that the order in which cells are sent by the distribution unit, which functions as an input of the switch fabric, matches the order in which they arrive at the reordering unit, which functions as an output of the switch fabric. Therefore, the cells of a packet (flow) sent from the same source to the same destination must be rearranged into their original order so that the original packet (flow) is restored (packet restoration).
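One common way to perform such restoration is to tag each cell with a per-source sequence number and hold early arrivals until the missing cells arrive. The sketch below assumes such a sequence number exists; the class name `Resequencer` and its interface are hypothetical and are not taken from any of the cited documents.

```python
class Resequencer:
    """Releases cells of each source in sequence-number order (illustrative only)."""

    def __init__(self):
        self.next_seq = {}  # source -> next expected sequence number
        self.pending = {}   # source -> {seq: payload} for cells that arrived early

    def receive(self, source, seq, payload):
        """Accept one cell and return the payloads that can now be released in order."""
        expected = self.next_seq.get(source, 0)
        held = self.pending.setdefault(source, {})
        held[seq] = payload
        released = []
        # Drain every cell that has become contiguous with the expected number.
        while expected in held:
            released.append(held.pop(expected))
            expected += 1
        self.next_seq[source] = expected
        return released


reorder = Resequencer()
print(reorder.receive("A", 1, "second"))  # held back: cell 0 has not arrived yet
print(reorder.receive("A", 0, "first"))   # cell 0 arrives, so both are released
```

The `pending` storage in this sketch corresponds to the holding mechanism whose size is the hardware cost discussed in this background.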
The document US2004/0143593 (A1) discloses a method for restoring the original order of packets at the destination by storing the packets until enough of them have been collected to form their flow, with use of the sequence number, source number, routing index (a value that refers to a single destination or to a combination of a plurality of destinations), and priority level of each packet. The method disclosed in this document, however, is expected to require a mechanism capable of holding packets for as many flows as the product of the number of sources, the number of routing indexes, and the number of priority levels, which results in an increase in the amount of hardware. This has been a problem.
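The scale of this holding mechanism can be illustrated with a short calculation. The function name and the counts of sources, routing indexes, and priority levels below are hypothetical and are not taken from the cited document.

```python
def flow_contexts(num_sources: int, num_routing_indexes: int, num_priority_levels: int) -> int:
    """Number of flows for which the reordering unit must be able to hold packets."""
    return num_sources * num_routing_indexes * num_priority_levels


# Hypothetical configuration: 64 sources, 256 routing indexes, 8 priority levels.
print(flow_contexts(64, 256, 8))
```

Because the count is a product, adding sources, routing indexes, or priority levels multiplies the required holding capacity rather than adding to it.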
On the other hand, the document WO02/43329 (A1) discloses a method for restoring the original order of cells/packets at the destination of the switch in first-in first-out order of the time stamps in each flow. In this method, the destination number, source number, cell division number, and the same time stamp are assigned to each cell generated from the same packet, and cells with older time stamps are selected preferentially in the switch with use of a clock shared among the units of the switch fabric. However, the method disclosed in this document is also expected to require a mechanism capable of holding packets for as many flows as the product of the number of sources and the number of routing indexes, which results in an increase in the amount of hardware. This has been a problem. Furthermore, sharing a clock among the units of the switch fabric and sorting cells/packets according to the clock time is becoming increasingly difficult as the transfer speed of packets and cells improves. This has also been a problem.
The document U.S. Pat. No. 6,832,261 (B1) also discloses a method for restoring the order of cells/packets through communications among a plurality of packet reordering devices. As shown in an embodiment of U.S. Pat. No. 6,832,261 (B1), this method is also expected to require a mechanism capable of holding packets for the number of flows represented by the product of the sequence number and the destination slot (destination number), which results in an increase in the amount of hardware. This has also been a problem.
Each of the documents US2004/0143593 (A1) and WO02/43329 (A1) discloses a distribution operation in which the distribution unit, which functions as an input of the switch fabric, keeps the load balanced in accordance with the load of each switch. The document U.S. Pat. No. 6,832,261 (B1) does not describe this point clearly, since it is mainly concerned with the processing of the reordering unit that functions as an output of the switch fabric. Generally, however, when simple load balancing is executed, subsequent cells and packets might pass through another, non-congested switch and arrive at their destination, that is, the reordering unit that functions as an output of the switch fabric, before the cells and packets accumulated in a congested switch arrive there. To cope with such a case, the reordering unit is required to hold a large number of cells and packets in order to restore their original order, which results in an increase in the amount of hardware. This has been a problem.
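The effect of a congested switch on the amount of data held by the reordering unit can be illustrated with a small simulation. This is a simplified sketch under stated assumptions: cell number n is sent at time n, each switch adds a fixed delay, and the function names and delay values are hypothetical.

```python
def arrival_sequence(plane_delays, assignments):
    """Return cell sequence numbers in arrival order.

    assignments[n] is the switch used by cell n, which is sent at time n;
    plane_delays maps each switch to its (assumed fixed) transit delay.
    """
    arrivals = [(seq + plane_delays[plane], seq) for seq, plane in enumerate(assignments)]
    arrivals.sort()
    return [seq for _, seq in arrivals]


def max_cells_held(arrival_seq):
    """Largest number of cells the reordering unit holds while waiting for a gap."""
    expected, held, worst = 0, set(), 0
    for seq in arrival_seq:
        held.add(seq)
        while expected in held:  # release everything that is now in order
            held.remove(expected)
            expected += 1
        worst = max(worst, len(held))
    return worst


# Switch 0 is congested (delay 5); later cells are balanced onto switch 1 (delay 1).
order = arrival_sequence({0: 5, 1: 1}, [0, 1, 1, 1])
print(order, max_cells_held(order))
```

In this example the first cell waits in the congested switch while the three later cells overtake it, so the reordering unit must hold all three until the first cell arrives.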
As described above, in every conventional multi-plane cell switch in which each switching unit functions asynchronously with the others, the original order of flows/packets is restored in the reordering unit, but this approach has the problem of increasing the amount of hardware. Because those multi-plane cell switches were originally intended to improve the switching capacity of the entire switch, it is conceivable that sufficient hardware is mounted to prevent the switching capacity from being lowered by the distributing and restoring processing of the switch itself. To suppress the manufacturing cost of the apparatus, however, it is desirable that such a multi-plane cell switch be provided with a reordering unit that can be realized with hardware on a smaller scale.
[Patent document 1] US2004/0143593 (A1)
[Patent document 2] WO02/43329 (A1)
[Patent document 3] U.S. Pat. No. 6,832,261 (B1)