Multiprocessor systems require some form of connectivity to exchange data and control messages. This information can be transferred between cells through the use of crossbars or interconnect fabric in several different ways. One method involves the sequential transfer of the information from one cell to a second cell via a crossbar that provides a single path through its connectivity fabric at a time. Making only a single path available provides a low interconnect bandwidth through the crossbar. Alternatively, if more than one connection or crossbar connection between the cells is available, the time required to transfer the information between two cells can be reduced by bit-slicing the data to be transferred. Bit-slicing breaks the data up into multiple packets, and each packet is routed over one of the interconnections between the cells. For instance, if two crossbars are available, the information to be transferred between cell 1 and cell 2 can be broken up into two different packets. Packet 1 can traverse from cell 1 to cell 2 via crossbar A and Packet 2 can traverse from cell 1 to cell 2 via crossbar B. The use of bit-slicing decreases the amount of time necessary for the information to be transferred from cell 1 to cell 2 and thus increases the interconnect bandwidth.
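By way of illustration only, the bit-slicing described above can be sketched as follows. The 64-bit data unit width, the two-way split, and the names `slice_unit` and `recombine` are assumptions made for this sketch and are not taken from the description.

```python
def slice_unit(unit, width=64, paths=2):
    """Split a `width`-bit data unit into `paths` equal subunits, low bits first."""
    if width % paths:
        raise ValueError("width must divide evenly across the paths")
    sub_width = width // paths
    mask = (1 << sub_width) - 1
    return [(unit >> (i * sub_width)) & mask for i in range(paths)]


def recombine(subunits, width=64):
    """Reassemble subunits (low bits first) into the original data unit."""
    sub_width = width // len(subunits)
    unit = 0
    for i, sub in enumerate(subunits):
        unit |= sub << (i * sub_width)
    return unit


unit = 0x0123456789ABCDEF
via_a, via_b = slice_unit(unit)  # one subunit per crossbar (A and B)
restored = recombine([via_a, via_b])
```

Because each crossbar carries only half of the bits, both halves can move in the same cycle, which is the source of the bandwidth gain, provided the halves are paired correctly at the destination.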
In order to ensure the proper information is sent, the messages sent via crossbar A and crossbar B must be recombined in the proper sequence within cell 2. One method to assure that this information is combined correctly within cell 2 is to have crossbar A and crossbar B operate in lock step while they transfer the information from cell 1 to cell 2. This lock step between the crossbars assures the two messages are synchronized upon arrival at cell 2. Problems arise, however, if an error occurs in either crossbar A or crossbar B during this lock step or synchronization.
When an error is present, the synchronization between crossbar A and crossbar B is destroyed, so that the synchronization must be reacquired prior to the subsequent transfer of data. One method of reacquiring synchronization between crossbar A and crossbar B is to require a full chip reset. A full chip reset would terminate all transactions in progress throughout the affected crossbars. Further, resetting the crossbar may disrupt other connections through the device.
A second method for reacquiring the synchronization is through the use of dedicated pins on crossbar A and crossbar B. By supplying a reset to the dedicated pins of both crossbar A and crossbar B at the same time, synchronization can be reacquired. However, the use of dedicated pins impacts both chip pin count and performance in terms of latency. This latency impact is a result of the time required to exchange information between the two crossbars. Further, resetting the crossbar may disrupt other connections through the device.
When cells attempt to send multiple messages to another cell at approximately the same time, a crossbar must determine the order in which the messages are accepted and sent. Typically, crossbars implement an arbitration algorithm which selects among the messages that compete with each other for access to system resources according to resource availability. Ports are used to connect cells to the crossbar, and the arbitration algorithm helps to determine the sequence of messages received through a specific port. One example of an arbitration algorithm uses the history of previous arbitrations to decide which messages should be sent next, i.e., which messages receive priority service. Arbitrations which depend on previous decisions present their own special difficulties. When resynchronization affects arbitration algorithms which normally depend on previously made decisions, priorities are lost because the previous history is also reset and erased. Alternative algorithms may present fairness complexities.
Accordingly, a need exists for a method and system which provide the transfer of data from cell 1 to cell 2 with a high interconnect bandwidth. A further need exists for a method and system that reacquire synchronization on a portion of the crossbar or intercommunications fabric that has lost synchronization between complementing crossbars, without requiring resynchronization of every port on the crossbar. A further need exists for a method and system which allow resynchronization among crossbar elements that use arbitration algorithms.
These and other objects, features and technical advantages are achieved by a system and method in which, according to an aspect of the invention, a data switch is configured to communicate data messages in the form of multibit data units segmented into a plurality of multibit data subunits. The data switch includes at least two separate, parallel switching units, each having a plurality of ports to communicate the multibit data subunits. Hardwired or software implemented prioritization logic provides for the initiation of transfers of data messages between the ports in response to a category of the data messages. A memory is used to store a history of prior data message transfers so that least recently transferred message types are serviced prior to those most recently switched. To reestablish synchronization between the parallel switching units, a controller responds to a reset condition by temporarily suspending communications between affected ones of the ports and clearing the history to recommence a lock-step operation of the units.
According to another aspect of the invention, a processing system includes plural processing cells, each including a controller configured to communicate data messages in discrete, multibit data units. Each of the data units is segmented into plural multibit data subunits for parallel transmission to and from the controller over respective distinct transmission paths. The controller recombines the received data subunits back into a corresponding one of the multibit data units and identifies any error condition caused by a nonreceipt of one of the subunits required to complete a corresponding one of the multibit data units. A switch includes two or more switching units preferably in the form of crossbars. Each switching unit has a plurality of ports connected by the distinct transmission paths to respective ones of the controllers to communicate the multibit data subunits with the processing cells. The switch responds to the error condition by initiating a resynchronization procedure to reestablish a lock-step condition between and among the switching units. This resynchronization procedure may include halting communications between affected ones of the ports and resetting a portion of the history relating to the affected ports.
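As a non-limiting sketch, recombination with detection of a missing subunit might look like the following. The cycle-by-cycle list model of each transmission path, the use of `None` to mark a cycle in which nothing arrived, the 32-bit subunit width, and the name `recombine_lockstep` are all assumptions introduced for this example; both lists are assumed to cover the same number of cycles.

```python
def recombine_lockstep(path_a, path_b, sub_width=32):
    """Pair subunits arriving on two paths, one list entry per cycle.

    Returns (data_units, error_cycle). error_cycle is the first cycle in
    which only one path delivered a subunit, or None if no such cycle exists.
    """
    units = []
    for cycle, (low, high) in enumerate(zip(path_a, path_b)):
        if low is None and high is None:
            continue  # idle cycle on both paths, nothing to recombine
        if low is None or high is None:
            return units, cycle  # nonreceipt of one subunit: error condition
        units.append((high << sub_width) | low)
    return units, None


# Both paths deliver in lock step: two complete data units, no error.
ok_units, ok_error = recombine_lockstep([0x89ABCDEF, None, 0x1],
                                        [0x01234567, None, 0x2])

# The second subunit on path B never arrives: error flagged at cycle 1.
bad_units, bad_error = recombine_lockstep([1, 2], [3, None])
```

In this model the reported error cycle is what would prompt the switch to begin its resynchronization procedure for the affected ports.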
According to a feature of the invention, the switch includes message prioritization logic to initiate a transfer of data messages between the ports in response to a category of the data messages. The prioritization logic responds to the order in which transfers of the message categories were completed, initiating a transfer of message categories least recently transferred prior to initiating a transfer of message categories most recently transferred. The switch may include a memory configured to maintain a history of the data messages, the message prioritization logic being responsive to the history for initiation of the transfer of data messages between the ports. This history may include an indication of the categories of the data messages most recently transferred through the switch.
According to another feature of the invention, the prioritization logic is configured to initiate a transfer of a data message category least recently transferred prior to a data message category more recently transferred by the switch.
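One possible reading of this least-recently-transferred prioritization is sketched below. The class name `LruArbiter`, the category names, and the choice of an ordered list as the history are hypothetical and serve only to illustrate the ordering rule.

```python
class LruArbiter:
    """Grants the pending message category that was transferred least recently."""

    def __init__(self, categories):
        # history[0] is the least recently transferred category,
        # history[-1] the most recently transferred one.
        self._initial = list(categories)
        self.history = list(categories)

    def grant(self, pending):
        """Return the winning category among `pending`, or None if none match."""
        for category in self.history:
            if category in pending:
                # The winner becomes the most recently transferred category.
                self.history.remove(category)
                self.history.append(category)
                return category
        return None

    def reset(self):
        """Clear the history back to its initial order."""
        self.history = list(self._initial)


arb = LruArbiter(["request", "response", "writeback"])  # hypothetical categories
first = arb.grant({"response", "writeback"})
second = arb.grant({"response", "writeback"})
third = arb.grant({"request", "response"})
```

Because every grant depends on the stored order, two switching units produce identical grants only while their histories are identical, which is why the history figures in the resynchronization procedure.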
According to another feature of the invention, the ports corresponding to a respective one of the processing cells include error processing logic for detecting an error in the data messages. The logic may detect such conditions as a missing or incorrect subunit, an incomplete or invalid data unit, parity errors, category mismatches, etc.
According to another aspect of the invention, a method of communicating data messages segments multibit data units into a plurality of multibit data subunits. The data messages are categorized and a history of data message transfers is maintained. The history may be based on recency of transfer of the messages of particular categories. The method includes initiation of a transfer of the multibit data subunits over separate, parallel paths. In response to identification of an error, such as loss of one of the multibit data subunits, appropriate action is taken to resynchronize the data subunits. These actions may include suspending or inhibiting transfer of certain messages and resetting the history. Once the history is reset, data transfers between affected ports may be reenabled so that communications may be continued with the data subunits in lock-step.
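As one example of such a check, a parity test on a received subunit could be sketched as follows. Even parity and the function names are assumptions for this illustration; the description does not specify a parity scheme.

```python
def even_parity_bit(value):
    """Return the bit that makes the total number of 1 bits even."""
    return bin(value).count("1") % 2


def parity_ok(value, parity_bit):
    """True if the received subunit matches the parity bit sent with it."""
    return even_parity_bit(value) == parity_bit


sent = 0b1011
bit = even_parity_bit(sent)    # computed by the sender
corrupted = sent ^ 0b0100      # a single bit flipped in transit
```

A single flipped bit changes the count of 1 bits by one, so `parity_ok(corrupted, bit)` is false, while an even number of flipped bits would go undetected by this check alone.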
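The suspend, reset, and reenable sequence can be illustrated with the following minimal sketch, in which two parallel switching units each keep a per-port history. The names `SwitchUnit`, `resynchronize`, and `CATEGORIES`, the per-port list representation, and the way the error is injected are assumptions made for the example.

```python
CATEGORIES = ("request", "response", "writeback")  # hypothetical category names


class SwitchUnit:
    def __init__(self, ports):
        # One ordering per port; index 0 is the least recently transferred.
        self.history = {port: list(CATEGORIES) for port in ports}
        self.suspended = set()

    def grant(self, port, pending):
        """Grant the least recently transferred pending category on `port`."""
        if port in self.suspended:
            return None
        order = self.history[port]
        for category in order:
            if category in pending:
                order.remove(category)
                order.append(category)
                return category
        return None


def resynchronize(units, affected_ports):
    """Suspend the affected ports on every unit, clear their history, resume."""
    for unit in units:
        unit.suspended.update(affected_ports)
    for unit in units:
        for port in affected_ports:
            unit.history[port] = list(CATEGORIES)
    for unit in units:
        unit.suspended.difference_update(affected_ports)


a, b = SwitchUnit([0, 1]), SwitchUnit([0, 1])
for unit in (a, b):
    unit.grant(0, {"request", "response"})  # lock step on port 0
    unit.grant(1, {"request"})              # lock step on port 1
a.grant(0, {"response"})                    # error: only unit A sees this subunit

pending = {"request", "response"}
diverged = a.grant(0, pending) != b.grant(0, pending)
resynchronize([a, b], affected_ports=[0])
realigned = a.grant(0, pending) == b.grant(0, pending)
```

Only the history for port 0 is cleared in this sketch; the ordering already accumulated on port 1 is left untouched, in line with resetting only the portion of the history that relates to the affected ports.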
According to a feature of the invention, the step of initiating includes controlling a switch fabric to connect source and destination devices for communicating the multibit data subunits over separate parallel paths.
According to another feature of the invention, the method further includes a step of ordering the categories of data messages transferred so that initiation of a transfer of message categories least recently transferred is performed prior to initiating a transfer of message categories most recently transferred.
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.