FIG. 1 illustrates a data transmitting portion of a serial input/output (I/O) circuit 10. In source synchronous interfaces, a clock signal (TC) is transmitted along with a group of parallel data from a data source 12. This clock signal is used to latch the group of data in a circuit 14 that receives the data. The receiving circuit 14 typically includes a serializer, for example, a quad serializer receiving 20-bit parallel data. A quad serializer performs parallel-to-serial conversion and outputs the serial data to four serial data lane links, as shown in FIG. 1. The receiving circuit 14, which references the clock signal forwarded with a specific group of data, is typically implemented in a single circuitry unit or chip.
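The parallel-to-serial conversion described above can be illustrated with a minimal behavioral sketch. It is not the circuit of FIG. 1. The function name `serialize_quad`, the even split of the 20-bit word into four 5-bit slices, the lane-to-slice mapping, and the least-significant-bit-first ordering are all assumptions made for illustration; the description does not specify them.

```python
def serialize_quad(word, width=20, lanes=4):
    """Split one latched parallel word into per-lane serial bit streams.

    Hypothetical model: the word is divided evenly across the lanes and
    each slice is shifted out least significant bit first.
    """
    if width % lanes != 0:
        raise ValueError("width must divide evenly across lanes")
    if not 0 <= word < (1 << width):
        raise ValueError("word does not fit in the given width")

    bits_per_lane = width // lanes
    mask = (1 << bits_per_lane) - 1
    streams = []
    for lane in range(lanes):
        # Assumed mapping: lane 0 carries the least significant slice.
        chunk = (word >> (lane * bits_per_lane)) & mask
        # Assumed bit order: least significant bit is sent first.
        streams.append([(chunk >> i) & 1 for i in range(bits_per_lane)])
    return streams


if __name__ == "__main__":
    for lane, bits in enumerate(serialize_quad(0b00001_00010_00100_01000)):
        print(lane, bits)
```

In this model all four lanes shift out their slices in the same cycles because they are driven by one latched word, which corresponds to the single-chip case in which every lane references the same forwarded clock.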
As the bus width becomes wider to achieve a faster data rate, more data link connections are required. For example, in channel-based point-to-point connections such as an Infiniband application, a word is encoded and sent out through up to 12 different channels/lanes, and thus a 12-lane high speed serial link is required (current chip implementations handle up to 4 lanes, which provide up to 4×3.125 Gigabits/sec for Ethernet). In order to provide such a multiple data lane connection, a plurality of I/O circuits, typically transceivers, should be ganged together.
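The arithmetic behind the need to gang circuits can be sketched as follows, using only the figures quoted above (12 lanes required, 4 lanes per chip, 3.125 Gigabits/sec per lane). The helper names `circuits_needed` and `aggregate_rate_gbps` are hypothetical and serve only this illustration.

```python
import math


def circuits_needed(total_lanes, lanes_per_circuit):
    """Smallest number of I/O circuits that together cover all lanes."""
    return math.ceil(total_lanes / lanes_per_circuit)


def aggregate_rate_gbps(lanes, rate_per_lane_gbps):
    """Raw aggregate line rate of a group of lanes, in Gigabits/sec."""
    return lanes * rate_per_lane_gbps


if __name__ == "__main__":
    # 12-lane link built from 4-lane chips.
    print(circuits_needed(12, 4))
    # One 4-lane chip at 3.125 Gigabits/sec per lane.
    print(aggregate_rate_gbps(4, 3.125))
```

Under these figures, a 12-lane link requires at least three 4-lane circuits, each of which may be referencing its own forwarded clock.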
The specification of multi-link connections typically includes requirements for acceptable skew at the serial data outputs across multiple I/O circuits. For example, in the Infiniband application described above, the delay skew across all 12 lanes at the serial outputs has to be 500 picoseconds (ps) or less according to the current electrical specification. During serialization, the I/O circuits are also required to align all incoming parallel data to within the same byte/cycle.
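The skew requirement above can be expressed as a simple check: the spread between the earliest and latest lane output must not exceed the limit. The sketch below assumes that skew is measured as the difference between the maximum and minimum output delay across lanes; the function names and the sample delay values are hypothetical and are not measurements.

```python
SKEW_LIMIT_PS = 500  # limit quoted above for the 12-lane case


def lane_skew_ps(output_delays_ps):
    """Spread between the latest and earliest lane output, in picoseconds."""
    if not output_delays_ps:
        raise ValueError("at least one lane delay is required")
    return max(output_delays_ps) - min(output_delays_ps)


def within_skew_spec(output_delays_ps, limit_ps=SKEW_LIMIT_PS):
    """True if the lane-to-lane skew is at or below the limit."""
    return lane_skew_ps(output_delays_ps) <= limit_ps


if __name__ == "__main__":
    # Hypothetical delays for 12 lanes spread over three 4-lane circuits.
    delays = [0, 120, 80, 310, 45, 200, 490, 15, 260, 330, 70, 150]
    print(lane_skew_ps(delays), within_skew_spec(delays))
```

Because the check spans all lanes, a fixed offset between two circuits counts against the same budget as the lane-to-lane variation inside a single circuit.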
However, when two or more I/O circuits or chips that reference different forwarded clocks are bundled, the data coming out of the different circuits/chips are not necessarily in sync. Due to the timing skew caused by a number of variations, including process, temperature, voltage, and board trace skew, synchronizing these output data across the multiple circuits/chips becomes a difficult task. Such a multiple-circuit system using different local clocks also places a significant limitation on chip placement and board routing, since the skew across all the different clocks must be minimized. Thus, ganging a plurality of I/O circuits/chips without any special circuit techniques will result in large, out-of-spec data skews between different circuits, making the system unusable.
Accordingly, it would be desirable to provide a scheme for synchronizing a multiple-circuit system such as a system including a plurality of I/O circuitry units, and for controlling data skew across multiple circuits.