The present invention relates generally to the field of digital data communications and more specifically to low-latency serializers/deserializers.
Data exchange between different points of communication (e.g., computers, processors, memory units) through a serial link may use a multiplexing operation performed by a multiplexer (MUX) at a transmitter side, followed by an inverse demultiplexing operation performed by a demultiplexer (DMUX) at a receiver side.
The MUX converts N-bit wide parallel data words $d_i = \{d_{(i)1}, d_{(i)2}, \ldots, d_{(i)N}\}$ with bit rate B into a serial bit stream $d_s = (\ldots, d_{s(i-1)N}, d_{s(i)1}, d_{s(i)2}, \ldots, d_{s(i)N}, d_{s(i+1)1}, \ldots)$ with rate NB under control of timing signals $\mathrm{sel}_1, \mathrm{sel}_2, \ldots, \mathrm{sel}_K$ generated by transmitter circuitry. For a binary MUX with a tree architecture, the relation between the parallel word width and the number of timing signals is given by the equation $N = 2^K$, the signal $\mathrm{sel}_1$ is equal to the line-rate clock, and the signal $\mathrm{sel}_{j+1}$ is derived from the signal $\mathrm{sel}_j$ by a "frequency divide-by-2" operation. For a MUX with a shift-register architecture, two timing signals are used: a line-rate (or divided-by-2) clock and a bit-wide signal WE with divided-by-K frequency, where again $N = 2^K$. In any case, the phase relation between the parallel and serial data words is arbitrary and depends on the initial settings of the timing signals.
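The conversion described above can be illustrated with a minimal Python sketch. It only models the ordering of bits and the rate relations stated in the text; the function names (serialize, serial_rate, select_frequencies) and the sample values are hypothetical and are not part of the described circuitry.

```python
from typing import List, Sequence


def serialize(words: Sequence[Sequence[int]]) -> List[int]:
    """Flatten N-bit parallel words into one serial stream, first bit of each word first."""
    return [bit for word in words for bit in word]


def serial_rate(n_bits: int, word_rate: float) -> float:
    """Serial rate N*B for N-bit words arriving at rate B."""
    return n_bits * word_rate


def select_frequencies(line_rate: float, k: int) -> List[float]:
    """sel_1 runs at the line rate; each further signal is the previous one divided by 2."""
    return [line_rate / (2 ** j) for j in range(k)]


# Two illustrative 4-bit words (assumed sample data).
words = [[1, 0, 1, 1], [0, 0, 1, 0]]
stream = serialize(words)
```

In this sketch, two 4-bit words become one 8-bit stream, and a 2.5 GHz word rate would correspond to a 10 GHz serial rate.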
The DMUX converts the serial data stream back into N-bit parallel words under control of the same set of timing signals, but these are generated by receiver circuitry and thus have no phase correlation with the transmitter timing signals. This is the reason for the well-known ambiguity of the demultiplexing operation, which can be illustrated by the example of a 4-bit transmission system with a tree architecture.
In the case of this simplified system, the MUX output signal is defined by the logic function $d_s = d_1 \cdot \mathrm{sel}_2^{TR} \cdot \mathrm{sel}_1^{TR} + d_2 \cdot \overline{\mathrm{sel}_2^{TR}} \cdot \mathrm{sel}_1^{TR} + d_3 \cdot \mathrm{sel}_2^{TR} \cdot \overline{\mathrm{sel}_1^{TR}} + d_4 \cdot \overline{\mathrm{sel}_2^{TR}} \cdot \overline{\mathrm{sel}_1^{TR}}$, where the superscript TR indicates the transmitter. This signal, when processed by the DMUX, will deliver the first parallel bit value defined by the logic function $d_p = d_s \cdot \mathrm{sel}_2^{RC} \cdot \mathrm{sel}_1^{RC}$, where the superscript RC corresponds to the receiver. One skilled in the art can see that $d_p = d_1$ if and only if both $\mathrm{sel}_1^{RC} = \mathrm{sel}_1^{TR}$ and $\mathrm{sel}_2^{RC} = \mathrm{sel}_2^{TR}$, which is impossible without strict transmitter/receiver synchronization. In the case of the shift-register architecture, the corresponding condition is $WE^{RC} = WE^{TR}$. The lack of synchronization results in an equal probability 1/N of getting any bit of the word at the first output of the DMUX.
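The ambiguity can be sketched in Python as follows. This is a simplified model, not the gate-level logic: the mismatch between receiver and transmitter timing signals is represented, as an assumption, by an integer offset `phase` (in bit periods) between the receiver's word boundary and the transmitter's. Each transmitted bit is replaced by a label (word index, bit position) so that the origin of the bit seen at the first DMUX output is visible. All names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Label = Tuple[int, int]  # (word index, 1-based bit position within the word)


def serialize_labels(n_words: int, n_bits: int) -> List[Label]:
    """Serial stream in which every bit is tagged with the word and position it came from."""
    return [(i, j) for i in range(n_words) for j in range(1, n_bits + 1)]


def first_output(stream: List[Label], n_bits: int, phase: int) -> List[Label]:
    """Bits delivered at the first DMUX output when the receiver is offset by `phase` bits."""
    return stream[phase::n_bits]


def first_output_distribution(n_bits: int) -> Counter:
    """Count which bit position reaches the first output, over all possible phases."""
    stream = serialize_labels(2, n_bits)
    return Counter(first_output(stream, n_bits, p)[0][1] for p in range(n_bits))


stream = serialize_labels(3, 4)
aligned = first_output(stream, 4, 0)     # receiver in phase with transmitter
misaligned = first_output(stream, 4, 2)  # receiver two bit periods off
```

With a phase of 0 the first output carries bit 1 of every word; with a phase of 2 it carries bit 3 instead. Enumerating all N phases yields each bit position exactly once, which corresponds to the probability 1/N when the phase is uniformly random.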
In existing data communication systems these conditions are satisfied by a framing operation, which inserts redundant bits to mark the word position in the outgoing bit stream. The extra bits increase the system latency and raise the required transmission frequency. Various transmission protocols, such as Infiniband, 3GIO, Gigabit Ethernet, and SONET/SDH, have been devised to establish the synchronization, but all of them require additional expensive circuitry and/or software for operation.
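The cost of framing can be illustrated with a generic Python sketch. It does not reproduce any of the named protocols: the 3-bit marker, the one-marker-per-word layout, and all function names are assumptions chosen for illustration, and the search ignores the possibility that payload bits imitate the marker.

```python
from typing import List, Optional, Sequence

MARKER = [1, 1, 0]  # hypothetical framing pattern, not taken from any protocol


def frame(words: Sequence[Sequence[int]], marker: Sequence[int] = MARKER) -> List[int]:
    """Prefix every word with the marker and serialize the result."""
    out: List[int] = []
    for word in words:
        out.extend(marker)
        out.extend(word)
    return out


def find_alignment(stream: List[int], n_bits: int,
                   marker: Sequence[int] = MARKER) -> Optional[int]:
    """Return the first offset at which the marker recurs once per frame period."""
    period = len(marker) + n_bits
    for offset in range(period):
        starts = range(offset, len(stream) - len(marker) + 1, period)
        if len(starts) > 0 and all(
            stream[s:s + len(marker)] == list(marker) for s in starts
        ):
            return offset
    return None


def deframe(stream: List[int], n_bits: int, offset: int,
            marker: Sequence[int] = MARKER) -> List[List[int]]:
    """Extract the payload words of all complete frames starting at `offset`."""
    period = len(marker) + n_bits
    return [stream[s + len(marker):s + period]
            for s in range(offset, len(stream) - period + 1, period)]


def rate_factor(n_bits: int, marker_len: int) -> float:
    """Factor by which the line rate must grow to carry the marker bits."""
    return (n_bits + marker_len) / n_bits


words = [[0, 0, 0, 0], [0, 1, 0, 1], [1, 0, 0, 1]]
received = frame(words)[3:]  # receiver starts listening three bits late
offset = find_alignment(received, 4)
recovered = deframe(received, 4, offset)
```

In this sketch the receiver recovers the word boundary only by searching for the marker, and a 3-bit marker per 4-bit word would raise the required line rate by a factor of 1.75.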