1. Field of the Invention
The present invention relates generally to asynchronous on-chip communication, and more specifically, to a two-phase, return-to-zero (RZ) asynchronous transceiver for on-chip interconnects.
2. Description of the Related Art
Current integrated circuits (i.e., chips) not only feature multiple clock domains but also integrate a wide range of intellectual property (IP) blocks with various data communication needs and patterns. In addition, due to consumer demand, these designs have very short time-to-market requirements. This calls for efficient design flows that can achieve timing closure of the whole chip in a short time. As a result of these requirements, two main design paradigms have emerged to satisfy the communication needs of these chips while enabling reasonable timing closure of the complete design: Networks-on-Chip (NoCs) and Globally Asynchronous Locally Synchronous (GALS) systems.
NoC research aims at developing scalable interconnect architectures that can route data between System-on-Chip (SoC) IPs with minimum latency over shared interconnects, while research on GALS aims at developing circuits, methodologies, and models for interconnecting synchronous blocks with separate clock domains using asynchronous interconnects. Hence NoCs can be viewed as a special case of GALS. In any case, both share the common problem of designing the point-to-point interconnect circuitry (repeaters, buffers, and pipeline stages) between routers and/or IP blocks. Hence developing high-performance, robust interconnect circuitry is essential for current and future chip designs.
GALS systems are categorized into pausable-clock GALS, asynchronous GALS, and loosely synchronous GALS, based on their communication schemes. Pausable-clock systems stop (or pause) the clock of the IP block during data transfer. This goes against the fundamental concept of decoupling ‘computations’ from ‘communications’, rendering this design style impractical. With each additional input channel, the percentage of idle time would increase even further. Loosely synchronous techniques require some form of buffering (FIFOs) on the receiver and/or transmitter sides, again coupling IP design with the communication (interconnect) design. This would increase the chip's design time significantly. Fully asynchronous interconnects offer the highest degree of robustness and decoupling of different chip design activities.
In a typical asynchronous pipeline, data is transferred from one stage to the next via a sequence of handshaking signals. A stage latches a datum when it receives a Request (REQ) signal from the preceding stage, provided that the next stage has already indicated that it latched the previous datum (by de-asserting its Acknowledge signal). Traditionally, there have been two main handshaking protocols for asynchronous data exchange: four-phase handshaking and two-phase handshaking. When combined with dual-rail data encoding, these protocols yield delay-insensitive (or at least quasi-delay-insensitive) operation. A four-phase protocol uses a return-to-zero (RZ) data format requiring four steps (or trips) to complete a single datum transfer. The transmitter initiates a datum transfer by driving one of the pre-charged data lines low (or high, depending on the pre-charged value). The receiver detects the difference between the data lines using a simple CMOS gate, generates the request, latches in the data if the acknowledge signal coming from the next stage is low, and forces its own acknowledge high.
This signals to the transmitter that the transfer was successful, and it responds by pre-charging the data lines; the receiver detects the pre-charging as a downward transition of the request signal. The receiver then responds by lowering its acknowledge signal, indicating to the transmitter that it is ready for a new datum. Since data is level-encoded, conventional circuits can be used in the transmitter and receiver. The two-phase protocol is very similar, except that it uses a non-return-to-zero (NRZ) data format (no pre-charging), requiring only two steps to complete a datum transfer. For this protocol, data is transition-encoded, which requires special circuitry to detect and handle the two possible transitions. Latency and throughput are major concerns. Due to handshaking, each four-phase datum transfer requires at least two round trips. Interconnect pipelining and repeaters can improve latency and throughput.
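By way of illustration only, the two conventional protocols described above may be sketched behaviorally as follows. This is a minimal model, not a circuit description, and it rests on stated assumptions: the rails are pre-charged high, rail 0 going low encodes a 0 and rail 1 going low encodes a 1, the request is modeled as an XOR of the two rails, and the next stage is always ready. All function and variable names are hypothetical.

```python
PRECHARGE = (1, 1)  # assumption: both rails idle (pre-charged) high


def encode_bit(bit):
    """Drive exactly one pre-charged rail low: rail 0 for a 0, rail 1 for a 1."""
    return (0, 1) if bit == 0 else (1, 0)


def request(rails):
    """Model of the receiver's detection gate: REQ is high when the rails differ."""
    return rails[0] ^ rails[1]


def four_phase_transfer(bit):
    """One RZ (four-phase) datum transfer; returns (received bit, event trace)."""
    events = []
    ack = 0
    # Step 1: transmitter drives one pre-charged rail low.
    rails = encode_bit(bit)
    events.append(("data valid", rails, ack))
    # Step 2: receiver sees REQ high, latches the datum, raises ACK.
    assert request(rails) == 1
    received = 0 if rails[0] == 0 else 1
    ack = 1
    events.append(("ack high", rails, ack))
    # Step 3: transmitter pre-charges the rails (return to zero).
    rails = PRECHARGE
    events.append(("pre-charge", rails, ack))
    # Step 4: receiver sees REQ fall and lowers ACK.
    assert request(rails) == 0
    ack = 0
    events.append(("ack low", rails, ack))
    return received, events


def two_phase_transfer(bit, prev_rails, prev_ack):
    """One NRZ (two-phase) datum transfer using transition encoding.

    A toggle on rail 0 encodes a 0 and a toggle on rail 1 encodes a 1, so
    the receiver must compare the rails against their previous levels.
    """
    rails = list(prev_rails)
    rails[bit] ^= 1  # Step 1: transmitter toggles the rail selected by the bit.
    rails = tuple(rails)
    events = [("data transition", rails, prev_ack)]
    received = 0 if rails[0] != prev_rails[0] else 1
    ack = prev_ack ^ 1  # Step 2: receiver toggles ACK; no return-to-zero phase.
    events.append(("ack transition", rails, ack))
    return received, rails, ack, events


if __name__ == "__main__":
    got, trace = four_phase_transfer(1)
    print("four-phase:", got, [name for name, _, _ in trace])
    rails, ack = (0, 0), 0
    for b in (1, 0, 1):
        got, rails, ack, trace = two_phase_transfer(b, rails, ack)
        print("two-phase:", got, [name for name, _, _ in trace])
```

In this sketch the four-phase receiver needs only level detection, whereas the two-phase receiver must remember the previous rail levels, which mirrors the remark above that transition encoding calls for special circuitry.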
Many researchers have proposed new solutions to improve the latency and throughput of asynchronous pipelines. In some systems, control pulses are used instead of traditional transition-coded control. This allows faster acknowledgment at the expense of a more complex circuit design needed to precisely control pulse widths and match the wire delays. Other researchers have proposed a form of wave-pipelining called surfing interconnects, which removes two-way handshaking altogether. This adversely affects the robustness of the circuits and increases the design time significantly. Such schemes trade design time (complexity) for speed and, in doing so, sacrifice flow control. Asynchronous handshaking not only ensures proper timing of valid data but also allows receivers to control the flow of data, an essential feature in SoCs. Using FIFO buffers instead of handshaking would require flow control at higher levels of the protocol stack.
Surfing interconnects resemble source-synchronous communications, with the request signal being used to strobe the data at the receiver and repeaters with adjustable delays serving as delay lines. Efficient source-synchronous on-chip serial communication circuits have been proposed wherein the data and clock are re-timed at the receiver side instead of at repeaters along the control line. However, flow control would have to be handled at higher levels of the communication protocol stack, something that SoC IPs might not be designed for.
Another concern with asynchronous interconnects is the use of non-standard CMOS circuits. Hence, developing robust asynchronous circuits that can be used as ‘plug-and-play’ hard macros is highly desirable. This can be achieved through the use of delay-insensitive design techniques. What is needed is an interconnect system that achieves reasonably low latency, has a simple architecture, maintains an RZ signaling protocol, and retains the robustness of delay-insensitive asynchronous circuits.
Thus, a two-phase return-to-zero asynchronous transceiver solving the aforementioned problems is desired.