In many modern applications, an electronic system comprises multiple subsystems. These subsystems may be blocks, modules or discrete chips. For successful operation, one of the key tasks is data communication among the subsystems. Data communication refers to transferring information from one subsystem to one (or more) other subsystem(s). The information sender is often termed the transmitter and the information recipient is called the receiver. The information transfer can be carried out in either digital or analog fashion. In most modern systems, digital data communication is the preferred method due to its low cost, high data rate and high reliability.
With billions of transistors used in today's large chips, the advantage of uniprocessor architectures is diminishing due to their demand for high power, high clock frequency and global distribution of the clock signal. Multicore chips are emerging as the prevailing architecture in both general-purpose and application-specific markets since this architecture allows the computation load to be distributed to multiple cores, each of which can operate at its optimum speed (clock frequency). Consequently, the challenge in architecture design has shifted from computation to communication. As the core count increases, a scalable on-chip communication architecture that can deliver high bandwidth becomes a necessity. Traditionally, the bus has been the dominant structure for System-on-Chip (SoC) on-chip communication. However, it does not scale well with an increasing number of cores. This has led to the recent architecture of Network-on-Chip (NoC) communication. In this approach, data is routed from any source to any destination over logical or physical links using a predefined protocol. NoC is an SoC design strategy that separates the tasks of computation and communication in a controlled way so that each of them can be addressed efficiently.
In this trend of designing large SoCs using the NoC communication methodology, a challenging problem is to robustly interface design domains driven by clocks having different frequencies and phases. FIG. 1 is an example illustrating the problem. In this exemplary system, there are four subsystems 101, 102, 103 and 104. Within each of these subsystems, the circuit is designed using the synchronous design principle (i.e. all the circuitry operates under the control of a clock signal). Each subsystem is called a synchronous domain. Overall, however, the clock signals for the subsystems are independent of each other. Each of the clock signals has its own clock frequency and phase. For this reason, this system is termed a heterogeneously clocked system.
When any of the subsystems needs to communicate with any other subsystem, an interface adapter 105 needs to be inserted between them. This interface adapter is required to handle the frequency (data rate) difference existing between the communicating domains. The goal is to 1) prevent data from being lost and 2) prevent invalid data from being created (i.e. a datum being used more than once). For this reason, a first-in-first-out memory (FIFO) is usually used in the interface for temporarily storing the data.
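The two goals stated above can be illustrated with a minimal software sketch of a bounded FIFO. This is a toy model and not a circuit from the present disclosure; the class name `BoundedFifo` and its methods are hypothetical and chosen only for illustration. A write is refused when the storage is full, so no stored word is overwritten and lost, and a read is refused when the storage is empty, so no stale word is used a second time.

```python
from collections import deque


class BoundedFifo:
    """Toy model of a FIFO used as an interface buffer (hypothetical names)."""

    def __init__(self, depth):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self.depth = depth
        self._slots = deque()

    def is_full(self):
        return len(self._slots) >= self.depth

    def is_empty(self):
        return len(self._slots) == 0

    def put(self, word):
        """Store one word. Returns False when full, so nothing is overwritten."""
        if self.is_full():
            return False
        self._slots.append(word)
        return True

    def take(self):
        """Remove and return the oldest word, or None when empty.

        None is used as the 'no data' marker, so this sketch assumes that
        None is never written as a data word.
        """
        if self.is_empty():
            return None
        return self._slots.popleft()


fifo = BoundedFifo(depth=2)
accepted = [fifo.put(w) for w in ("a", "b", "c")]  # third write is refused
drained = [fifo.take() for _ in range(3)]          # third read finds it empty
```

In this sketch the writer is expected to retry a refused word later, which is the software analogue of the transmitter holding its data while the FIFO reports that it is full.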
Two electronic blocks are connected to the input and output of a FIFO: one that writes and one that reads. If certain timing conditions must be maintained between the writing and the reading blocks, the FIFO is termed an exclusive read/write FIFO. In exclusive read/write FIFOs, the writing of data is not independent of how the data are read; there are timing relationships between the write clock and the read clock. To use such an exclusive FIFO between two blocks that work asynchronously to one another, an additional circuit is required for synchronization. This synchronization circuit usually reduces the data rate considerably. Exclusive read/write FIFOs are rarely used in modern applications.
If there are no timing restrictions on how the blocks are driven (i.e. the writing block and the reading block can work out of synchronism), the FIFO is called a concurrent read/write FIFO. In concurrent read/write FIFOs, there is no dependence between the writing and reading of the data. Writing and reading may occur simultaneously, in overlapping fashion, or successively. In other words, two blocks driven by clocks of different frequencies and phases can be connected to the FIFO. Concurrent read/write FIFOs, depending on the control signals for writing and reading, fall into two groups: synchronous FIFOs and asynchronous FIFOs.
FIG. 2A shows the structure of an asynchronous FIFO. In the left drawing of FIG. 2A, asynchronous FIFO 200 has three signals, Full_Status, Input_Data and Write_Clock, that interface with the writing block (the transmitter, or TX) and three signals, Empty_Status, Output_Data and Read_Clock, that interface with the reading block (the receiver, or RX). In the right drawing of FIG. 2A, an implementation is illustrated. FIFO storage 201 has two input pins, DIN and PUT, and one output pin, OK_to_PUT, for handling the write operation. DIN is the data input port used to receive the data TX_DATA coming from the TX. PUT is used to receive the write request from the TX. OK_to_PUT is used for outputting the signal that grants or denies the write request. The TX_DATA is controlled by transmitter clock CLKT, which is generated by clock generator TX 205. The READY_to_PUT signal is controlled by CLKT and logic cells 202 and 203. The OK_to_PUT signal is used to stop the clock CLKT through logic cell 204 when a certain condition is reached (such as the FIFO being full). For the read operation, a similar handshake mechanism is employed through pins DOUT, TAKE and OK_to_TAKE, signals RX_DATA, READY_to_TAKE and CLKR, logic cells 206, 207 and 208, and clock generator RX 209. Similarly, the receiver clock CLKR can be stopped by OK_to_TAKE when a certain condition is reached (such as the FIFO being empty).
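The clock-stopping handshake described above can be approximated by a small behavioral sketch. It is a simplification under stated assumptions and not the circuit of FIG. 2A: both clocks are modeled as integer periods on a shared time step, a stopped clock is modeled as a suppressed edge, and when both edges coincide the write is handled first. The function name `simulate_stoppable_clocks` and its parameters are hypothetical; only the names OK_to_PUT and OK_to_TAKE are taken from the description.

```python
from collections import deque


def simulate_stoppable_clocks(tx_period, rx_period, depth, n_words):
    """Behavioral sketch of a FIFO whose status suppresses clock edges.

    Returns (received_words, tx_edges_stopped, rx_edges_stopped).
    Periods are positive integers on an assumed common time step.
    """
    if min(tx_period, rx_period, depth) < 1:
        raise ValueError("periods and depth must be positive")
    fifo = deque()
    received = []
    next_word = 0
    tx_stopped = 0
    rx_stopped = 0
    t = 0
    while len(received) < n_words:
        t += 1
        # Transmitter side: a CLKT edge takes effect only if OK_to_PUT grants it.
        if t % tx_period == 0 and next_word < n_words:
            ok_to_put = len(fifo) < depth
            if ok_to_put:
                fifo.append(next_word)
                next_word += 1
            else:
                tx_stopped += 1  # edge suppressed: FIFO is full
        # Receiver side: a CLKR edge takes effect only if OK_to_TAKE grants it.
        if t % rx_period == 0:
            ok_to_take = len(fifo) > 0
            if ok_to_take:
                received.append(fifo.popleft())
            else:
                rx_stopped += 1  # edge suppressed: FIFO is empty
    return received, tx_stopped, rx_stopped


# Fast transmitter, slow receiver, only two storage locations.
words, tx_stalls, rx_stalls = simulate_stoppable_clocks(1, 3, 2, 10)
```

Under these assumptions every word arrives once and in order even with a very small storage, because the faster side is held back whenever the FIFO denies its request.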
FIG. 2B shows the structure of a synchronous FIFO. In the left drawing of FIG. 2B, synchronous FIFO 250 has four signals, Full_Status, Input_Data, Write_Clock and Write_Enable, that interface with the writing block (the transmitter, or TX) and four signals, Empty_Status, Output_Data, Read_Clock and Read_Enable, that interface with the reading block (the receiver, or RX). In the right drawing of FIG. 2B, an implementation is illustrated. FIFO storage 251 has two input pins, DIN and PUT, and one output pin, OK_to_PUT, for handling the write operation. DIN is the data input port used to receive the data TX_DATA coming from the TX. PUT is used to receive the write request from the TX. OK_to_PUT is used for outputting the signal that grants or denies the write request. The signals OK_to_PUT and PUT and the circuit blocks 252 and 253 work together to function as the signals Full_Status and Write_Enable associated with block 250. For the read operation, a similar operation is carried out through pins DOUT, TAKE and OK_to_TAKE, signals RX_DATA, TAKE, READY_to_TAKE and CLKR, and logic cells 254 and 255.
The key difference between the asynchronous FIFO of FIG. 2A and the synchronous FIFO of FIG. 2B is that the clock signals are modified by the FIFO status in the asynchronous FIFO, whereas the clock signals are free running at fixed rates in the synchronous FIFO. The asynchronous FIFO has the advantage of potentially using a smaller storage since the clock signals can be stopped. It can also achieve smaller data latency. However, it is difficult to obtain a high-quality (low jitter, low noise) output from the clock generator (usually an oscillator, such as an inverter ring) since the oscillator is turned on and off frequently. In the synchronous FIFO case, the clocks are free running. Thus, the system designer does not need to worry about the generation or manipulation of those clock signals, and standard digital design EDA tools are sufficient for the design task. However, depending on the size of the frequency difference between the TX and RX, the synchronous FIFO has the drawback of potentially using a larger storage.
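The dependence of storage size on the TX/RX frequency difference can be made concrete with a commonly used first-order estimate. It is a sketch under stated assumptions and is not taken from the present disclosure: the transmitter writes a finite burst of words back to back with free-running clocks, the receiver reads on every one of its cycles, and synchronizer latency and pointer overhead are ignored. The function name `min_fifo_depth` and its parameters are hypothetical, and both frequencies are assumed to be positive integers in the same unit.

```python
import math
from fractions import Fraction


def min_fifo_depth(burst_len, f_write, f_read):
    """First-order estimate of the words that must be buffered for one burst.

    While burst_len words are written at f_write, about
    burst_len * f_read / f_write words are read out; the remainder must be
    held in the FIFO. Exact rational arithmetic avoids rounding surprises.
    """
    if f_write <= 0 or f_read <= 0 or burst_len < 1:
        raise ValueError("frequencies and burst length must be positive")
    if f_read >= f_write:
        return 1  # reader keeps up; one location is assumed for the hand-off
    backlog = burst_len * (1 - Fraction(f_read, f_write))
    return max(1, math.ceil(backlog))


# Burst of 120 words, written at 100 MHz and read at 80 MHz (units cancel).
depth_needed = min_fifo_depth(120, 100, 80)
```

The estimate grows with both the burst length and the relative frequency gap, which is consistent with the observation that free-running clocks may call for a larger storage. For continuous, unbounded traffic with a faster writer, no finite depth suffices under this model.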
Referring back to FIG. 1, to improve the information processing efficiency of the heterogeneously clocked system 100, the key task is to develop an interface adapter 105 that can efficiently handle both the storage management and the clock management under the condition that the driving clocks of the communicating blocks can have different frequencies and/or phases.
This “Discussion of the Background” section is provided for background information only. The statements in this “Discussion of the Background” are not an admission that the subject matter disclosed in this “Discussion of the Background” section constitutes prior art to the present disclosure, and no part of this “Discussion of the Background” section may be used as an admission that any part of this application, including this “Discussion of the Background” section, constitutes prior art to the present disclosure.