The complexity of System on Chip (SoC) designs is rising inexorably, with a simultaneous increase in the number of integrated processors or circuits (IPs) operating at different clock frequencies. SoC design therefore faces the problem of arranging an efficient interaction between separate processors or circuits, so as to obtain the maximum possible data transfer rate with a low impact on the area.
The problem of synchronizing two semi-synchronous integrated processors or circuits (IPs) in an SoC, and various approaches to it, are widely documented in the literature. At least in principle, it is possible to design a specific method supporting a correct data exchange between two semi-synchronous integrated processors or circuits whose clock frequencies have a well defined ratio m/n.
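The relationship between two clocks with a frequency ratio m/n can be illustrated with a short sketch. The code below is not taken from any of the cited methods; it assumes ideal clocks that are phase-aligned at time 0 with no skew or jitter, and the names `edge_schedule` and `min_edge_gap` are hypothetical. It lists the rising edges of both clocks over one common period (normalized to 1) and the smallest nonzero spacing between an edge of one clock and an edge of the other, which is the kind of margin a synchronizer has to respect.

```python
from fractions import Fraction
from math import gcd


def edge_schedule(m: int, n: int):
    """Rising edges of clocks A and B with f_A / f_B = m / n.

    Assumption (illustrative only): ideal clocks, phase-aligned at t = 0.
    Times are expressed as fractions of the common period, in which
    clock A completes m cycles and clock B completes n cycles
    (after reducing m/n to lowest terms).
    """
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive integers")
    g = gcd(m, n)
    m, n = m // g, n // g
    edges_a = [Fraction(k, m) for k in range(m)]
    edges_b = [Fraction(k, n) for k in range(n)]
    # Instants at which both clocks have a rising edge.
    aligned = sorted(set(edges_a) & set(edges_b))
    return edges_a, edges_b, aligned


def min_edge_gap(m: int, n: int):
    """Smallest nonzero distance between an A edge and a B edge.

    Returns None when the two clocks have no non-coincident edges
    (for example m == n).
    """
    edges_a, edges_b, _ = edge_schedule(m, n)
    gaps = [abs(a - b) for a in edges_a for b in edges_b if a != b]
    return min(gaps) if gaps else None


if __name__ == "__main__":
    a, b, both = edge_schedule(3, 2)
    print("A edges:", [str(t) for t in a])
    print("B edges:", [str(t) for t in b])
    print("aligned:", [str(t) for t in both])
    print("min gap:", min_edge_gap(3, 2))
```

Under these assumptions, the edge pattern repeats every common period, which is why a method tailored to one ratio m/n does not automatically carry over to another ratio.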
A few methods to synchronize semi-synchronous domains are known in the literature for specific frequency ratios, each of which exploits straightforward circuitry to address synchronization issues related to data exchange. However, these approaches are not suited to handling all possible frequency ratios m/n.
Moreover, these specific methods often require several signals that need to be generated by a clock generation subsystem to drive reliable transfers, further increasing the post-layout simulation effort needed for chip validation.
In fact, these specific methods require that the clock generator subsystem, for each synchronizer instantiated within the SoC, generate, route and calibrate some control signals needed to store some information, for instance, regarding the frequency ratio and other characteristics of the clock signals which are necessary for a correct data exchange.
For these reasons, it is expensive, at least in terms of time and resources, to use such methods to synchronize semi-synchronous domains. They do not manage all the frequency ratios in a flexible and general way, and so do not address the synchronization problem systematically and automatically. They also require a device having a high impact on the clock generator subsystem and on the area of the SoC as a whole.
In the specific literature, a few methods exist that address the problem of providing synchronization according to a systematic approach and without loading the clock generator subsystem. Such methods exploit a conventional dual port RAM or other unconventional FIFO buffers (dual port based methods) in which only two buffer stages are used, one of which is synchronous to the faster clock and the other to the slower clock.
Such methods are based on the specific relationship between the two clocks involved, which drives the swap of data from the input buffer to the output buffer in order to synchronize the data. Other similar methods exploit a double input buffer to reduce transfer latency and similarly swap the data to the output buffer, avoiding metastability issues.
However, methods that ensure data synchronization by using a dual port RAM have the disadvantage of introducing large latencies that dramatically reduce the throughput of the system. A dual port RAM inevitably introduces some delay because of the need to synchronize the read and write ports, which work at different frequencies.
This issue is partially addressed in a method based on a dual port RAM when a data stream is sent, but an intrinsic delay remains when communication is bursty. In fact, one datum always needs at least two clock cycles to be transmitted.
Moreover, a method that uses a FIFO (with two or more stages) to buffer a wide data bus, such as those generally present in an SoC, is inevitably much more area consuming. This may be a critical aspect that leads designers to limit the use of these approaches as much as possible.
There is a need for a synchronization method that is able to guarantee the maximum possible data transfer for each frequency ratio m/n, without introducing latencies for improper data-rate exchange, and with a low impact on the clock generator subsystem and on the area.
More particularly, there is also a need for managing only the bus communication control signals in a selective way while respecting the timing constraints imposed by the two integrated processors or circuits involved, by the bus wires, by the clock frequencies and by the constraints of the technology implemented for the system.
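As a rough illustration of what "maximum possible data transfer" can mean for a given ratio m/n, the sketch below computes an upper bound under a simplifying assumption that is not stated in the source: each domain handles at most one bus word per cycle of its own clock. The function names are hypothetical. Over one common period the first clock completes m cycles and the second n cycles (with m/n in lowest terms), so no more words can cross than the slower side has cycles.

```python
from fractions import Fraction
from math import gcd


def max_transfers_per_common_period(m: int, n: int) -> int:
    """Upper bound on words exchanged per common period of the two clocks.

    Assumption (illustrative only): at most one word is launched per
    sender cycle and at most one word is captured per receiver cycle,
    so the slower clock limits the exchange.
    """
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive integers")
    g = gcd(m, n)
    return min(m // g, n // g)


def fast_side_utilization(m: int, n: int) -> Fraction:
    """Fraction of fast-clock cycles that can carry a word at that bound."""
    g = gcd(m, n)
    m, n = m // g, n // g
    return Fraction(min(m, n), max(m, n))


if __name__ == "__main__":
    for ratio in [(3, 2), (2, 3), (5, 1), (1, 1)]:
        print(ratio,
              max_transfers_per_common_period(*ratio),
              fast_side_utilization(*ratio))
```

A scheme that adds idle cycles beyond this bound falls short of the data rate the slower clock could sustain, which is the kind of loss the stated need aims to avoid.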