Strict global synchrony is becoming prohibitively difficult to implement in large chips. The increasingly complex clock distribution techniques used to minimize clock skew, e.g. those involving distributed active skew control, consume a growing portion of the total power, more than 30% in high-end microprocessors. Clock distribution using standing waves has also been proposed. This facilitates high-speed clocks with very low skew. However, because the clock is implemented as standing waves in a grid structure, its frequency depends on the parameters of on-chip components. Alternatively, a larger skew may be accepted at the cost of performance, since the timing margin incurred constitutes an increasing percentage of the total cycle time. Ultimately, failure to meet the challenges of implementing a globally spanning synchronous clock signal may render an entire chip non-functional due to hold-time violations.
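The cost of accepting a larger skew can be illustrated with a small calculation. The following is a minimal sketch only: the function name and the skew and frequency values are hypothetical and not taken from any particular design. It shows that a fixed skew budget takes up a growing share of the cycle as the clock frequency rises.

```python
def skew_margin_fraction(skew_ps: float, clock_ghz: float) -> float:
    """Fraction of one clock cycle consumed by a fixed skew margin.

    skew_ps   -- skew budget in picoseconds (hypothetical value)
    clock_ghz -- clock frequency in GHz (hypothetical value)
    """
    cycle_ps = 1000.0 / clock_ghz  # a 1 GHz clock has a 1000 ps cycle
    return skew_ps / cycle_ps


if __name__ == "__main__":
    # Same assumed 50 ps skew budget at three assumed clock frequencies.
    for freq_ghz in (1.0, 2.0, 4.0):
        share = skew_margin_fraction(50.0, freq_ghz)
        print(f"{freq_ghz:.0f} GHz: skew margin is {share:.0%} of the cycle")
```

Under these assumed numbers, a 50 ps margin is 5% of the cycle at 1 GHz and 20% at 4 GHz.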
Meanwhile, physical issues as well as design complexity issues push for a modularized design approach. There is a general consensus that the design tasks of future billion-transistor system-on-chip designs are best accommodated by plugging together individually verified blocks, using shared, segmented chip-area interconnection networks. Recent years have seen research into the area of so-called Network-on-Chip (NoC). NoC facilitates a truly modular and scalable design approach for Systems-on-Chip (SoC).
The partitioning of chip functionality into submodules, or cores, enables a partitioning of the timing as well. The globally asynchronous locally synchronous (GALS) approach implements synchronous islands which communicate asynchronously. Drawbacks of the GALS approach include the risk of data and control metastability when crossing the boundary between the asynchronous and synchronous domains, as well as the overhead of implementing circuits that provide timing-safe cross-domain transmission.
Alternatively, mesochronous clocking may be applied. Mesochronously clocked systems employ a single clock across the entire system, but with different phases. In a generalized form, nothing can be said concerning the phase alignment between cores in different clock-phase domains. Thus metastability may occur when passing data from one domain to another. Mesochronously clocked systems benefit from leveraging existing synchronous design tools and know-how, while avoiding the drawbacks of strict global synchrony. First, the peak current at the global clock edge is avoided; this peak leads to ground bounce and voltage drops, which in turn induce jitter in both clock and data. Second, power dissipation in the clock distribution network is significantly reduced, since the power-hungry clock trees needed to reduce global clock skew are avoided.
Methods for avoiding metastability have been proposed in various forms. Work has also aimed at containing the clock skew in mesochronous systems, such as El-Amawy, “Clocking arbitrarily large computing structures under constant skew bound”, IEEE Transactions on Parallel and Distributed Systems 4, 1993, pp. 241-255. That reference presents a network of interacting clock-generating nodes. The method guarantees an upper bound on local skew. However, the node interaction involves loops, and only the absolute value of the skew is guaranteed, not its sign (positive or negative). Thus, in a practical system, hold-time violations are still possible. Also, the practical implementation of the nodes is somewhat complex, introducing a non-negligible overhead.
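Why a bound on the absolute skew alone does not rule out hold-time violations can be sketched with the textbook hold check between a launching and a capturing register. This is an illustrative sketch, not the method of the cited reference. The function names, the sign convention (skew is positive when the capturing clock edge arrives later than the launching one), and all delay values are assumptions chosen for the example.

```python
def hold_slack_ps(t_clk_to_q_ps: float, t_logic_min_ps: float,
                  t_hold_ps: float, skew_ps: float) -> float:
    """Hold slack of a register-to-register path; negative means a violation.

    Assumed convention: skew_ps = capture clock arrival - launch clock arrival.
    New data reaches the capturing register after t_clk_to_q + t_logic_min,
    and must not arrive before t_hold has elapsed after the (skewed) capture edge.
    """
    return t_clk_to_q_ps + t_logic_min_ps - t_hold_ps - skew_ps


def hold_safe_under_bound(t_clk_to_q_ps: float, t_logic_min_ps: float,
                          t_hold_ps: float, skew_bound_ps: float) -> bool:
    """True if the path is hold-safe for every skew with |skew| <= skew_bound_ps.

    With the sign unknown, the worst case is skew = +skew_bound_ps.
    """
    worst = hold_slack_ps(t_clk_to_q_ps, t_logic_min_ps, t_hold_ps, skew_bound_ps)
    return worst >= 0


if __name__ == "__main__":
    # Hypothetical path: 40 ps clock-to-Q, 20 ps minimum logic delay, 30 ps hold time.
    for skew in (-50.0, 50.0):  # same magnitude, opposite sign
        print(f"skew {skew:+.0f} ps -> hold slack {hold_slack_ps(40.0, 20.0, 30.0, skew):+.0f} ps")
    print("safe for |skew| <= 50 ps:", hold_safe_under_bound(40.0, 20.0, 30.0, 50.0))
```

With these assumed values, the two skews have the same magnitude, yet only the positive one gives a negative slack. A guarantee on the absolute value therefore has to be combined with enough minimum path delay to cover the worst-case sign.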