All electronic systems are inherently asynchronous. By carefully choreographing transitions with clock signals, asynchronous circuitry can be made to appear to behave synchronously. Such synchronism brings significant advantages: it greatly simplifies the design effort and, because timing is predictable, it allows performance guarantees to be made. However, synchronism comes at a significant cost: one must create a clock distribution network (CDN) that supplies a common reference signal to all synchronous components. The CDN distributes the clock signal from a single oscillator to state-holding components, such as flip-flops. Historically, the primary design goal for CDNs has been to ensure that the clock signal arrives at every synchronous component at precisely the same time, that is, with zero clock skew. Another typical design goal is to maintain signal integrity while distributing the clock widely. In the ideal case, transitions in the clock signal arrive at all state-holding elements at precisely the same moment, so there is zero clock uncertainty. Achieving this synchronization and signal integrity can be difficult and costly in terms of design effort and resources. In modern large-scale integrated circuits, the CDN accounts for significant area, consumes significant power, and often limits overall circuit performance. With increasing variation in circuit parameters, designing CDNs with tolerable clock skew is becoming a major design bottleneck.
Completely asynchronous design methodologies have been studied for decades, but they have never gained widespread acceptance. Instead of synchronizing transitions with a global clock, asynchronous systems are organized as a set of components that communicate using handshaking mechanisms. One drawback of asynchronous methodologies is the overhead and silicon real estate required for these handshaking mechanisms. In contrast, circuits with multiple independent clock domains, such as circuits that are globally asynchronous but locally synchronous (GALS), have become common. GALS architectures consume less dynamic power and can achieve better performance than architectures with a single clock domain. Splitting the clock domains reduces the cost of the distribution network, but the handshaking circuitry needed at domain crossings is relatively complex and problematic, so the splitting is performed only at a coarse level.