One challenge in digital logic design is how to pass signals between different timing domains quickly and reliably. In modern logic circuits, such as graphics processing units (GPUs) and accelerated processing units (APUs) in which one or more GPUs are combined with one or more central processing unit (CPU) cores, the number of different timing domains is large and increasing. For example, modern GPUs are massively parallel and incorporate a large number of heterogeneous processing cores that exchange data and control signals. These processing cores are internally synchronous but operate asynchronously with respect to each other because their respective clock signals have no predefined relationship.
In this environment, interface circuits between clock domains have two problems. First, when clocking signals from one timing domain into another timing domain, the signals may experience metastability. Metastability can arise when data is transferred between two clock domains that operate asynchronously with respect to each other. Capturing circuits such as flip-flops cannot reliably capture an input signal that changes during a transition in the clock signal, because the input signal is itself in mid-transition. Not only is the data not captured correctly at that clock edge, but the capturing circuit may itself capture an intermediate mid-point value, which is then output to the next stage requiring data. Moreover, the time it takes for the capturing circuit to become “unconfused” can in rare cases be quite long. Once a flip-flop becomes metastable, its output can take a significant amount of time to settle to a recognizable logic state, and the state it settles to is sometimes not the correct one. During metastability the output signal can take many forms, such as holding an intermediate voltage or oscillating for an extended period.
Second, ensuring reliable transmission increases latency and reduces the performance of the chip. Known interfaces that reduce metastability either require a number of clock cycles, which increases latency and lowers the performance of the system, or require expensive buffering.
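The trade-off between latency and reliability described above can be illustrated with the commonly used textbook estimate of a synchronizer's mean time between failures, MTBF = exp(t_r / tau) / (T_w * f_clk * f_data). The following is a minimal sketch only. The function names, the default values for the resolution time constant (tau) and metastability window (T_w), and the simplification that each cycle of added latency contributes one full clock period of resolution time are illustrative assumptions and are not taken from this description.

```python
import math


def synchronizer_mtbf(resolve_time_s, tau_s, window_s, f_clk_hz, f_data_hz):
    """Textbook MTBF estimate in seconds: exp(t_r / tau) / (T_w * f_clk * f_data)."""
    return math.exp(resolve_time_s / tau_s) / (window_s * f_clk_hz * f_data_hz)


def mtbf_after_cycles(cycles, f_clk_hz, f_data_hz, tau_s=20e-12, window_s=30e-12):
    """MTBF when the captured signal is given `cycles` clock periods to resolve.

    Assumption: each cycle of added latency contributes one full clock period
    of resolution time (setup and clock-to-output delays are ignored).
    The tau_s and window_s defaults are hypothetical device parameters.
    """
    resolve_time_s = cycles / f_clk_hz
    return synchronizer_mtbf(resolve_time_s, tau_s, window_s, f_clk_hz, f_data_hz)


if __name__ == "__main__":
    # Hypothetical operating point: 1 GHz capture clock, 100 MHz data toggle rate.
    for cycles in (1, 2, 3):
        mtbf = mtbf_after_cycles(cycles, f_clk_hz=1e9, f_data_hz=100e6)
        print(f"{cycles} cycle(s) of latency -> estimated MTBF {mtbf:.3e} s")
```

Under these assumptions, each additional clock cycle of waiting multiplies the estimated MTBF by exp(T_clk / tau), which is why interfaces of this kind gain reliability only by adding latency.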
In the following description, the use of the same reference numerals in different drawings indicates similar or identical items. Unless otherwise noted, the word “coupled” and its associated verb forms include both direct connection and indirect electrical connection by means known in the art, and unless otherwise noted any description of direct connection implies alternate embodiments using suitable forms of indirect electrical connection as well.