Many electronic systems and applications require the transmission of data between clock domains of varying frequencies. When a logical path crosses from one clock domain to another, the designer of a circuit or system takes into account the timing requirements for all valid clock ratios between the relevant clock domains. The clock ratio between any two domains is typically defined as N:M, where N is the faster clock frequency and M is the slower clock frequency. The clock ratio between two clock domains determines the amount of delay allowed in a logical path that crosses between those clock domains.
Logical implementations of cross-domain interfaces that satisfy multiple clock ratios generally transmit at least some of the data directly from a source domain into a destination domain at the slower clock ratio. In some previous systems, a multiplexer is used to mux the directly transmitted data with other data that arrives from the source domain through a less direct logical path. A block diagram of an exemplary system having such a configuration is provided in FIG. 1, which is described in detail below.
FIG. 1 is a block diagram of an exemplary system 100, including a bypass multiplexer, for transmitting data between different clock domains. FIG. 1 shows, for purposes of example, an embodiment suitable for transmitting data from a faster clock domain to a slower clock domain.
As illustrated in FIG. 1, data that is to be transmitted from the faster clock domain to the slower clock domain is transmitted into master latch 101, then shifted into slave latch 102. Latches 101 and 102 reside in the faster clock domain. The data being transmitted as output from slave latch 102 may be separated according to the clock ratio of its logical path. In some systems, data having an N:1 ratio (N:1 transmit data 112) is transmitted directly into multiplexer 105.
In some systems, data having a clock ratio of N:2 (N:2 transmit data 111) is transmitted first into master latch 103, then transmitted into slave latch 104. N:2 transmit data 115 is then supplied from slave latch 104 to multiplexer 105.
Multiplexer 105 selects between N:1 transmit data 112 and N:2 transmit data 115. The selection of multiplexer 105 is controlled by a multiplexer select signal 118. The transmit data 116 output from multiplexer 105 is supplied to downstream logic 107.
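As a purely illustrative aid, the selection performed by multiplexer 105 can be modeled in a few lines of Python. The function names, the 4-bit width, the sample values, and the polarity of the select signal (true selecting the N:1 data) are assumptions made for this sketch and are not taken from FIG. 1.

```python
def bypass_mux(n1_data: int, n2_data: int, select_n1: bool) -> int:
    """Model one bit-slice of multiplexer 105 (select polarity is assumed)."""
    return n1_data if select_n1 else n2_data


def mux_bus(n1_bus, n2_bus, select_n1):
    """Apply one multiplexer per bit of a multi-bit interface."""
    if len(n1_bus) != len(n2_bus):
        raise ValueError("both inputs must have the same width")
    return [bypass_mux(a, b, select_n1) for a, b in zip(n1_bus, n2_bus)]


# Hypothetical 4-bit example: data 112 arrives directly from slave latch 102,
# data 115 arrives by way of latches 103 and 104.
n1_transmit_data_112 = [1, 0, 1, 1]
n2_transmit_data_115 = [0, 0, 1, 0]
transmit_data_116 = mux_bus(n1_transmit_data_112, n2_transmit_data_115, select_n1=True)
```

The sketch is purely combinational; it does not model the latches of FIG. 1 or any clocking.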
In some systems, data is transmitted from downstream logic section 107 into a third latch bank, comprising master latch 108 and slave latch 109, both of which reside in the slower clock domain.
In data transfer interfaces, the clock ratio of a logical path is used to determine whether data traveling that path will be transmitted directly into the destination domain (e.g., the path of N:2 transmit data 111 in FIG. 1) or whether it first travels through the bypass multiplexer (e.g., the path of N:1 transmit data 112 in FIG. 1).
In many electronic circuits, including those that use latches, setup and hold times must be taken into account when designing the circuit to prevent or decrease the likelihood of circuit failure. The presence of jitter and skew in a circuit causes a reference signal to be indeterminate for a period of time before and after a scheduled state change. “Setup” time refers to the minimum amount of time that must exist between a reference signal changing state and a capture event to ensure that the reference signal is accurately captured. “Hold” time refers to the minimum amount of time that a reference signal must be held at its new state after a state change in order to ensure that the new state is accurately captured.
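The setup and hold requirements can be illustrated with a minimal sketch. It assumes the conventional formulation in which the reference signal must not change inside a window that opens one setup time before the capture event and closes one hold time after it. The function name, the integer picosecond units, and the numeric values are hypothetical.

```python
def check_capture(change_times_ps, capture_time_ps, t_setup_ps, t_hold_ps):
    """Classify a capture event given the times at which the signal changes.

    Assumed convention: a change exactly t_setup_ps before the capture event,
    or exactly t_hold_ps after it, is treated as just meeting the requirement.
    """
    for t in change_times_ps:
        if capture_time_ps - t_setup_ps < t <= capture_time_ps:
            return "setup violation"
        if capture_time_ps < t < capture_time_ps + t_hold_ps:
            return "hold violation"
    return "ok"


# Hypothetical numbers: capture at 1000 ps, 50 ps setup time, 30 ps hold time.
early_enough = check_capture([900], 1000, 50, 30)   # changes 100 ps before capture
too_late = check_capture([980], 1000, 50, 30)       # changes 20 ps before capture
released_early = check_capture([1010], 1000, 50, 30)  # changes 10 ps after capture
```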
Generally, for those paths for which the faster clock frequency is an integer multiple of the slower clock frequency (thus an N:1 ratio), the logical path between the clock domains has a full fast clock cycle to satisfy setup requirements. FIG. 2 is an exemplary timing diagram showing the allowable delay time for a multi-domain clock interface in a 2:1 clock ratio mode.
Referring to FIG. 2, the frequency of slow clock signal 202 is half of the frequency of fast clock signal 201, yielding a 2:1 clock ratio.
Arrow 203 shows the width of one full cycle of the fast clock signal.
In the case that slow clock signal 202 is a reference signal and the rising edge of fast clock signal 201 is a capture event, arrow 204 shows the delay between the reference clock launch and the capture event for an N:1 clock ratio. As shown by arrow 204 in FIG. 2, a full cycle of fast clock signal 201 exists between the launch of the reference signal and the capture event. The same is true in the case that fast clock signal 201 is a reference signal and the rising edge of slow clock signal 202 is a capture event, as shown by arrow 205.
For a logical path having a 2:1 clock ratio, the logical path has a full cycle of fast clock signal 201 in which to satisfy setup requirements. Thus, a 2:1 clock ratio allows for the maximum amount of delay possible for resolving timing criticalities.
In the examples described herein, an N:1 clock ratio is assumed for purposes of example to be the ratio in the interface that allows for the most delay. However, some embodiments may not contain any logical paths having an N:1 ratio. The data in an interface that allows for the largest amount of delay may follow the paths described in the examples as the N:1 paths. The present disclosure is applicable regardless of the specific clock ratios present in a particular interface.
Referring again to FIG. 1, because an N:1 path allows for the greatest amount of delay in the logical path, the N:1 path generally will use the path through the bypass multiplexer 105 (FIG. 1). For other clock ratios, the logical path will have a fraction of the fast clock cycle to satisfy setup requirements. See, e.g., description of FIG. 3 below. Logical paths having clock ratios other than N:1 generally use the more direct path to the destination domain.
FIG. 3 is an exemplary timing diagram showing the allowable delay time for a multi-domain clock interface in 3:2 clock ratio mode.
In FIG. 3, the frequency of slow clock signal 302 is two-thirds of the frequency of fast clock signal 301, yielding a 3:2 clock ratio.
Arrow 303 shows the width of one full cycle of the fast clock signal.
In the case that slow clock signal 302 is a reference signal and the rising edge of fast clock signal 301 is a capture event, arrows 304 and 306 show two different possible delay times. In the first instance, represented by arrow 304, a delay equal to one full cycle of fast clock signal 301 is available to satisfy setup requirements between the first slow clock signal 302 launch and the fast clock signal 301 rising edge capture event. However, during the second cycle of slow clock signal 302, a delay of only one-half of a cycle of fast clock signal 301 is available between the second slow clock signal 302 launch and the fast clock signal 301 rising edge capture event. This scenario is represented by arrow 306.
Arrow 305 represents a delay of one-half of a cycle of fast clock signal 301 between a launch of fast clock signal 301 and the next rising edge of slow clock signal 302.
For a logical path having a 3:2 clock ratio, the logical path may have only a fraction of a cycle of fast clock signal 301 in which to satisfy setup requirements. Thus, a 3:2 clock ratio allows for significantly less delay for resolving timing criticalities, making a logical path having a 3:2 clock ratio significantly more time-critical.
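The delay windows discussed for the 2:1 and 3:2 ratios can be reproduced with a short sketch. It assumes ideal clocks (no skew or jitter) whose rising edges coincide at time zero, measures time in cycles of the fast clock, and takes the capture event to be the first rising edge of the capturing clock strictly after the launch. The function name and its parameters are hypothetical.

```python
from fractions import Fraction
import math


def launch_to_capture_delays(n, m, launch="slow"):
    """List launch-to-capture delays, in fast clock cycles, for an N:M ratio.

    One entry is returned per launch edge in the repeating pattern of
    n fast cycles (equal to m slow cycles).
    """
    if not (n >= m > 0):
        raise ValueError("expected n >= m > 0")
    if launch not in ("slow", "fast"):
        raise ValueError("launch must be 'slow' or 'fast'")

    fast_period = Fraction(1)
    slow_period = Fraction(n, m)
    if launch == "slow":
        launch_period, capture_period, count = slow_period, fast_period, m
    else:
        launch_period, capture_period, count = fast_period, slow_period, n

    delays = []
    for k in range(count):
        t = k * launch_period
        # First capturing edge strictly later than the launch edge.
        next_capture = (math.floor(t / capture_period) + 1) * capture_period
        delays.append(next_capture - t)
    return delays


# 2:1 ratio: the tightest window is one full fast cycle in either direction.
window_2_1 = min(launch_to_capture_delays(2, 1, "slow") + launch_to_capture_delays(2, 1, "fast"))

# 3:2 ratio, slow launch to fast capture: one full cycle, then one-half cycle.
slow_to_fast_3_2 = launch_to_capture_delays(3, 2, "slow")

# 3:2 ratio, fast launch to slow capture: the tightest window is one-half cycle.
fast_to_slow_3_2 = min(launch_to_capture_delays(3, 2, "fast"))
```

Exact fractions are used instead of floating point so that edges that coincide are compared reliably.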
In the examples described herein, an N:2 clock ratio is assumed for purposes of example to be the ratio in the interface that allows for the least delay. However, some embodiments may not contain any logical paths having an N:2 ratio. The data in an interface that allows for the least amount of delay may follow the paths described in the examples as the N:2 paths. The present disclosure is applicable regardless of the specific clock ratios present in a particular interface.
The multiplexer implementation of previous systems, such as the one of FIG. 1, causes significant difficulties with resolving timing violations. For example, in such an implementation as FIG. 1, clock skew and jitter reduce the amount of delay allowed in a logical path. FIG. 4 shows an example of how clock skew and jitter can reduce the delay available to satisfy setup requirements in an N:1 or N:2 logical path.
FIG. 4 is an exemplary timing diagram showing the timing implications of clock skew and jitter for data transmission from a faster clock domain to a slower clock domain. The same principles hold true for transmissions of data from a slower clock domain to a faster clock domain.
The effects of clock skew are shown with relation to fast clock signal 401 and slow clock signal 402. Clock skew and jitter reduce the amount of delay allowed in both the N:1 and N:2 paths. In the example of FIG. 4, the clock ratio between fast clock signal 401 and slow clock signal 402 is 3:2.
Arrow 403 shows the width of a full theoretical cycle of fast clock signal 401. Arrow 404 shows the width of a cycle of fast clock signal 401 minus the jitter time of that signal. Arrow 405 shows the width of a cycle of fast clock signal 401 plus the jitter time of that signal. The space between the right edge of arrow 404 and the right edge of arrow 405, then, represents the time during which the state of fast clock signal 401 is indeterminate. Similarly, reference 406 shows the times during which slow clock signal 402 may be indeterminate due to skew.
Arrow 407 shows a potential hold problem that is caused by the indeterminate arrival times of the fast clock signal 401 launch and slow clock signal 402 rising edge capture.
Arrows 408 and 409 show setup delays at two different cycles of slow clock signal 402. Note again that for clock ratios other than N:1, delay times for satisfying setup and hold requirements may vary because of odd clock ratios. In this instance, there is much more delay time available in the downstream clock cycle (arrow 409), as contrasted with one cycle of slow clock signal 402 earlier (arrow 408).
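The effect of skew and jitter on the available delay can be approximated with a simplified model. The sketch below assumes that each clock edge may arrive early or late by a fixed uncertainty, that the setup worst case is a late launch paired with an early capture, and that the hold worst case is an early launch paired with a late capture at nominally coincident edges. All names and numeric values are hypothetical and are not read from FIG. 4.

```python
from dataclasses import dataclass


@dataclass
class EdgeUncertainty:
    launch_ps: int   # assumed +/- uncertainty of the launching clock edge
    capture_ps: int  # assumed +/- uncertainty of the capturing clock edge


def setup_slack_ps(nominal_window_ps, path_delay_ps, t_setup_ps, unc):
    """Setup slack when the launch is late and the capture is early."""
    worst_window = nominal_window_ps - unc.launch_ps - unc.capture_ps
    return worst_window - path_delay_ps - t_setup_ps


def hold_slack_ps(min_path_delay_ps, t_hold_ps, unc):
    """Hold slack when the launch is early and the capture is late."""
    return min_path_delay_ps - t_hold_ps - unc.launch_ps - unc.capture_ps


# Hypothetical half-cycle window of a 1000 ps fast clock in 3:2 mode.
ideal = EdgeUncertainty(launch_ps=0, capture_ps=0)
noisy = EdgeUncertainty(launch_ps=40, capture_ps=40)

slack_ideal = setup_slack_ps(500, 350, 50, ideal)  # 500 - 350 - 50
slack_noisy = setup_slack_ps(500, 350, 50, noisy)  # 80 ps of the window is lost
hold_noisy = hold_slack_ps(60, 30, noisy)          # negative value means a hold violation
```

A negative slack in either function indicates that the corresponding requirement is not met under the assumed uncertainties.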
Some embodiments of the present invention, described in detail below, allow for greater flexibility in resolving the timing criticalities explained above, including criticalities related to clock skew and jitter.
In the case of logical paths that cross clock domains, the presence of skew and jitter may result in timing criticalities in any logical path, regardless of clock ratio. This is true whether data is being transmitted from a slower clock domain to a faster clock domain or from a faster clock domain to a slower clock domain.
Another limitation of the multiplexer implementation of FIG. 1 is that the multiplexer delay degrades the worst-case timing of the data path through multiplexer 105. As a result, a circuit designer makes adjustments to ensure that the setup requirements are satisfied. Such adjustments commonly involve the use of low voltage threshold (low-vt) devices or cycle stealing. Cycle stealing is a method by which a clock signal period may be manipulated to resolve timing criticalities at selected signal launch and capture points in a system. For example, if the delay of a particular logical path is longer than the period of its capture clock signal, the arrival time of the clock signal at the downstream latch may be delayed to effectively lengthen the time available to the path feeding that latch. Such a solution also results in the logical path on the other side of a cycle-stolen latch having an allowable delay that is less than the clock period by the amount of the cycle-steal delay, which in some systems may cause another potential timing criticality and require further measures to stabilize the system. Conversely, rather than delaying the clock of the capture latch, the clock arrival time of the launching latch could be advanced to prevent a potential timing failure.
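The trade-off inherent in cycle stealing can be shown with simple arithmetic. The sketch below assumes a single latch whose clock arrival is delayed by a fixed amount, with one logical path feeding the latch and one logical path leaving it, and it ignores setup time, skew, and jitter. The function names and the picosecond values are hypothetical.

```python
def cycle_steal_budgets(period_ps, steal_ps):
    """Return (upstream, downstream) allowable delays around a delayed latch.

    Delaying the latch clock by steal_ps gives the path feeding the latch
    more time and gives the path leaving the latch correspondingly less.
    """
    return period_ps + steal_ps, period_ps - steal_ps


def meets_timing(upstream_delay_ps, downstream_delay_ps, period_ps, steal_ps):
    """Check both paths against their budgets for a given cycle-steal delay."""
    upstream_budget, downstream_budget = cycle_steal_budgets(period_ps, steal_ps)
    return (upstream_delay_ps <= upstream_budget
            and downstream_delay_ps <= downstream_budget)


# Hypothetical 1000 ps clock with an upstream path that is 100 ps too long.
without_steal = meets_timing(1100, 800, 1000, steal_ps=0)
with_steal = meets_timing(1100, 800, 1000, steal_ps=150)
# The same steal fails if the downstream path needs more than 850 ps.
new_criticality = meets_timing(1100, 900, 1000, steal_ps=150)
```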
Solving timing criticalities entirely with low-vt devices would require using a low-vt device in each multiplexer bypass path, which would result in a substantial increase in leakage power for most multi-bit interfaces. Therefore, cycle stealing has commonly been preferred as a more power-efficient solution. However, as data transmissions have increased and circuits have become more complex, cycle stealing alone has often not been able to resolve all timing criticalities. The tight timing characteristics of many modern systems have required both low-vt devices and cycle stealing to be implemented, often resulting in relatively low maximum worst-case frequencies for logical paths.
Additionally, configurations that require a full bypass multiplexer for each bit of a cross-domain interface have large area and power requirements. For example, in the system of FIG. 1, at least one multiplexer is required for each bit of data to be transmitted. Even for the simplest interface, the requirement can add up to hundreds or thousands of multiplexers.
Hence, if the number of devices, such as multiplexers, required to be used in data transmissions between clock domains of varying frequencies could be reduced or replaced in such a manner as to reduce the required area, then power requirements may be reduced. Further, a solution that replaces multiplexers with other components (e.g., clock splitters) may allow for more flexibility in resolving timing criticalities. The reduction of required area and power and the allowance for more flexibility in resolving timing violations may further yield an increase in the worst-case frequency of data transmissions.
Therefore, there is a need in the art for improvements in the performance and efficiency of multi-clock-domain data transmission interfaces.