Hardware Description Languages (HDLs) have been used for many years to design digital circuits. Such languages employ text-based expressions to describe hardware devices, enabling designers to design larger and more complex systems than was possible using previously known gate-level design methods. With HDLs, designers are able to use various constructs to fully describe hardware components and the interconnections between hardware components. Two popular Hardware Description Languages are Verilog, first implemented by Phil Moorby of Gateway Design Automation in 1984 and later standardized under IEEE Std. 1364 in 1995, and VHDL (Very High Speed Integrated Circuit (VHSIC) Hardware Description Language), standardized in IEEE Std. 1076. Both of these languages, and other similar languages, have been widely used to design hardware circuits.
At one level of abstraction, Verilog and VHDL may operate as Register-Transfer Level (RTL) languages in which circuits have, or are abstracted to have, a set of registers. A designer may use an RTL description to specify the values of the registers in each clock cycle in terms of the values of the registers in the preceding clock cycle. In this way, an RTL model implements a finite state machine (FSM) of the circuit to be specified.
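By way of illustration only, the register-transfer view described above may be sketched in Python as a next-state function applied once per clock cycle. The two-register circuit, the names Regs, next_state and run, and the enable input are hypothetical and are not drawn from any particular design; the sketch merely shows register values in one cycle being computed from those of the preceding cycle.

```python
from typing import List, NamedTuple


class Regs(NamedTuple):
    """Register values of a hypothetical two-register circuit."""
    count: int      # 2-bit counter register (values 0..3)
    overflow: bool  # set for one cycle after the counter wraps


def next_state(regs: Regs, enable: bool) -> Regs:
    """Compute the register values for the next clock cycle from the current ones."""
    if not enable:
        # Counter holds its value; overflow flag is cleared.
        return Regs(regs.count, False)
    wrapped = regs.count == 3
    return Regs((regs.count + 1) % 4, wrapped)


def run(cycles: int, enable: bool = True) -> List[Regs]:
    """Apply next_state once per simulated clock cycle and record every state."""
    regs = Regs(count=0, overflow=False)
    trace = [regs]
    for _ in range(cycles):
        regs = next_state(regs, enable)
        trace.append(regs)
    return trace
```

Calling run(5) steps the hypothetical counter through five clock cycles; the list of states visited is the trace of the finite state machine that the next-state function defines.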
At another level of abstraction, Verilog and VHDL support a behavioral specification approach. In a behavioral specification approach, the focus is on the functions performed by the circuit, rather than on individual register values. One language that is particularly adapted to this type of approach is SystemC, an open-source kernel that extends the C++ language to enable hardware modeling.
Yet as the complexity of digital circuits has increased, both RTL and behavioral circuit specification techniques have shown their limitations. New HDLs utilizing Term Rewriting System (TRS) technology have addressed some of the limitations of the conventional methods. A TRS adapted for hardware design employs a list of “terms” that describe hardware states, and a list of “rules” that describe hardware behavior. A “rule” captures both a state-change (an action) and the conditions under which the action can occur. Further, each rule has atomic semantics—that is, each rule executes fully without interactions with other rules. This implies that rules may be considered in isolation for analysis and debugging purposes.
More formally, a Term Rewriting System has rules that consist of a predicate (a function that evaluates to logical true or false) and an action body (a description of a state transition). In an alternate terminology, the predicate may be called the guard of the rule. A rule may be written in the following form:
    rule r: when π(s) => s := δ(s)
where s is the state of the system, π is the predicate, and δ is a function used to compute the next state of the system. In a strict implementation of a TRS, only one rule may execute on a given state. However, in modern TRSs, a scheduler is typically used to allow concurrent execution of rules if the rules do not conflict. That is, when several rules have predicates that are true, and the rules do not conflict, modern implementations of TRSs take advantage of concurrent execution to generate more efficient hardware. Due to the atomic semantics of the rules, the state resulting from concurrent execution is the same as if the rules had been executed serially. After all applicable non-conflicting rules are executed on a given state of the system, all rules are re-evaluated for applicability on the new state, and the process continues until no further rules are applicable.
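The rule form above may be illustrated, purely as a sketch, by a small Python model of a strict TRS in which one applicable rule fires per step until no rule is applicable. The Rule class, the rewrite function, the max_steps limit, and the two example rules (which compute a greatest common divisor of two positive integers by repeated swapping and subtraction) are assumptions made for illustration; they do not represent the scheduler of any particular TRS-based HDL, and no concurrent rule execution is modeled.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[int, int]  # hypothetical system state s = (x, y)


@dataclass(frozen=True)
class Rule:
    name: str
    guard: Callable[[State], bool]    # the predicate pi(s)
    action: Callable[[State], State]  # the next-state function delta(s)


# Illustrative rules; inputs are assumed to be positive integers.
RULES: List[Rule] = [
    Rule("swap", lambda s: s[0] > s[1] and s[1] != 0, lambda s: (s[1], s[0])),
    Rule("subtract", lambda s: s[0] <= s[1] and s[1] != 0, lambda s: (s[0], s[1] - s[0])),
]


def rewrite(state: State, rules: List[Rule], max_steps: int = 10_000) -> Tuple[State, List[str]]:
    """Strict scheduling: fire one applicable rule per step until none applies."""
    fired: List[str] = []
    for _ in range(max_steps):
        applicable = [r for r in rules if r.guard(state)]
        if not applicable:
            return state, fired  # no rule is applicable: rewriting is finished
        rule = applicable[0]     # pick one rule; it executes atomically
        state = rule.action(state)
        fired.append(rule.name)
    raise RuntimeError("no terminal state reached within max_steps")
```

For example, rewrite((6, 15), RULES) repeatedly re-evaluates both guards on each new state and stops at a state with y equal to zero, at which point x holds the greatest common divisor.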
As with conventional HDLs, circuit specifications constructed with HDLs that employ TRS rules are generally structured into a plurality of modules, where each module performs a particular sub-set of functions of the overall circuit design. In contrast to conventional HDLs that generally specify the internal behaviors of modules using well-known always blocks, HDLs that employ TRS rules generally use TRS rules to specify the behaviors. Also, as with conventional HDLs, modules specified using TRS rules generally intercommunicate with each other using defined interfaces. Interfaces provide a structured way to pass signals between modules, and enhance the modularity of designs. An interface to another module may be provided to a module as part of the module's environment, may be instantiated within the module, or may be supplied as arguments to the module.
Furthermore, as with conventional HDLs, HDLs that employ TRS rules typically generate hardware that is synchronous. A synchronous system is one where device outputs change in response to being triggered at particular intervals by clock signals. The use of clocks advantageously synchronizes the system and is valuable in accounting for propagation delays and other parameters of fabricated devices. While TRS rules themselves seldom explicitly reference clock signals, clock signals are an intrinsic part of virtually all TRS-based hardware designs.
Intercommunication among modules is relatively straightforward when all the modules are driven by the same clock, i.e. when all modules are within the same clock domain. A clock domain is defined as a portion of a hardware design that is driven by a single clock signal so that devices within the domain operate in a synchronous, i.e. “in phase”, manner.
Intercommunication among modules is only slightly more complex when modules are driven by clocks of different clock domains but the clocks of the different clock domains are all of the same family. Clocks of the same family are defined as clocks driven by the same oscillator, but possibly differing in gating, i.e. activation. For example, it may be desirable in a particular design to “deactivate” a clock in a particular portion of a circuit to save power when that portion of the circuit is not in use. Accordingly, a control signal may be provided with the oscillator signal of the clock, and the control signal may “gate off” the clock in that portion of the circuit at certain times. Simultaneously, other portions of the circuit may be driven by clock signals from the same oscillator that are “gated on.” While the “gated off” and “gated on” clocks differ in activation, when they are both active they are exactly in phase, and thus considered of the same family.
Intercommunication among modules becomes more complex when the modules are driven by clocks from different clock domains, where the clocks of the clock domains are of different families. Clocks of different families are generally driven by different oscillators and therefore may be considerably out of phase with one another. Due to the lack of a synchronous phase relationship between the clocks, clocks of different families are often referred to as asynchronous clocks. Multiple asynchronous clocks are used in a wide variety of hardware designs. They are particularly common in System-on-a-Chip (SoC) designs that integrate a number of components of a computer or other complex electronic system into a single chip. Asynchronous clocks are used to advantage in SoC designs to support multiple bus systems; for example, they may be used to support the well-known Peripheral Component Interconnect (PCI) and Universal Serial Bus (USB) within the same SoC. Furthermore, asynchronous clocks are often used to accommodate the large size of SoCs, which may prevent a single “fast” clock from being effectively distributed over the entire design due to transmission delays.
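The distinctions drawn above among clock domains and clock families may be summarized, as a sketch only, by the following Python model. The Clock class, its oscillator and gate fields, the function names, and the example clock names are hypothetical; the model assumes that a clock is fully characterized by its driving oscillator and its gating control signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clock:
    oscillator: str  # hypothetical name of the driving oscillator
    gate: str = ""   # hypothetical name of the gating control signal; "" means ungated


def same_domain(a: Clock, b: Clock) -> bool:
    """Same clock signal: same oscillator and same gating."""
    return a == b


def same_family(a: Clock, b: Clock) -> bool:
    """Same oscillator, possibly differing in gating."""
    return a.oscillator == b.oscillator


def crossing_kind(src: Clock, dst: Clock) -> str:
    """Classify a signal passed from a module clocked by src to one clocked by dst."""
    if same_domain(src, dst):
        return "same domain"
    if same_family(src, dst):
        return "same family"
    return "asynchronous"
```

Under these assumptions, crossing_kind(Clock("osc_a"), Clock("osc_a", "sleep")) yields "same family", while two clocks with different oscillator names are classified as "asynchronous".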
Absent special provisions, when a data or control signal is sent from a module of a first clock domain of a first family to a module of a second clock domain of a second family, the signal will appear as an asynchronous event in the second clock domain. An asynchronous event may cause a flip-flop or other device in the second clock domain to experience metastability, an undesirable unstable state where the device may hold an incorrect value. Metastability may cause the value of a flip-flop to take many times longer than normal to settle into a correct state, or to oscillate several times between states before settling into one state. Furthermore, metastability may propagate from one device to another device, causing a chain of devices to all experience metastability.
FIG. 1A is a schematic block diagram of an exemplary hardware design 100 where metastability may occur due to a signal crossing between clock domains of different families. Two clock domains are shown, Clock Domain A 110 which is driven by Clock A, and Clock Domain B 120 which is driven by Clock B, where Clock A and Clock B are driven by different oscillators. Hardware devices in each clock domain are driven by their respective clocks, for example the two flip-flops 130, 140 are driven by Clock A and Clock B respectively. The first flip-flop 130 stores Signal 1 when triggered at the rising edge of Clock A. The output of the first flip-flop 130, labeled Signal 2, is stored in the second flip-flop 140 at the rising edge of Clock B. The second flip-flop 140 in turn produces an output, labeled Signal 3.
FIG. 1B is an exemplary timing diagram that corresponds to the exemplary hardware design shown in FIG. 1A. The timing diagram has been simplified and idealized for purposes of illustration, and thus the signals shown differ somewhat from signals that would occur in an actual fabricated design. Assume that Signal 1 is transitioned from a high state to a low state at a transition region 160. Further, suppose the rising edge of Clock B occurs during the transition region 160. In such a case, the second flip-flop 140 will sample Signal 2 while it is in an intermediate state between the high and low states. By doing so, the setup time (commonly represented tsu) and the hold time (commonly represented th) requirements of the second flip-flop 140 may be violated. The violation causes the flip-flop 140 to become metastable, as shown by the metastable region 170 of Signal 3, where the value of Signal 3 is uncertain. As discussed above, this uncertainty is highly undesirable in a hardware design.
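The setup and hold requirement discussed above may be expressed, in idealized form, as a check of whether a data transition falls inside a window around a sampling clock edge. The following Python sketch is illustrative only: the function names, the use of integer picosecond timestamps, and the treatment of a transition as a single instant rather than a region are all assumptions, and the numeric values in the usage note are hypothetical rather than taken from FIG. 1B.

```python
from typing import List


def rising_edges(period_ps: int, phase_ps: int, until_ps: int) -> List[int]:
    """Rising-edge times of an idealized periodic clock, up to and including until_ps."""
    return list(range(phase_ps, until_ps + 1, period_ps))


def violates_window(change_ps: int, edge_ps: int, t_su_ps: int, t_h_ps: int) -> bool:
    """True if the data changes strictly inside (edge - t_su, edge + t_h)."""
    return edge_ps - t_su_ps < change_ps < edge_ps + t_h_ps


def violated_edges(changes_ps: List[int], edges_ps: List[int],
                   t_su_ps: int, t_h_ps: int) -> List[int]:
    """Sampling edges at which at least one data transition violates setup or hold."""
    return [
        edge for edge in edges_ps
        if any(violates_window(c, edge, t_su_ps, t_h_ps) for c in changes_ps)
    ]
```

For instance, with a hypothetical sampling clock of period 700 ps and phase 300 ps, a setup time of 50 ps, and a hold time of 30 ps, a data transition at 3100 ps coincides with a sampling edge and is reported, whereas a transition at 2000 ps falls between edges and is not.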
To avoid undesirable metastability, while still allowing signals to be passed between different clock domains, hardware designers typically employ synchronizers to connect the clock domains. There are a variety of commonly used synchronizer designs, including multiple-flip-flop-based synchronizers, handshake-based synchronizers, and FIFO-based synchronizers. Of these, the most commonly used synchronizer design is the two-flip-flop synchronizer.
FIG. 2 is a schematic block diagram of a two-flip-flop synchronizer 200 that is well known in the art. As in FIG. 1A, a flip-flop 230 (which technically is not considered part of the synchronizer's two flip-flops) is located in a Clock Domain A 210. The two flip-flops 240, 250 of the two-flip-flop synchronizer 200 are located in Clock Domain B 220, which is of a different clock family than Clock Domain A. When a signal is propagated to the first flip-flop 240 in Clock Domain B 220, there is a likelihood that the first flip-flop 240 may enter a metastable state. Yet, even if the first flip-flop 240 becomes metastable, there is a much smaller probability that the second flip-flop 250 will become metastable as well when driven by the first flip-flop 240. Thus, by allowing only the second flip-flop 250 to interact with the rest of the devices in Clock Domain B 220, the probability that metastability will propagate is reduced to an acceptable level.
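The behavior of the two-flip-flop arrangement may be sketched with the following cycle-based Python model, offered for illustration only. The function simulate_two_flop, the Bit type, and the use of None to denote a metastable value are hypothetical. Importantly, the model assumes that a metastable first flip-flop always settles to a valid 0 or 1 (chosen pseudo-randomly) before the next edge of the receiving clock; in actual hardware this holds only with high probability, as noted above.

```python
import random
from typing import List, Optional, Tuple

Bit = Optional[int]  # 0, 1, or None for a metastable (unresolved) value


def simulate_two_flop(samples: List[Tuple[int, bool]],
                      seed: int = 0) -> Tuple[List[Bit], List[int]]:
    """Simulate both synchronizer flip-flops over successive receiving-clock edges.

    samples holds, per edge, (asynchronous input value, setup/hold violated?).
    Returns the outputs of the first and second flip-flops after each edge.
    """
    rng = random.Random(seed)
    ff1: Bit = 0
    ff2: int = 0
    ff1_trace: List[Bit] = []
    ff2_trace: List[int] = []
    for value, violated in samples:
        # Second flip-flop samples what the first held during the previous cycle.
        # ASSUMPTION: a metastable first flip-flop has settled to 0 or 1 by now.
        ff2 = ff1 if ff1 is not None else rng.randint(0, 1)
        # First flip-flop samples the asynchronous input; a violation leaves it metastable.
        ff1 = None if violated else value
        ff1_trace.append(ff1)
        ff2_trace.append(ff2)
    return ff1_trace, ff2_trace
```

In this model only the first flip-flop's trace can contain a metastable entry; the second flip-flop's trace, which is all that downstream logic in the receiving domain observes, always holds a valid value, at the cost of one additional cycle of latency.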
FIG. 3 is an exemplary code excerpt that includes an HDL implementation of a two-flip-flop synchronizer. In this particular example, the language employed is Verilog, yet the general concepts employed here are applicable to other conventional RTL HDLs. A flip-flop in a first clock domain is instantiated at an always block 310 and clocked by a clock signal labeled “comp_clock_in.” The always block also contains logic which sets the flip-flop to a particular value. The second clock domain contains a first flip-flop described by the module “pci_synchronizer_flop” 320. The second clock domain also contains a second flip-flop instantiated by an always block 330. The first and second flip-flops of the second clock domain are driven by the clock signal “req_clk_in” and together the flip-flops form the two-flip-flop synchronizer.
As hardware designs become more complex, the burden on the hardware designer to follow good design practices and to explicitly instantiate synchronizers between clock domains increases dramatically. Indeed, explicit clock synchronization may become untenable as designs continue to increase in size and complexity. Chip failures due to incorrect implementation of synchronizers are very difficult to detect, as they typically manifest themselves as seemingly random errors and lockups. Indeed, for an error to manifest itself, a particular combination of clock-edges and data inputs needs to occur. Such a combination may occur very infrequently and thus detecting such errors may require detailed and exhaustive techniques.
Yet many designers typically attempt to find errors due to incorrect use of synchronizers by visual inspection of the HDL circuit specification and other informal methods. As is apparent, with the large size of many modern designs, these informal methods are inadequate. Designers may also turn to a variety of automated “linting” tools to attempt to detect errors. While these tools are an improvement over informal methods, such as visual inspection, they too have their limitations, as described below. Commonly employed linting tools include Leda™ available from Synopsys, Inc., 1Team:Verify™ available from Atrenta, Inc., and nLint™ available from Novas Inc.
The ability of linting tools to detect clock domain crossing errors is limited due to a number of factors, foremost of which is that conventional HDLs do not describe well how signals are used. For example, most HDLs have relatively few data types, and these data types are mainly used to specify simulation behavior semantics, rather than design intent. In particular, conventional HDLs generally lack a specific data type for clock signals. Thus, linting tools cannot turn to data types to determine design intent, and are forced to infer intent based upon common design practices and other estimations. When such inferences are incorrect, linting tools may become confused and thus be unable to recognize certain errors.
Further, since linting tools are generally separate from other design tools and simulators, it is incumbent upon the designer to choose a proper linting tool and to have the discipline to use it at appropriate times. If a designer fails to use an appropriate linting tool, errors will likely be missed, as many other design tools and simulators make little objection to timing errors. For example, it is often possible for a designer to simulate a design containing clock domain crossing errors on a simulator without the simulator objecting to the errors.
Thus, it is desirable to provide HDL functionality that simplifies the handling of clocks, and in particular simplifies passing signals between clock domains. Such functionality should largely remove from the designer the burden of explicitly and correctly instantiating synchronizers, for the majority of commonly encountered situations. Further, such functionality should allow one to create a design that is “correct by implementation,” in that the HDL itself requires proper clock domain crossing synchronization, and thus removes the need for “after-the-fact” linting tools to verify this aspect of the design.