Modern integrated circuit designs have become extremely complex. As a result, various techniques have been developed to verify that circuit designs will operate as desired before they are implemented in an expensive manufacturing process. For example, logic simulation is a tool used for verifying the logical correctness of a hardware design. Designing hardware today involves writing a program in a hardware description language. A simulation may be performed by running that program. If the program runs correctly, then one can be reasonably assured that the logic of the design is correct, at least for the cases tested in the simulation.
Software-based simulation, however, may be too slow for large complex designs such as SoC (System on Chip) designs. Although design reuse, intellectual property, and high-performance tools all can help to shorten SoC design time, they do not diminish the system verification bottleneck, which consumes 60-70% of the design cycle. Hardware emulation provides an effective way to increase verification productivity, speed up time-to-market, and deliver greater confidence in final products. In hardware emulation, a portion of a circuit design or the entire circuit design is emulated with an emulation circuit or “emulator.”
Two categories of emulators have been developed. The first category is based on programmable logic, or FPGAs (field programmable gate arrays). In an FPGA-based architecture, each chip has a network of prewired blocks of look-up tables and coupled flip-flops. A look-up table can be programmed to be a Boolean function, and each of the look-up tables can be programmed to connect or bypass the associated flip-flop(s). Look-up tables with connected flip-flops act as finite-state machines, while look-up tables with bypassed flip-flops operate as combinational logic. The look-up tables can be programmed to mimic any combinational logic of a predetermined number of inputs and outputs. To emulate a circuit design, the circuit design is first compiled and mapped to an array of interconnected FPGA chips. The compiler usually needs to partition the circuit design into pieces (sub-circuits) such that each fits into an FPGA chip. The sub-circuits are then synthesized into the look-up tables (that is, the contents of the look-up tables are generated such that the look-up tables together produce the function of the sub-circuits). Subsequently, place and route is performed on the FPGA chips in a way that preserves the connectivity in the original circuit design. The programmable logic chips employed by an emulator may be commercial FPGA chips or custom-designed emulation chips containing programmable logic blocks.
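The behavior of a single look-up table with a connectable or bypassable flip-flop can be illustrated with a minimal Python sketch. The class name `LutCell` and its methods are hypothetical and chosen only for illustration; they do not correspond to any particular FPGA or emulator product. The sketch assumes one output bit per look-up table and a single flip-flop on that output.

```python
from itertools import product


class LutCell:
    """Hypothetical model of a k-input look-up table with an optional flip-flop."""

    def __init__(self, num_inputs, truth_table, use_flip_flop=False):
        if len(truth_table) != 2 ** num_inputs:
            raise ValueError("truth table must have 2**num_inputs entries")
        self.num_inputs = num_inputs
        self.truth_table = list(truth_table)
        self.use_flip_flop = use_flip_flop  # True: connected, False: bypassed
        self.state = 0  # flip-flop contents, used only when connected

    @classmethod
    def from_function(cls, num_inputs, func, use_flip_flop=False):
        # Synthesis in miniature: evaluate the Boolean function for every
        # input combination and store the results as the table contents.
        table = [int(bool(func(*bits)))
                 for bits in product((0, 1), repeat=num_inputs)]
        return cls(num_inputs, table, use_flip_flop)

    def lookup(self, *inputs):
        # The first input is the most significant bit of the table index,
        # matching the enumeration order used in from_function.
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return self.truth_table[index]

    def output(self, *inputs):
        # Bypassed flip-flop: purely combinational output.
        # Connected flip-flop: the output is the stored state.
        return self.state if self.use_flip_flop else self.lookup(*inputs)

    def clock(self, *inputs):
        # On a clock edge, a connected flip-flop captures the table output.
        if self.use_flip_flop:
            self.state = self.lookup(*inputs)


# Combinational use: a 2-input XOR with the flip-flop bypassed.
xor_cell = LutCell.from_function(2, lambda a, b: a ^ b)

# Sequential use: a toggle element whose next state is enable XOR current state.
toggle_cell = LutCell.from_function(2, lambda en, q: en ^ q, use_flip_flop=True)
toggle_cell.clock(1, toggle_cell.state)
```

In this sketch, the same table mechanism serves both roles: `xor_cell` behaves as combinational logic, while `toggle_cell`, with its state fed back as an input, behaves as a small finite-state machine.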
The second category of emulators is processor-based: an array of Boolean processors able to share data with one another is employed to map a circuit design, and Boolean operations are scheduled and performed accordingly. Similar to the FPGA-based approach, the circuit design needs to be partitioned into sub-circuits first so that the code for each sub-circuit fits the instruction memory of a processor. Whether FPGA-based or processor-based, an emulator generally performs circuit verification in parallel, since the entire circuit design executes simultaneously as it will in a real device. By contrast, a simulator performs circuit verification by executing the hardware description code serially. The different styles of execution can lead to orders of magnitude differences in execution time.
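As a rough illustration of the processor-based style, the following Python sketch runs one emulation step on a small array of Boolean processors. Everything here is an assumption made for illustration: the instruction format `(op, dst, src_a, src_b)`, the function name `run_step`, the shared `signals` dictionary standing in for data sharing between processors, and the rule that all processors execute the same instruction slot together. Real emulators differ in all of these details.

```python
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def run_step(programs, signals, instr_mem_depth):
    """Run one emulation step (hypothetical model).

    programs: one instruction list per processor; each instruction is a
              tuple (op, dst, src_a, src_b) naming entries in `signals`.
    signals:  dict of signal name -> 0/1, shared by all processors.
    instr_mem_depth: instruction memory capacity of each processor.
    """
    for index, program in enumerate(programs):
        if len(program) > instr_mem_depth:
            # The sub-circuit assigned to this processor is too large;
            # the design would need to be partitioned differently.
            raise ValueError(f"program for processor {index} does not fit")

    num_slots = max((len(p) for p in programs), default=0)
    for slot in range(num_slots):
        # All processors evaluate the same slot using the values visible
        # at the start of the slot, then the results are written back.
        results = []
        for program in programs:
            if slot < len(program):
                op, dst, src_a, src_b = program[slot]
                results.append((dst, OPS[op](signals[src_a], signals[src_b])))
        for dst, value in results:
            signals[dst] = value
    return signals


# Two processors: slot 0 computes s and c in parallel, slot 1 combines them.
programs = [
    [("XOR", "s", "a", "b"), ("OR", "o", "s", "c")],
    [("AND", "c", "a", "b")],
]
final = run_step(programs, {"a": 1, "b": 1}, instr_mem_depth=4)
```

The scheduling constraint is visible in the example: the `OR` instruction is placed in slot 1 because it depends on values that the two processors produce in slot 0.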
An emulator typically has an interface system to communicate with a workstation server (workstation). The workstation provides the capability to load the DUV (design under verification, also referred to as DUT, design under test) model, controls the execution over time, and serves as a debugging interface into the DUV model on the emulator. The execution of these operations may require that the infrastructure clock of the emulator, and thus the design clocks, be stopped.
The emulator may also have a stimulus or a co-modeling interface for communications between the DUV model and the test bench model running in the emulator and the workstation, respectively. This interface may also be used for debugging purposes. Due to the software nature of operations in the workstation, communications through this interface during emulation often require slowing down or even temporarily suspending design clocks running in the emulator. This is particularly true for emulators used in a simulation acceleration environment or in a hardware/software co-verification environment.
In addition to communications with the workstation, other activities such as the need for multiple accesses to a hardware resource may also require slowing down or temporarily suspending design clock signals running in the emulator. For example, the design may need to read/write several locations of a design memory through a limited number of ports before the next associated design clock rising edge. In order to emulate these operations according to the design, the design clock signals may have to be suspended for a number of cycles of the emulator infrastructure clock signal.
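The arithmetic behind such a suspension can be sketched as follows. This is a simplified model under stated assumptions, not a description of any particular emulator: each port is assumed to complete one access per infrastructure clock cycle, and `cycles_available` (a hypothetical parameter) is the number of infrastructure cycles that would elapse before the next design clock rising edge if no suspension occurred.

```python
import math


def stall_cycles(num_accesses, num_ports, cycles_available=1):
    """Infrastructure cycles for which the design clock must be suspended.

    Assumes each port completes one memory access per infrastructure cycle.
    """
    if num_ports < 1:
        raise ValueError("at least one memory port is required")
    # Cycles needed to push all accesses through the available ports.
    cycles_needed = math.ceil(num_accesses / num_ports)
    # Only the cycles beyond those already available require a suspension.
    return max(0, cycles_needed - cycles_available)


# Eight accesses through two ports need four cycles; with one cycle
# available, the design clock is held for the remaining three.
extra = stall_cycles(num_accesses=8, num_ports=2, cycles_available=1)
```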
Conflicting clock speed preferences may also exist between an emulator and its hardware targets. In an in-circuit-emulation (ICE) environment, an emulator models a part of a system and connects to real hardware that serves as another part of the system. The real hardware is often referred to as target(s). If a target is static, the emulator can temporarily suspend design clock signals. Emulation resumes normally after the slow speed of communication with the software environment is compensated for and the design clock signals are restarted. A dynamic target, however, requires design clock signals to run continuously above a threshold speed. For example, PCI's lowest bus frequency is 33 MHz, which is even faster than the frequency (a few MHz) of a typical emulator infrastructure clock signal. The protocol may run into timeout errors if the clock signal associated with the PCI bus is stopped for too long or runs too slowly.
Conventionally, a speed-bridging device may be inserted between the emulator and the dynamic target to bridge the speed gap. Even with this device in place, there may still be a threshold speed (although a more manageable one) above which the clock signal supplied to the dynamic target by the emulator has to run. One possible solution is to always run the emulator at the threshold speed. This solution, however, is usually impractical because a typical threshold speed is too slow. The technology of adaptive clock management, disclosed in U.S. patent application Ser. No. 14/087,531, which is incorporated herein by reference, addresses this problem by slowing down the clocks only as much as needed to allow some of the operations that require clock stoppage, while still achieving good overall performance.
The currently available approaches, however, have limitations and may have a negative impact on the debug capability of an emulator. A typical debug strategy comprises employing hardware-triggered trace buffers to temporarily store captured design signal activity data. These trace buffers are circular buffers: once a buffer is full, data associated with the most recent design cycle replaces the data for the least recent one. The maximum number of cycles that can be stored depends on the size of the memory and the design size. When a trigger is hit, the tracing stops (either immediately or after running some cycles) so that the user can look at what happened in the design around the trigger for debugging. The clock constraints associated with dynamic targets thus limit how much data can be captured and downloaded. Virtual or target-less emulation techniques may be employed to overcome these limitations.
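The circular trace buffer behavior described above can be sketched in a few lines of Python. The class `TraceBuffer` and its parameters are hypothetical names introduced for illustration; the sketch assumes one sample per design cycle and a fixed buffer depth, whereas in practice the depth follows from the memory size and the number of traced signals.

```python
from collections import deque


class TraceBuffer:
    """Hypothetical circular trace buffer with a hardware-style trigger."""

    def __init__(self, depth, post_trigger_cycles=0):
        # deque with maxlen discards the oldest sample when full.
        self.samples = deque(maxlen=depth)
        self.post_trigger_cycles = post_trigger_cycles
        self.remaining = None  # None until the trigger has been hit
        self.stopped = False

    def capture(self, cycle, signals, triggered=False):
        """Store one cycle of signal data; return False once tracing stops."""
        if self.stopped:
            return False
        self.samples.append((cycle, signals))
        if self.remaining is None:
            if triggered:
                # Start counting down the cycles to record after the trigger.
                self.remaining = self.post_trigger_cycles
        else:
            self.remaining -= 1
        if self.remaining == 0:
            self.stopped = True
        return not self.stopped


# Depth of four cycles, stop one cycle after the trigger at cycle 6.
trace = TraceBuffer(depth=4, post_trigger_cycles=1)
for cycle in range(10):
    trace.capture(cycle, {"x": cycle % 2}, triggered=(cycle == 6))
captured_cycles = [cycle for cycle, _ in trace.samples]
```

After this run the buffer holds only the window of cycles around the trigger; earlier cycles have been overwritten and later cycles are not recorded, which is why the buffer depth bounds how much activity is visible for debugging.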