Today's sophisticated SoC (System on Chip) designs are rapidly evolving and nearly doubling in size with each generation. Indeed, complex designs are approaching 50 million gates. This complexity, combined with the use of devices in industrial and mission-critical products, has made complete design verification an essential element in the semiconductor development cycle. Ultimately, this means that every chip designer, system integrator, and application software developer must focus on design verification.
Hardware emulation provides an effective way to increase verification productivity, speed up time-to-market, and deliver greater confidence in the final SoC product. Even though individual intellectual property blocks may be exhaustively verified, previously undetected problems appear when the blocks are integrated within the system. Comprehensive system-level verification, as provided by hardware emulation, tests overall system functionality, IP subsystem integrity, specification errors, block-to-block interfaces, boundary cases, and asynchronous clock domain crossings. Although design reuse, intellectual property, and high-performance tools all help by shortening SoC design time, they do not diminish the system verification bottleneck, which consumes 60-70% of the design cycle. As a result, designers can implement a number of system verification strategies in a complementary methodology including software simulation, simulation acceleration, hardware emulation, and rapid prototyping. But, for system-level verification, hardware emulation remains a favorable choice due to superior performance, visibility, flexibility, and accuracy.
A short history of hardware emulation is useful for understanding the emulation environment. Initially, software programs would read a circuit design file and simulate the electrical performance of the circuit very slowly. To speed up the process, special computers were designed to run simulators as fast as possible. IBM's Yorktown “simulator” was the earliest (1982) successful example of this; it used multiple processors running in parallel to run the simulation. Each processor was programmed to mimic a logical operation of the circuit for each cycle and could be reprogrammed in subsequent cycles to mimic a different logical operation. This hardware “simulator” was faster than the software simulators of the time, but far slower than the end-product ICs. When Field Programmable Gate Arrays (FPGAs) became available in the mid-1980s, circuit designers conceived of networking hundreds of FPGAs together and mapping their circuit design onto them, so that the entire FPGA network would mimic, or emulate, the entire circuit. In the early 1990s, the term “emulation” came to distinguish reprogrammable hardware that took the form of the design under test (DUT) from a general-purpose computer (or workstation) running a software simulation program.
Soon, variations appeared. Custom FPGAs designed for hardware emulation included on-chip memory (for DUT memory as well as for debugging), special routing for outputting internal signals, and efficient networking between logic elements. Another variation used custom IC chips with networked single-bit processors (so-called processor-based emulation) that processed in parallel and usually assumed a different logic function every cycle.
Physically, a hardware emulator resembles a large server. Racks of large printed circuit boards are connected by backplanes in ways that facilitate a particular network configuration. A workstation connects to the hardware emulator for control, input, and output.
Before the emulator can emulate a DUT, the DUT design must be compiled. That is, the DUT's logic must be converted (synthesized) into code that can program the hardware emulator's logic elements (whether they be processors or FPGAs). The DUT's interconnections must likewise be synthesized into a suitable network that can be programmed into the hardware emulator. The compilation is highly emulator-specific and can be time-consuming.
Once the design is loaded and running in the hardware emulator, it is important to be able to analyze embedded signals for rapid verification and debug. The most common technique for such analysis is through the use of hardware probes that in turn are used to generate triggers. A probe is a hardware line coupled to an integrated circuit for analyzing the state of a signal within the integrated circuit. One or more probes are combined together in various manners to generate a trigger, which is activated in response to an event or the reaching of a state within the circuit. Triggers may be used to turn on or off various streams of data for tracing circuit activity and may be either synchronous or asynchronous. Synchronous triggers have timing coordinated with the system clock while asynchronous triggers can be generated at any time during the emulation.
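The way a trigger turns a trace stream on and off can be illustrated with a minimal Python sketch. It is a software model only, not how any particular emulator implements tracing; the function `capture_trace`, the probe names `req` and `ack`, and the sample data are all hypothetical. The triggers are evaluated once per cycle, so they correspond to the synchronous case described above.

```python
from typing import Callable, Dict, List, Tuple

ProbeSample = Dict[str, bool]  # probe name -> sampled signal level for one cycle


def capture_trace(
    cycles: List[ProbeSample],
    start_trigger: Callable[[ProbeSample], bool],
    stop_trigger: Callable[[ProbeSample], bool],
) -> List[Tuple[int, ProbeSample]]:
    """Record probe samples only while tracing is enabled.

    Tracing is switched on when start_trigger fires and switched off
    when stop_trigger fires; both are checked once per cycle.
    """
    tracing = False
    captured = []
    for cycle, probes in enumerate(cycles):
        if not tracing and start_trigger(probes):
            tracing = True
        elif tracing and stop_trigger(probes):
            tracing = False
        if tracing:
            captured.append((cycle, probes))
    return captured


# Hypothetical probe data for five emulation cycles.
cycles = [
    {"req": False, "ack": False},
    {"req": True, "ack": False},
    {"req": False, "ack": False},
    {"req": False, "ack": True},
    {"req": False, "ack": False},
]

trace = capture_trace(
    cycles,
    start_trigger=lambda p: p["req"],  # begin tracing when req goes high
    stop_trigger=lambda p: p["ack"],   # end tracing when ack goes high
)
traced_cycles = [cycle for cycle, _ in trace]
```

In this sketch the cycle on which the stop trigger fires is not recorded; a real trace facility might choose to include it.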
The more probes available to the designer, the more information the designer has for debugging the circuit and the more complex the triggers that can be defined. In a large circuit, thousands of probes may need to be monitored by a logic analyzer, and the number of probes needed grows as circuits become larger and more complex. The logic analyzer, however, has a limited number of trigger inputs, so the probes must be combined and reduced before they reach it. This probe reduction is typically accomplished with standard gates, such as AND and OR gates. For example, multiple probes may be input into a large AND gate so that if all the conditions are true, the trigger is activated.
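As a rough illustration of gate-based probe reduction, the following Python sketch models a wide AND gate and a wide OR gate that each collapse many probe values into a single trigger bit. The function names and the sample probe values are assumptions made for the example; an actual emulator performs this reduction in hardware.

```python
from typing import List, Sequence


def and_reduce(probes: Sequence[bool]) -> bool:
    """Model a wide AND gate: the trigger fires only if every probe is high."""
    return all(probes)


def or_reduce(probes: Sequence[bool]) -> bool:
    """Model a wide OR gate: the trigger fires if any probe is high."""
    return any(probes)


# Hypothetical values of four probes sampled over three cycles.
samples: List[List[bool]] = [
    [True, True, False, True],
    [True, True, True, True],
    [False, False, False, False],
]

# Each cycle's four probe values are reduced to one trigger input.
and_triggers = [and_reduce(s) for s in samples]
or_triggers = [or_reduce(s) for s in samples]
```

Four probe lines become one trigger line per gate, which is the reduction needed when the logic analyzer accepts only a few trigger inputs.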
While the use of AND and OR gates has become the standard for probe reduction, such solutions do not allow for very complex trigger mechanisms. For example, sometimes it is desirable to have a complex logical combination of probes based on the design. In a simple example, two probes A and B may be logically combined as A&B using an AND gate. To change this simple function to A OR B, while still using only the available AND gate, one would need to invert both A and B to produce !A&!B and invert the result to produce !(!A&!B)=A OR B. Thus, a simple example of A OR B requires three inverters and an AND gate. In reality, the number of probe inputs is much greater and the logical combinations can quickly become too complex to manage.
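The identity !(!A&!B)=A OR B is De Morgan's law, and it can be checked exhaustively with a short Python sketch. The helper names `and_gate`, `inverter`, and `or_from_and` are hypothetical and stand in for the hardware gates described above.

```python
from itertools import product


def and_gate(a: bool, b: bool) -> bool:
    """Two-input AND gate."""
    return a and b


def inverter(x: bool) -> bool:
    """Single inverter."""
    return not x


def or_from_and(a: bool, b: bool) -> bool:
    """Build A OR B from one AND gate and three inverters: !(!A & !B)."""
    return inverter(and_gate(inverter(a), inverter(b)))


# Enumerate all four input combinations and compare against a plain OR.
truth_table = [
    (a, b, or_from_and(a, b)) for a, b in product([False, True], repeat=2)
]
matches_or = all(out == (a or b) for a, b, out in truth_table)
```

Even this two-probe case needs four gate elements, which suggests how quickly the gate count grows when many probes must be combined in ways the fixed reduction gate does not directly support.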
Additionally, if a change in the probe reduction scheme is desired, it is necessary to recompile the entire design, which is time-consuming and costly. For example, if the user wants to change a trigger generation or reduction scheme to better debug the system, it is necessary to change the combinatorial logic associated with the trigger signals, and changing that combinatorial logic requires recompilation of the design.
Thus, it is desirable to provide a more powerful and flexible scheme for trigger generation in a hardware emulation environment.