Modern high-performance microprocessors have an ever-increasing number of circuit elements and an ever-rising clock frequency. Also, as the number of circuits that can be used in a CPU has increased, the number of parallel operations has risen. Examples of efforts to create more parallel operations include increased pipeline depth and an increase in the number of functional units in super-scalar and very-long-instruction-word architectures. As CPU performance continues to increase, the result has been a larger number of circuits switching at faster rates. Thus, from a design perspective, considerations such as the time needed to complete a simulation and the time needed to debug a CPU must be taken into account.
As each new CPU design uses more circuit elements, each often operating at increased frequencies, the time required to simulate the design increases. Due to the increased time for simulation, the number of tests, and consequently the test coverage, decreases. In general, the result is an increase in the logic errors that escape detection before the CPU is manufactured.
After a CPU prototype is initially manufactured and failure modes are uncovered, determining failure mechanisms is time intensive due to the increased CPU complexity. Failure modes may be the result of logic errors or poor manufacturability of a circuit element. In both cases, circuit simulation helps to confirm or refute the existence of a logic error. If no logic errors exist, the manufacturability of a circuit element may be the cause of the failure mode. Even after a logic error failure mechanism is discovered and a solution is proposed, a substantial amount of time may be required to satisfactorily determine that the proposed solution fixes the logic error and does not generate any new logic errors. Circuit simulation is key to the design and debugging of increasingly complex and faster CPUs.
CPU simulation may occur at a “switch-level.” Switch-level simulations typically include active circuit elements (e.g., transistors) and passive circuit elements (e.g., resistors, capacitors, and inductors). A typical switch-level circuit simulator is “SPICE,” which is an acronym for Simulation Program with Integrated Circuit Emphasis. SPICE typically models each element using an equation or lookup table. SPICE can accurately model the voltage and/or current of each circuit element across time.
CPU simulation also may occur at a “behavioral level.” Behavioral level simulations typically use a hardware description language (HDL) that determines the functionality of a single circuit element or group of circuit elements. A typical behavioral level simulation language is “Verilog,” which is an Institute of Electrical and Electronics Engineers (IEEE) standard. Verilog HDL uses a high-level programming language to describe the relationship between the input and output of one or more circuit elements. The Verilog HDL describes on what conditions the outputs should be modified and what effect the inputs have. Verilog HDL programs may also be used for logic simulation at the “register transfer level” (RTL).
Using the Verilog HDL, for example, digital systems are described as a set of modules. Each module has a port interface, which defines the inputs and outputs for the module. The interface describes how the given module connects to other modules. Modules can represent elements of hardware ranging from simple gates to complete systems. Each module can be described as an interconnection of sub-modules, as a list of terminal elements, or as a mixture of both. Terminal elements within a module can be described behaviorally, using traditional procedural programming language constructs such as “if” statements and assignments, and/or structurally as Verilog primitives. Verilog primitives include, for example, truth tables, Boolean gates, logic equations, and pass transistors (switches).
HDLs, such as Verilog, are designed for efficient representation of hardware designs. Verilog supports signals of arbitrary width: such a signal can be defined and used as a whole, and any sub-field of the signal can be treated as a signal in its own right.
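As an illustration of treating a sub-field of a wide signal as a signal in its own right, the following is a minimal sketch in Python rather than Verilog. The helper names `bit_slice` and `set_bit_slice` and the sample value `bus` are assumptions made for this example; they loosely mimic a Verilog part-select and are not part of any HDL tool.

```python
def bit_slice(value: int, msb: int, lsb: int) -> int:
    """Return bits msb..lsb (inclusive) of value, similar to a part-select."""
    if msb < lsb or lsb < 0:
        raise ValueError("expected msb >= lsb >= 0")
    width = msb - lsb + 1
    return (value >> lsb) & ((1 << width) - 1)


def set_bit_slice(value: int, msb: int, lsb: int, field: int) -> int:
    """Return value with bits msb..lsb replaced by field (truncated to fit)."""
    if msb < lsb or lsb < 0:
        raise ValueError("expected msb >= lsb >= 0")
    width = msb - lsb + 1
    mask = ((1 << width) - 1) << lsb
    return (value & ~mask) | ((field << lsb) & mask)


# Hypothetical 16-bit signal used only for illustration.
bus = 0xABCD
high_byte = bit_slice(bus, 15, 8)        # upper eight bits
low_nibble = bit_slice(bus, 3, 0)        # lowest four bits
patched = set_bit_slice(bus, 7, 4, 0x5)  # overwrite bits 7..4
```

Python integers are unbounded, so the same two helpers apply to signals of any width; a real HDL additionally tracks the declared width of each signal, which this sketch leaves out.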
Cycle-based logic simulation is applicable to synchronous digital systems and may be used to verify the functional correctness of a digital design. Cycle-based simulators use algorithms that eliminate unnecessary calculations to achieve improved performance in verifying system functionality. Typically, in a cycle-based logic simulator the entire system is evaluated once at the end of each clock cycle. Evaluating and re-evaluating discrete components upon the occurrence of every event is therefore typically unnecessary.
HDL simulations may be event-driven or cycle-based. Event-driven simulations propagate a change in state from one set of circuit elements to another. Event-driven simulators may record relative timing information of the change in state so that timing and functional correctness may be verified. Cycle-based HDL simulations also simulate a change in state from one set of circuit elements to another. Cycle-based HDL simulations, however, evaluate the state of the system once at the end of each clock cycle. While specific intra-cycle timing information is not available, simulation speed is improved.
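The difference between the two approaches can be sketched in a few lines of Python. The three-gate netlist `GATES`, the signal names, and both step functions are hypothetical and exist only for this illustration; the event-driven version is zero-delay and omits the timing information a real event-driven simulator may record.

```python
from collections import deque

# Hypothetical netlist: gate name -> (function, input names).
# Gates are listed in levelized order (inputs before the gates they feed).
GATES = {
    "n1": (lambda a, b: a & b, ("a", "b")),
    "n2": (lambda a, b: a | b, ("b", "c")),
    "out": (lambda a, b: a ^ b, ("n1", "n2")),
}


def cycle_based_step(inputs):
    """Evaluate every gate exactly once, in levelized order."""
    values = dict(inputs)
    evaluations = 0
    for name, (fn, ins) in GATES.items():
        values[name] = fn(*(values[i] for i in ins))
        evaluations += 1
    return values, evaluations


def event_driven_step(old_values, changes):
    """Propagate changes, re-evaluating only gates whose inputs changed."""
    values = dict(old_values)
    fanout = {}
    for name, (_, ins) in GATES.items():
        for i in ins:
            fanout.setdefault(i, []).append(name)
    queue = deque()
    for sig, new in changes.items():
        if values.get(sig) != new:
            values[sig] = new
            queue.append(sig)
    evaluations = 0
    while queue:
        sig = queue.popleft()
        for gate in fanout.get(sig, []):
            fn, ins = GATES[gate]
            new = fn(*(values[i] for i in ins))
            evaluations += 1
            if new != values[gate]:
                values[gate] = new   # the change is itself a new event
                queue.append(gate)
    return values, evaluations


# Settle the circuit once, then apply new inputs with each method.
settled, _ = cycle_based_step({"a": 0, "b": 0, "c": 0})
cycle_values, cycle_evals = cycle_based_step({"a": 1, "b": 1, "c": 0})
event_values, event_evals = event_driven_step(settled, {"a": 1, "b": 1})
```

Both methods settle on the same final values. In this sketch the event-driven pass evaluates some gates more than once as individual changes ripple through, while the cycle-based pass touches each gate a single time and exposes no intermediate states.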
HDL simulations may be executed on reconfigurable hardware, such as a field programmable gate array (FPGA) chip. The FPGA allows dedicated hardware to be configured to match the HDL code. FPGA hardware provides a method to reduce the simulation time. However, as the design changes, the time required to reconfigure the FPGA arrangement may prohibit the running of many iterations. Also, the number of FPGA chips required for complex designs may be relatively large.
HDL simulations also may be executed on general purpose processors. General purpose processors, including parallel general purpose processors, are not designed specifically for HDL simulations. HDL simulations require a large number of operations on inputs and outputs, many of which are bit-wise operations.
Large logic simulations are frequently executed on parallel or massively parallel computing systems. For example, parallel computing systems may be specifically designed parallel processing systems or a collection, referred to as a “farm,” of connected general purpose processing systems. FIG. 1 shows a block diagram of a typical parallel computing system (100) used to simulate an HDL logic design. Multiple processor arrays (112a, 112b, 112n) are available to simulate the HDL logic design. A host computer (116), with associated data store (117), controls a simulation of the logic design that executes on one or more of the processor arrays (112a, 112b, 112n) through an interconnect switch (118). The processor arrays (112a, 112b, 112n) may be a collection of processing elements or multiple general purpose processors. The interconnect switch (118) may be a specifically designed interconnect or a general purpose communication system, for example, an Ethernet network.
A general purpose computer (120) with a human interface (122), such as a graphical user interface (GUI) or a command line interface, together with the host computer (116), supports common functions of a simulation environment. These functions typically include an interactive display, modification of the simulation state, setting of execution breakpoints based on simulation times and states, use of test vector files and trace files, use of HDL modules that execute on the host computer and are called from the processor arrays, checkpointing and restoration of running simulations, the generation of value change dump files compatible with waveform analysis tools, and single execution of a clock cycle.
HDL logic designs are generally processed by a system known as a “compiler” before the logic design can be executed in a cycle-based simulator. Traditionally, cycle-based compilers only calculate and store the logic state for those circuit nodes representing the outputs of sequential logic devices (e.g., flip-flops and latches), referred to herein as “state nodes.” No logic state is stored for intermediate nodes that represent the outputs of combinatorial logic elements. Thus, when a designer wishes to test (by tracing or probing) the logic states of a given combinatorial logic element (referred to herein as a “design node”), one alternative is to re-compile the entire logic design to preserve the tracing nodes. Another alternative is to generate the logic state of the design node from the nearest state node or nodes, separately evaluating all levels of combinatorial logic in between the state nodes and the selected design node. This separate evaluation is performed by evaluating the logic expression of the design node's logic cone. The logic cone of a design node is a combinatorial logic tree bounded by the state nodes and the design node.
If the logic cone is large and has many logic levels, many evaluation steps are necessary to calculate the logic state of a design node. FIG. 2 shows the logic cone of a design node. Design node (E) and state nodes (Q1 and Q2) define the boundaries of a logic cone (200). Within the logic cone (200) are five combinatorial logic elements A, B, C, D, and E. Five evaluation steps are necessary to calculate the logic state of design node E shown in FIG. 2.
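The cost of this separate evaluation can be illustrated with a minimal Python sketch. The node names Q1, Q2, and A through E follow the description of FIG. 2, but the gate types and wiring in `NETLIST` are invented for this example, since the figure's actual connectivity is not given here. The functions `logic_cone` and `evaluate` are likewise hypothetical.

```python
# Hypothetical netlist: node -> (gate type, input nodes).
# Gate types and wiring are assumptions, not taken from FIG. 2.
NETLIST = {
    "A": ("NOT", ["Q1"]),
    "B": ("AND", ["Q1", "Q2"]),
    "C": ("OR", ["A", "B"]),
    "D": ("NOT", ["Q2"]),
    "E": ("XOR", ["C", "D"]),
}
STATE_NODES = {"Q1", "Q2"}

OPS = {
    "NOT": lambda a: 1 - a,
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def logic_cone(node):
    """Return the combinatorial nodes feeding `node`, inputs before outputs.

    The backward walk stops at state nodes, which bound the cone.
    """
    order, seen = [], set()

    def visit(n):
        if n in STATE_NODES or n in seen:
            return
        seen.add(n)
        for src in NETLIST[n][1]:
            visit(src)
        order.append(n)

    visit(node)
    return order


def evaluate(node, state):
    """Compute the logic state of `node` from stored state-node values.

    Returns the value and the number of evaluation steps performed.
    """
    values = dict(state)
    cone = logic_cone(node)
    for n in cone:
        op, ins = NETLIST[n]
        values[n] = OPS[op](*(values[i] for i in ins))
    return values[node], len(cone)


value_e, steps_e = evaluate("E", {"Q1": 1, "Q2": 0})
value_c, steps_c = evaluate("C", {"Q1": 1, "Q2": 0})
```

With this assumed wiring, probing node E requires evaluating all five elements of its cone, whereas probing C requires only three. The number of steps grows with the size of the cone, not with the size of the overall design.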