1. Technical Field
The present invention relates to circuit simulation and emulation and, in particular, to circuits with multiple clock domains. Still more particularly, the present invention provides a method, apparatus, and program for retiming netlists to partition multiple clock domains.
2. Description of Related Art
Incubated in the verification of digital signal processing and graphics manipulation, emulation technologies are poised for high growth as more companies exploit their ability to run long test vector sequences on hardware models at speeds that allow integration with fabricated periphery devices. The emulation hardware is used in two ways: 1) accelerated simulation where the test vectors are sent and results processed from a host machine; and, 2) in-circuit emulation (ICE) where the inputs and outputs are connected to the periphery devices. The speedups over traditional simulation are significant.
Currently, two different emulator architectures dominate the market: processor array emulators and field programmable gate array (FPGA) based emulators. Netlists are high-level descriptions of a hardware design which include the intended functionality. FPGA emulators allow for netlists to be programmed into multiple function logic cells. These cells are then strategically placed within the emulator so that they can be connected together by the wires running between the FPGAs. Currently, FPGA emulation is speed-limited by the technology rather than by the netlist size. In fact, in FPGA systems, the gate utilization is low due to the complications involved in routing the FPGA interconnect.
Processor array based emulators map a netlist to the memory spaces associated with each processor. The netlist is evaluated by synchronously stepping through the instructions in the memory space and scheduling communication on a fixed interconnect during a communication phase. This technology has slow throughput time, but much better compile time and more than five times the capacity of FPGA based systems.
Based on the observation that the capacity demands are often driven by the desire to emulate system level hardware, the idea has been proposed in the prior art to emulate each asynchronous system component independently such that the in-circuit hardware could interact with the smaller domains, thus increasing the frequency of each domain, and of the emulation model as a whole. However, given an asynchronous netlist, the task of identifying appropriate cutpoints that maintain the full range of functionality is not trivial with respect to the handling of the combinational paths between the logic driven by different clocks.
In particular, when signals from domains clocked by different latches fan-in to a new latch domain, it is difficult to determine how to schedule the evaluation of the logic on the combinational path between clock domains. A combinational path is a sequence of gates that provides a new output whenever the input changes. For instance, the output of an AND gate will change from high to low almost instantaneously when one or both of its inputs transition to low. These devices do not require a clock. A combinational path will not include a latch. The prior art deals with this situation by replication of the combinational paths and grouping the replicated logic with its respective input domain. Due to the logic duplication, this approach can increase the model size dramatically in logic that has large combinational paths between latch boundaries. Model size can also increase due to the lost optimization potential in grouping the input cones together.
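The growth in model size under the replication approach can be illustrated with a small sketch. It assumes one reading of that approach: every combinational gate is copied once for each input domain whose latches can reach it. The function name `replicated_gate_count`, the dictionary-based netlist, and the example circuit are hypothetical and chosen only for illustration.

```python
def replicated_gate_count(domain_of_latch, fanout):
    """Return (original, replicated) combinational gate counts.

    domain_of_latch: latch name -> clock domain name (hypothetical format).
    fanout: node name -> list of node names it drives.
    Assumption: a gate is copied once per input domain that reaches it.
    """
    reach = {}  # gate -> set of domains whose latches feed it
    for latch, domain in domain_of_latch.items():
        stack = list(fanout.get(latch, []))
        seen = set()
        while stack:
            node = stack.pop()
            # Stop at latch boundaries; a combinational path has no latch.
            if node in seen or node in domain_of_latch:
                continue
            seen.add(node)
            reach.setdefault(node, set()).add(domain)
            stack.extend(fanout.get(node, []))
    original = len(reach)
    replicated = sum(len(domains) for domains in reach.values())
    return original, replicated


# Latches A and B belong to different domains and both feed latch C.
example_domains = {"A": "d1", "B": "d2", "C": "d3"}
example_fanout = {
    "A": ["g1"], "B": ["g2"],
    "g1": ["g3"], "g2": ["g3"],
    "g3": ["g4"], "g4": ["C"],
}
counts = replicated_gate_count(example_domains, example_fanout)
print(counts)  # (4, 6): the shared gates g3 and g4 are each copied twice
```

In this example only two of the four gates are shared, yet the gate count grows by half; longer shared paths between latch boundaries make the growth correspondingly larger.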
Thus, it would be advantageous to reduce duplication in the emulation of circuits with multiple clock domains.
The present invention provides a technique for partitioning a netlist. The present invention picks a unique color for each clock and traverses the clock tree coloring the latches in support of that clock tree with that color. Thereafter, all latches should be colored. The present invention then colors the combinational fanout cones for each latch and notes any coloring collisions. In the case of a multicolored gate, the present invention retimes the network by moving the terminating latch backwards, towards the collision, to enable single coloring of the gate. The present invention then performs a depth-first search on the fanout logic of each primary input to the first latch encountered or a primary output. If a primary output is encountered, the path is colored with a color representing the free-run domain. Otherwise, the present invention colors the path with the color of the terminating latch. Next, the present invention duplicates the fanin cones for remaining multicolored gates so that a copy of the logic can be incorporated with each independent domain.
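The coloring and collision-detection steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the netlist is modeled as plain dictionaries, clock names double as colors, and the names `color_netlist`, `clock_latches`, and `fanout` are hypothetical. The retiming, primary-input search, and fanin-cone duplication steps are omitted.

```python
from collections import defaultdict


def color_netlist(clock_latches, fanout, latches):
    """Color latches by clock, then color their combinational fanout cones.

    clock_latches: clock name -> latches in that clock tree (hypothetical).
    fanout: node name -> list of node names it drives.
    latches: set of all latch names.
    Returns (colors, collisions), where collisions lists multicolored gates.
    """
    colors = defaultdict(set)
    # One unique color per clock; here the clock name serves as the color.
    for clock, driven in clock_latches.items():
        for latch in driven:
            colors[latch].add(clock)
    # Push each latch's color through its fanout cone, stopping at the
    # next latch boundary.
    for latch in latches:
        for color in list(colors[latch]):
            stack = list(fanout.get(latch, []))
            seen = set()
            while stack:
                node = stack.pop()
                if node in seen or node in latches:
                    continue
                seen.add(node)
                colors[node].add(color)
                stack.extend(fanout.get(node, []))
    # A gate that received more than one color is a coloring collision.
    collisions = sorted(
        node for node, c in colors.items()
        if len(c) > 1 and node not in latches
    )
    return dict(colors), collisions


# L1 and L3 are clocked by clkA, L2 by clkB; gate g3 sees both domains.
colors, collisions = color_netlist(
    clock_latches={"clkA": ["L1", "L3"], "clkB": ["L2"]},
    fanout={"L1": ["g1"], "L2": ["g2"], "g1": ["g3"], "g2": ["g3"], "g3": ["L3"]},
    latches={"L1", "L2", "L3"},
)
print(collisions)  # ['g3']
```

The multicolored gate `g3` reported here is the kind of gate for which the terminating latch would be moved backwards, towards the collision, or whose fanin cone would be duplicated if it remains multicolored.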