1. Field of the Invention
The present invention relates in general to integrated circuit (IC) emulators and in particular to a method for programming an IC emulator to distribute clock signals within the emulator for controlling timing of clock sinks.
2. Description of Related Art
IC Clocking
An IC designer typically generates a hardware description language (HDL) netlist describing an IC in terms of the logical relationships between the various signals to be conveyed by networks (“nets”) within the IC. After creating the HDL netlist, a designer can use a synthesis tool to convert it into a gate level netlist describing the IC as a set of interconnected logic gates and other IC components (“cells”) for implementing the logic described by the HDL netlist. The designer then uses placement and routing tools to generate an IC layout specifying a position of each cell within the IC and specifying how the nets are to be routed between the cells.
Most digital ICs use register transfer logic wherein blocks of logic transmit data to one another via synchronizing circuits including clocked circuit devices (“clock sinks”) such as flip-flops and latches that ensure each block's input and output signals change state at predictable times. For example, FIG. 1 illustrates a block of logic 10 receiving and transmitting data signals through a synchronizing circuit including clock sinks 12 and 14 at the inputs and outputs of logic block 10. Clock sinks 12 and 14 ensure that state changes in the input and output signals of logic block 10 coincide with edges of the signals CLK1 and CLK2 clocking sinks 12 and 14.
An IC designer chooses phase relationships between edges of clock signals CLK1 and CLK2 to allow logic block 10 sufficient time after a state change in its input signals following an edge of clock signal CLK1 to appropriately adjust states of its output signals before clock signal CLK2 clocks sinks 14. For example, when CLK1 and CLK2 are the same clock signal and sinks 12 and 14 are all clocked on the leading edge of that clock signal, logic block 10 will have one cycle of the clock signal to fully respond to a change in its input signals. When CLK1 and CLK2 are the same clock signal, but sinks 12 are clocked on the leading edge of that clock signal and sinks 14 are clocked on the trailing edge of the clock signal, logic block 10 will have one half cycle of the clock signal to respond to a change in its input signals. Clock signals CLK1 and CLK2 may differ, but to ensure proper phase relationships between clock signals, they are normally derived from the same clock signal source so that edges of the two clock signals have a predictable and appropriate phase relationship.
An IC typically employs a clock tree to deliver edges of a clock signal concurrently to all sinks that receive it. FIG. 2 depicts in block diagram form a simple clock tree 15 including a network of buffers 20 for delivering a primary clock signal CLK1 from an IC input/output (IO) terminal 16 to a set of clock sinks 18. Although in this simple example clock tree 15 fans clock signal CLK1 out to only eight clock sinks 18, a typical clock tree may deliver a clock signal to thousands of clock sinks. A designer normally employs a computer-aided clock tree synthesis (CTS) tool to lay out an IC's clock trees after a placement and routing tool has established a position for each clock sink 18 within the IC. By appropriately selecting the size and position of each buffer 20 and appropriately routing the conductors interconnecting them, the CTS tool can create a balanced clock tree delivering clock signal edges to all clock sinks 18 with acceptably small timing differences (“skew”) in edge arrival times at the clock sinks.
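The notion of skew described above can be illustrated with a minimal sketch. The `Node` class, the buffer and sink names, and all delay values below are hypothetical and chosen only for illustration; they do not come from FIG. 2. Each node carries the delay from its parent to itself, and skew is taken as the difference between the latest and earliest edge arrival times at the sinks.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A buffer or clock sink in a hypothetical clock tree model."""
    name: str
    delay_ps: int = 0                      # delay from parent to this node, in ps
    children: list = field(default_factory=list)


def arrival_times(node, t_ps=0):
    """Return {sink name: edge arrival time in ps} for every leaf under node."""
    t_ps += node.delay_ps
    if not node.children:                  # a leaf is treated as a clock sink
        return {node.name: t_ps}
    times = {}
    for child in node.children:
        times.update(arrival_times(child, t_ps))
    return times


def skew(root):
    """Skew is the spread between the latest and earliest sink arrival times."""
    times = arrival_times(root)
    return max(times.values()) - min(times.values())


# Illustrative tree: one root buffer, two second-level buffers, four sinks.
root = Node("buf_root", 100, [
    Node("buf_a", 200, [Node("s1", 50), Node("s2", 50)]),
    Node("buf_b", 220, [Node("s3", 40), Node("s4", 45)]),
])
print(skew(root))  # prints 15
```

In this model, a CTS tool's job amounts to choosing buffer sizes, positions and routes so that the value returned by `skew` stays acceptably small.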
An IC may internally derive one or more “secondary” clock signals from an externally generated “primary” clock signal arriving at one of the IC's IO terminals. For example, FIG. 3 shows a clock logic circuit 22 processing a primary clock signal CLK1 arriving at an IO terminal 18 of an IC to produce a secondary clock signal CLK2. Separate clock trees 24 and 26 deliver the clock signals CLK1 and CLK2 to different sets of clock sinks 27 and 28. When designing clock trees 24 and 26, a CTS tool will adjust path delays through clock trees 24 and 26 to maintain an appropriate phase relationship between the two clock signals arriving at clock sinks 27 and 28. To do so, the CTS tool must account for the path delay through clock logic circuit 22.
Clock logic circuits implement various types of logic. For example, clock logic circuit 22 can be a simple inverter when the CLK1 and CLK2 signals are to be of similar frequency but dissimilar phase, or may be a divide-by-N counter when the CLK2 signal period is to be an integer multiple of the CLK1 signal period. Clock logic circuit 22 can also act as a “clock gate” that can turn the CLK2 clock signal on or off depending on state(s) of one or more input control signals (CONT). For example, as illustrated in FIG. 4, clock logic circuit 22 might include only an AND gate 32. When CONT is a “1”, CLK2 will have the same phase and frequency as CLK1, but when CONT is a “0”, CLK2 will be continuously low (off). ICs often include gated clock signals to halt operation of a particular portion of an IC for diagnostic purposes. Clock gates providing more complicated logic can have many control inputs. A clock logic circuit may have more than one clock signal input and may selectively derive a secondary clock signal from any one of its input primary clock signals. For example, as illustrated in FIG. 5, a clock logic circuit could include a multiplexer 34 for selectively deriving a secondary clock signal CLK3 from either of two input clock signals CLK1 and CLK2 depending on the state of a multiplexer control signal CONT.
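The kinds of clock logic described above can be sketched as simple functions operating on sampled clock waveforms (lists of 0s and 1s). This is only an illustrative behavioral model: the function names are hypothetical, the multiplexer polarity (CONT of 1 selecting CLK2) is an assumption not stated in the text, and the divide-by-N duty cycle is likewise an arbitrary choice.

```python
def inverter(clk1):
    """CLK2 has the same frequency as CLK1 but opposite phase."""
    return [1 - s for s in clk1]


def clock_gate(clk1, cont):
    """AND-gate clock gate: CLK2 follows CLK1 when CONT is 1, stays low when 0."""
    return [s & cont for s in clk1]


def clock_mux(clk1, clk2, cont):
    """Select CLK1 or CLK2 as CLK3 (assumption: CONT == 1 selects CLK2)."""
    return list(clk2) if cont else list(clk1)


def divide_by_n(clk1, n):
    """Count rising edges of CLK1 modulo n; output period is n CLK1 periods."""
    out, count, prev = [], 0, 0
    for s in clk1:
        if prev == 0 and s == 1:           # rising edge of CLK1
            count = (count + 1) % n
        out.append(1 if count >= (n + 1) // 2 else 0)
        prev = s
    return out


clk1 = [0, 1, 0, 1, 0, 1, 0, 1]
print(clock_gate(clk1, 0))   # prints [0, 0, 0, 0, 0, 0, 0, 0]
print(divide_by_n(clk1, 2))  # prints [0, 1, 1, 0, 0, 1, 1, 0]
```

The model ignores propagation delay entirely; in a real IC or emulator each of these circuits adds path delay that must be accounted for when balancing clock trees.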
Simulators and Emulators
As an IC design progresses through the HDL netlist, gate level netlist and layout stages, the designer will normally employ various tools to verify that an IC fabricated in accordance with the IC design will behave as expected. A computer-based circuit simulator creates a behavioral model of an IC based on a netlist description of the IC either at the HDL or gate level, and the simulator drives the model with simulated input signals so that the model will show how the IC's output signals would behave. Although a simulator can accurately predict the behavior of an IC based on the model, a simulator will normally require substantial amounts of computer processing time to model the behavior of a large IC over even a relatively short period of real time. To reduce processing time, designers often limit coverage of circuit simulations to relatively small custom-designed portions of an IC.
Designers have increasingly turned to circuit emulators to verify the behavior of an entire IC because a circuit emulator can do so more quickly than a circuit simulator. A circuit emulator uses programmable logic devices such as field programmable gate arrays (FPGAs) to emulate IC logic. A typical FPGA includes an array of programmable logic cells for emulating an IC's logic gates and clock sinks and includes programmable signal routing circuits for appropriately interconnecting the programmable logic cells and sinks to one another and to the FPGA's IO terminals.
FIG. 6 is a simplified plan view of a circuit emulator 35 including a circuit board 37 holding a set of eight FPGAs 36. A routing system 38, including for example traces and programmable routing devices mounted on circuit board 37, interconnects various IO terminals of FPGAs 36 with one another and with an interface circuit 39. Interface circuit 39 provides external equipment such as computers, signal generators and logic analyzers with access to IO and programming terminals of FPGAs 36. To program emulator 35 to emulate an IC, an external host computer programs FPGAs 36 to emulate separate portions of the IC and programs routing system 38 to appropriately route signals between FPGAs 36 and interface circuit 39. External test equipment can then test the emulated IC by transmitting test signals to FPGA terminals via interface circuit 39 and by monitoring FPGA output signals via interface circuit 39.
In addition to emulating the logic of an IC, emulator 35 must also emulate the IC's clock trees. A clock signal generator 40 on circuit board 37 supplies one or more clock signals to FPGAs 36 through lines of a clock signal bus 42 designed to provide a uniform path distance from clock signal generator 40 to all FPGAs 36 so that clock signal edges arrive concurrently at all FPGAs 36. Clock signal paths inside FPGAs 36 forward each clock signal to various clock sinks therein, also with as little skew as possible.
Since signal path delays between clock sinks within emulator 35 can exceed signal path delays between clock sinks within the IC being emulated, emulator 35 will typically emulate an IC at somewhat lower clock frequencies than the IC being emulated will use. Lowering clock signal frequency increases the time logic blocks have to process their input signals between edges of clock signals clocking their input and output clock sinks. However, even though a circuit emulator typically operates at lower clock frequencies than the IC being emulated, it can normally emulate IC logic much more quickly than a circuit simulator can simulate it.
While clock bus 42 and the clock trees within FPGAs 36 can emulate the function of balanced clock trees within an IC for conveying primary clock signals to clock sinks, they cannot emulate balanced clock trees for conveying secondary clock signals that the IC generates internally. FIG. 7 illustrates a clock signal distribution system including a clock logic circuit 47 in an FPGA 36A deriving a secondary clock signal CLK2 from a primary clock signal CLK1. Secondary clock signal CLK2 clocks sinks 44 in FPGA 36A and sinks 45 in FPGA 36B. Clock bus 42 of FIG. 6 and the internal clock trees of FPGAs 36 can emulate a clock tree for delivering primary clock signal CLK1 to the one of FPGAs 36 implementing clock logic 47 and to any other FPGA requiring the CLK1 clock signal. But clock bus 42 cannot forward a secondary clock signal CLK2 generated by one FPGA to other FPGAs. Instead, clock signal CLK2 must pass between FPGAs 36A and 36B through routing system 38. Since the paths from clock logic 47 to sinks 44 and 45 are not balanced, clock signal CLK2 can exhibit excessive skew.
FIG. 8 shows one prior art solution to this problem. When an IC has a secondary clock signal such as clock signal CLK2, the designer programs the emulator to replicate the clock logic circuit 47 of FIG. 7 within each FPGA 36A and 36B that is to receive the secondary clock signal so that each FPGA generates the CLK2 signal locally. Since it is not necessary for the emulator to distribute the secondary clock signal CLK2 from one FPGA to another, this approach reduces clock signal skew, at the cost of the FPGA logic resources needed to replicate the clock logic.
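The replication approach can be pictured as a small bookkeeping step over a partitioned design. The sketch below is a hypothetical illustration only: the function name, the dictionary-based representation of the partition, and the instance naming scheme are all assumptions and do not describe any particular emulator's programming software.

```python
def replicate_clock_logic(logic_name, sink_to_fpga):
    """Create one copy of the clock logic per FPGA that holds a CLK2 sink.

    sink_to_fpga maps each secondary-clock sink to the FPGA emulating it.
    Returns (instances, connections): instances maps FPGA -> local copy name,
    connections maps sink -> the local copy that drives it.
    """
    instances = {}
    connections = {}
    for sink, fpga in sorted(sink_to_fpga.items()):
        # Reuse the copy already placed in this FPGA, or create a new one.
        local_copy = instances.setdefault(fpga, f"{logic_name}_{fpga}")
        connections[sink] = local_copy
    return instances, connections


# Hypothetical partition: one sink in FPGA 36A, two sinks in FPGA 36B.
partition = {"sink_44": "36A", "sink_45a": "36B", "sink_45b": "36B"}
instances, connections = replicate_clock_logic("clk_logic_47", partition)
print(instances)
# prints {'36A': 'clk_logic_47_36A', '36B': 'clk_logic_47_36B'}
```

Each additional FPGA receiving the secondary clock adds one more copy of the clock logic, which is the resource cost noted above.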
FIG. 9 illustrates a logic block 52 communicating with external circuits through a synchronizing circuit including input and output flip-flops 51 and 53 and a clock logic circuit 50 for deriving a gated clock signal CLK2 for clocking flip-flop 53 from the clock signal CLK1. When an emulator emulates the synchronizing circuit, it can use an FPGA to emulate flip-flops 51 and 53. FIG. 10 shows how a typical FPGA implements flip-flop 51 of FIG. 9 using a pair of latches 54, 55, and an inverter 56, and FIG. 11 illustrates timing relationships between the various signals of FIG. 10. Latch 54 drives signal X at its Q output to the state of signal A at the D input of flip-flop 51 while the CLK1 signal is high and holds the X signal at its current state while the CLK1 signal is low. Latch 55 drives signal B at the Q output of flip-flop 51 to the state of signal X while the CLK1 signal is low and holds the B signal at its current state while the CLK1 signal is high. After sampling signal A on the trailing edge of CLK1, flip-flop 51 must hold the state of signal B long enough to allow logic block 52 time to drive its C output to the appropriate logic level. Clock logic circuit 50 must respond to the CLK1 signal edge by delivering a trailing edge of clock signal CLK2 to flip-flop 53 while signal C resides at its valid logic level.
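The two-latch flip-flop behavior described above can be sketched as a zero-delay behavioral model. The class names and the sample waveforms are hypothetical; the model assumes ideal latches evaluated once per time step and ignores all propagation delay.

```python
class Latch:
    """Level-sensitive latch: transparent while enabled, holds otherwise."""

    def __init__(self):
        self.q = 0

    def step(self, d, enable):
        if enable:
            self.q = d
        return self.q


class FlipFlop:
    """Flip-flop built from two latches and an inverter, as in the text."""

    def __init__(self):
        self.latch_54 = Latch()   # transparent while CLK1 is high
        self.latch_55 = Latch()   # transparent while CLK1 is low (via inverter)

    def step(self, a, clk1):
        x = self.latch_54.step(a, clk1 == 1)
        b = self.latch_55.step(x, clk1 == 0)
        return b


ff = FlipFlop()
clk1 = [0, 1, 0, 1, 0]
a = [1, 1, 0, 0, 0]
b = [ff.step(ai, ci) for ai, ci in zip(a, clk1)]
print(b)  # prints [0, 0, 1, 1, 0]
```

In this trace, signal B takes on the value that signal A had while CLK1 was high only after CLK1 returns low, which corresponds to sampling on the trailing edge of CLK1.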
FIG. 12 models signal path delays in the circuit of FIG. 9. A delay D1 models the path delay of the clock tree delivering the CLK1 signal to flip-flop 51, and a delay D2 models the total path delay from the CLK1 clock signal source to the CLK2 signal input of flip-flop 53, including the delay through clock logic circuit 50 of FIG. 9. Delay D3 models the time logic block 52 requires to drive signal C to a valid logic level in response to a change in state of signal B following an edge of the CLK1 signal. When D2<D1+D3, clock signal CLK2 will signal flip-flop 53 to sample signal C before it reaches a valid logic level. Thus D2 should be at least as large as D1+D3. However, if delay D2 is too long relative to the period of CLK1, a next edge of clock signal CLK1 may cause logic block 52 to change the state of signal C before flip-flop 53 can sample it. This is called a “hold time error” because logic block 52 fails to hold its output signal at a valid state long enough for flip-flop 53 to sample it. Thus it is necessary to control relationships between delays D1, D2 and D3 so that flip-flop 53 will always sample signal C at the right time.
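The delay relationships above can be expressed as a simple check. The lower bound, D2 at least D1+D3, comes directly from the text. The upper bound is an assumption made here for illustration: the sketch conservatively treats signal C as possibly changing as soon as the next CLK1 edge reaches flip-flop 51, that is, at D1 plus one CLK1 period. The function name and the numeric values are hypothetical.

```python
def check_sample_timing(d1, d2, d3, period):
    """Classify when flip-flop 53 samples signal C (all times in the same unit).

    d1: clock tree delay to flip-flop 51
    d2: total delay from the CLK1 source to the CLK2 input of flip-flop 53
    d3: time for the logic block to drive C valid after B changes
    period: CLK1 period (upper bound below is an illustrative assumption)
    """
    if d2 < d1 + d3:
        return "sampled too early"      # C has not yet reached a valid level
    if d2 >= d1 + period:
        return "hold time error"        # next CLK1 edge may already disturb C
    return "ok"


print(check_sample_timing(d1=2, d2=10, d3=5, period=20))  # prints ok
print(check_sample_timing(d1=2, d2=5, d3=5, period=20))   # prints sampled too early
print(check_sample_timing(d1=2, d2=25, d3=5, period=20))  # prints hold time error
```

A more careful model would replace the assumed upper bound with one based on the minimum delay through the logic block and the hold requirement of flip-flop 53.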
Although delay D3 is fixed by the nature of logic block 52, a clock tree synthesis (CTS) tool can accurately control the relationships between delays D1, D2 and D3 because it can adjust delays D1 and D2. However, in an emulator, delays D1, D2 and D3 are fixed by the emulator architecture and there is no opportunity to precisely adjust any of those delays to ensure that edges of clock signals CLK1 and CLK2 clock flip-flops 51 and 53 with appropriate relative timing.
What is needed is a method for processing a netlist describing an IC to identify various kinds of clocking problems and for modifying the netlist to resolve them, where possible, so that an emulator can emulate the IC described by the netlist.