This invention relates to static analysis of integrated circuit designs.
Prototyping a VLSI (very large scale integrated circuit) design is extremely expensive: fabbing (fabricating) a pass of a prototype full-custom VLSI chip would take several months and would cost several hundred thousand dollars. If the chip design is flawed, the chip itself is almost impossible to probe to isolate the problem and determine corrections to the design. For this reason, virtually all VLSI chips are designed and thoroughly verified by software modelling before the first actual silicon is fabbed.
A timing verifier is one program in the suite of tools used by a VLSI designer. Timing verification is the process of analyzing the circuit model to ensure that the signals propagate through the logic quickly enough to meet the timing requirements at a specified clock frequency. (A timing verifier may also have bundled in other analyses, for instance for race conditions or other logic problems.) Once the circuit has been largely designed using other tools of the suite, the timing verifier is used to improve it, e.g., to eliminate bottlenecks that would force the circuit to be run at a slow clock frequency. The timing verifier takes as input a description of the circuit and its interconnections, the impedances and/or loading of the wires, specifications of the devices in the logic path, and descriptions of the clocked elements, and produces as its output timing of the slowest paths, i.e., the "critical paths", from which the designer can deduce the maximum clock frequency at which the circuit can be run. The designer can then redesign the critical paths to speed them up, thus speeding up the entire circuit. This process is typically iterative: the designer runs the timing verifier, and modifies his circuit design using the information generated. He repeats this process until the number of critical paths with the same timing limit is so large that reducing the time of all of them becomes impractical.
In a synchronous integrated circuit (IC) design, major signals are captured in latches at clock edges and are held at stable values when and while the clock is deasserted. The value of the signal at the output of a latch, a latched signal, is only allowed to change during the time the clock signal is asserted. During the time the clock is asserted, changes on the D input to the latch immediately propagate through the latch to the Q output; thus the clock assertion is said to make the latch transparent. The latched signals propagate downstream through combinatorial logic to other latches. The timing verifier reports any latches (or other clocked elements) whose inputs are not stable in time to meet the requirements of the latch's clock.
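The transparent behaviour described above can be illustrated with a small Python sketch. It is a simplified model only: the function name `latch_trace` and the representation of waveforms as sampled lists are assumptions made for illustration and do not come from the disclosure.

```python
def latch_trace(d_values, clk_values, q_initial=0):
    """Model a level-sensitive latch over a sequence of samples.

    While the clock sample is asserted (truthy) the latch is transparent and
    Q follows D; while it is deasserted, Q holds its last value.
    """
    q = q_initial
    trace = []
    for d, clk in zip(d_values, clk_values):
        if clk:          # transparent: D propagates straight to Q
            q = d
        trace.append(q)  # opaque: Q keeps the previously latched value
    return trace


# Hypothetical sampled waveforms (illustrative values only).
d_samples = [1, 0, 1, 1, 0, 0]
clk_samples = [0, 1, 1, 0, 0, 1]
q_samples = latch_trace(d_samples, clk_samples)
print(q_samples)
```

Q changes only at samples where the clock is asserted, which is why a downstream signal must be treated as potentially unstable for the whole time the latch is transparent.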
FIG. 1 depicts a simple illustrative circuit, which will be considered under a simplified model of timing constraints and design rules. Two input signals A 100 and B 102 are latched by latches 108 and 110. Thus, signals A' 112 and B' 114 are stable except when the two latches 108 and 110 are transparent, which occurs when clocks Ck.sub.A 104 and Ck.sub.B 106 are asserted. Once A' and B' have been latched, they remain stable, and combinatorial logic CL.sub.1 116, CL.sub.2 120, and CL.sub.3 122 compute signals Y 124 and Z 126. Each of CL.sub.1, CL.sub.2, and CL.sub.3 imposes a certain delay in this computation. The downstream part of the design (not shown) relies on Y 124 and Z 126 being latched by latches 132 and 134 on clocks Ck.sub.Y 128 and Ck.sub.Z 130. Thus, CL.sub.1, CL.sub.2, and CL.sub.3 must be fast enough to meet the setup requirements of latches 132 and 134.
FIG. 2 presents a timing diagram for the circuit of FIG. 1. The first three lines show the clocks Ck.sub.A 104, Ck.sub.B 106, Ck.sub.Y 128, and Ck.sub.Z 130. In this example, A and B are latched on the same clock. Signals A and B must be stable far enough before the falling edge of Ck.sub.A /Ck.sub.B 206 to accommodate a "setup time" 208, a characteristic of latches 108 and 110. Once latches 108 and 110 become transparent during Ck.sub.A /Ck.sub.B 204 (assuming that the setup time and the data-to-output time of the latches are equal), signals A' and B' are allowed to transition until they are latched on the falling edge of Ck.sub.A /Ck.sub.B 206. A' and B' drive CL.sub.1, CL.sub.2, and CL.sub.3, which in turn produce signals X, Y, and Z. Under the simplified timing rules, the timing constraints of the circuit are satisfied if two conditions hold. First, the propagation delay 208 of latch 108, plus the propagation delays through CL.sub.1 216 and CL.sub.2 220, plus the setup time 232 of latch 132, must be less than the time from the fall of clock Ck.sub.A /Ck.sub.B to the fall of clock Ck.sub.Y 228. Second, the propagation delay 208 of latch 110, plus the propagation delays through CL.sub.1 216 and CL.sub.3 222, plus the setup time 234 of latch 134, must be less than the time from the fall of clock Ck.sub.A /Ck.sub.B to the fall of clock Ck.sub.Z 230. The paths A'--CL.sub.2 --Y and B'--CL.sub.3 --Z must also meet the timing requirements of latches 132 and 134, but these will be trivially satisfied because they are clearly faster than paths A'--CL.sub.1 --X--CL.sub.2 --Y and B'--CL.sub.1 --X--CL.sub.3 --Z. When all these conditions are satisfied, the circuit is said to pass timing verification.
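The simplified pass/fail rule for one path can be written as a single inequality: latch delay plus the sum of the logic delays plus the setup time of the receiving latch must be less than the interval between the two falling clock edges. The following Python sketch applies that rule. The function name `meets_setup` and all delay values are hypothetical; none of the numbers are taken from FIG. 2.

```python
def meets_setup(latch_delay, logic_delays, setup_time, clock_window):
    """Simplified rule: the path passes if the total delay fits in the window.

    latch_delay  -- propagation delay of the launching latch
    logic_delays -- delays of the combinatorial blocks along the path
    setup_time   -- setup time of the capturing latch
    clock_window -- time from the launching clock's fall to the capturing clock's fall
    """
    return latch_delay + sum(logic_delays) + setup_time < clock_window


# Hypothetical delays in ns (assumed for illustration only).
# Path A' -> CL1 -> X -> CL2 -> Y, captured by latch 132.
path_y_ok = meets_setup(latch_delay=1.0, logic_delays=[4.0, 3.0],
                        setup_time=1.0, clock_window=10.0)
# Path B' -> CL1 -> X -> CL3 -> Z, captured by latch 134.
path_z_ok = meets_setup(latch_delay=1.0, logic_delays=[4.0, 5.0],
                        setup_time=1.0, clock_window=10.0)

print(path_y_ok, path_z_ok)
```

With these assumed values the Y path totals 9 ns and passes, while the Z path totals 11 ns and would be reported as a violation.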
If the circuit fails timing verification, the timing verifier will report the critical paths that failed. Either the logic on the slow paths needs to be redesigned to be faster, or the clock frequency needs to be slowed down to accommodate the timing of the circuit.
Timing verifiers operate on one of two general paradigms: dynamic or static.
In dynamic timing verification, the circuit design is simulated through time. The engineer must determine model input stimuli with which to drive the circuit model, called test vectors. Applying dynamic timing verification to the sample circuit of FIG. 1, the timing verifier would successively apply twelve stimuli where either A or B or both undergo transitions: AB->AB={00->01, 00->10, 00->11, 01->00, 01->10, 01->11, 10->00, 10->01, 10->11, 11->00, 11->01, 11->10} and run a loop to simulate time, during which model clock Ck.sub.A /Ck.sub.B would undergo several transitions. The circuit model would be operated through time to see at what time signals Y and Z stabilize. Dynamic timing verification is effective in that it is capable of diagnosing all timing problems, at least for the test vectors applied. But in modern circuit designs, the super-exponential combinatorics on tens of thousands of signals is fatal to the dynamic approach: there simply isn't time to test all possible combinations of inputs (most of which would never arise in actual operation), nor for a human to filter out a set of meaningful test vectors that will test all the effective paths.
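The twelve stimuli listed above are simply every ordered pair of distinct input states for two binary signals. A brief Python sketch, with names chosen here for illustration only, enumerates them and shows how quickly the count grows with the number of inputs:

```python
from itertools import product

# All states of the two inputs A and B, as strings "00", "01", "10", "11".
states = ["".join(bits) for bits in product("01", repeat=2)]

# A stimulus is a transition between two different states.
stimuli = [(before, after)
           for before in states
           for after in states
           if before != after]


def stimulus_count(n_inputs):
    """Number of single-step transitions for n binary inputs: 2^n * (2^n - 1)."""
    n_states = 2 ** n_inputs
    return n_states * (n_states - 1)


print(len(stimuli))        # 12 for the two inputs of FIG. 1
print(stimulus_count(20))  # already over 10**12 for only twenty inputs
```

This counts only single transitions between input states; sequences of transitions, which a real simulation would also need to cover, grow faster still.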
In the second paradigm, static analysis, there is no loop simulating the passage of time. Static analysis is to dynamic analysis as theorem proving is to case analysis: instead of attempting to simulate a "large enough" number of specific cases, a static timing verifier "reasons" about the circuit model and draws inferences about whether the circuit will meet its timing constraints. This generally involves analyzing every node--i.e., every wire--in a circuit and calculating transition times based on the arrival time of inputs and the propagation delay through the structures. As the times of the transitions of the inputs to a node are analyzed, only the latest transition (in time) is saved, and the algorithm immediately stops tracing any path that is known not to be the worst case. This process, called information pruning, is required to keep the execution times reasonable.
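The rule of keeping only the latest transition at a node amounts to taking a maximum over the node's drivers. A minimal Python sketch follows; the function name `latest_arrival` and the sample values are assumptions made for illustration.

```python
def latest_arrival(drivers):
    """Return the worst-case (latest) arrival time at a node.

    drivers -- iterable of (input_arrival_time, propagation_delay) pairs,
               one pair per structure driving the node.
    """
    return max(arrival + delay for arrival, delay in drivers)


# Hypothetical node driven by two structures (times in ns).
# One input arrives at 12 ns through 1 ns of logic, the other at 8 ns through 1 ns.
worst = latest_arrival([(12, 1), (8, 1)])
print(worst)  # 13: only this latest transition needs to be kept
```

Because only the maximum is retained, earlier-arriving transitions need not be traced any further downstream, which is the basis of the pruning described above.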
One known algorithm for static timing verification is a depth-first search (DFS) of the circuit starting at each signal guaranteed on a clock edge, labelling each node with the currently best-locally-known worst-case timing information. After all nodes have been labelled, a second pass examines all timing constraints to tell the designer whether the circuit as a whole meets its timing constraints.
Consider the circuit of FIG. 3, in which a first stage of the circuit has two paths of different delay times, which join at a multiplexer. The output of the multiplexer fans out in a second stage of two paths of different delay times, which are joined at a second multiplexer. The DFS algorithm represents each node of a circuit by a data structure as shown in FIG. 4. The node has a name, a "worst case arrival time," and a pointer to the node that drove this worst-case transition.
FIGS. 5a-e depict a DFS analysis of the circuit of FIG. 3: FIG. 5a shows a time-sequence of stack states, and FIGS. 5b-e show a time sequence of states of data structures.
In the DFS algorithm, the graph of the nodes of the circuit is walked in a depth-first order. The algorithm's walker maintains a "current arrival time" and a stack of nodes. (Note that, since this is a static analyzer, the arrival time does not "tick" off time incrementally; it moves forward and back by the discrete amounts of delay of the logic walked.) The DFS walker pushes nodes onto the stack as it traces paths downstream, and pops them as it unwinds back upstream. The walker increments its arrival time as it walks downstream through logic by the time delay of the logic, and decrements it by the same amount as it unwinds back. As the algorithm pushes each node, if the walker's arrival time is later than the current "worst case arrival time" (or simply ".time") of the node, then the node is updated with the value of the DFS arrival time, and the node's "worst case predecessor" (or simply ".predecessor") is pointed at the predecessor node down which the DFS walk came, and the DFS continues down the successor nodes. If the DFS arrival time is equal to or earlier than the current node's worst case arrival time, the probe of this path is abandoned, and the node is popped off the stack.
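The walk described above may be sketched in Python as follows. This is an illustrative reading of the description, not the implementation of any particular verifier. The names `Node` and `dfs_label` are assumptions. The recursion's call stack stands in for the explicit node stack, and passing `arrival + delay` to each recursive call stands in for incrementing and decrementing the walker's arrival time. The edge delays and the starting time of 1 at node A are inferred from the arrival times quoted in the step-by-step walk of FIG. 3 (for example, 11 ns from A to C and 1 ns from C to D), and should be read as assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    name: str
    time: Optional[int] = None         # worst-case arrival time (".time")
    predecessor: Optional[str] = None  # driver of that transition (".predecessor")


def dfs_label(nodes: Dict[str, Node],
              successors: Dict[str, List[Tuple[str, int]]],
              name: str, arrival: int,
              came_from: Optional[str] = None) -> None:
    """Label nodes with worst-case arrival times, pruning non-critical paths."""
    node = nodes[name]
    # Prune: this arrival is no later than the one already recorded.
    if node.time is not None and arrival <= node.time:
        return
    # Relabel the node, then (re)walk everything downstream of it.
    node.time = arrival
    node.predecessor = came_from
    for succ, delay in successors.get(name, []):
        dfs_label(nodes, successors, succ, arrival + delay, name)


# Successor lists in visiting order, with delays in ns (inferred, see above).
successors = {
    "A": [("C", 11), ("B", 7)],
    "C": [("D", 1)],
    "B": [("D", 1)],
    "D": [("E", 13), ("F", 19)],
    "E": [("G", 3)],
    "F": [("G", 1)],
}
nodes = {n: Node(n) for n in "ABCDEFG"}
dfs_label(nodes, successors, "A", arrival=1)

for n in nodes.values():
    print(n.name, n.time, n.predecessor)
```

Under these assumed delays, G ends with time 33 and predecessor F, and the probe that reaches D through B at time 9 is abandoned because D is already labelled with 13.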
In FIG. 5a, each column depicts a step 300 identified by number, and the value of the DFS arrival time 302 during that step. The state of the DFS stack 304 is also shown, with the top of the stack in bold. The term "labelled" is used to describe information permanently (though overwritably) stored in the representation of the circuit. "Unvisited" is used in a local sense: a node is unvisited if it has not been visited via the current path, even if it has been previously visited via a different path.
step 1: FIG. 5b shows the configuration of the nodes for the circuit of FIG. 3 as the algorithm visits the first node of the circuit, node A 310. All the node names have been filled in. A.predecessor and A.time have been filled in (by the process about to be described in detail).
step 2: Assume that A's list of successor nodes is ordered such that the algorithm visits C, then B. Thus, the algorithm walks to node C. Since the logic connecting A to C, CL.sub.2, consumes 11 ns, the DFS algorithm carries the arrival time 12 as it arrives at C. The algorithm, finding C not already labelled, labels C.time with 12 and points C.predecessor to A.
step 3: The only successor of C is D, through logic consuming 1 ns, so the algorithm proceeds to D, sets D.time to 13, and points D.predecessor to C. Assume that D's list of successor nodes is ordered such that the algorithm visits node E, then F.
step 4: Node E is filled in with time 26 and predecessor D.
step 5: Node G is filled in with time 29 and predecessor E. The walk would continue downstream from node G.
The intermediate state after step 5 is shown in FIG. 5c. The "worst-case arrival times" 322 have been filled in with a preliminary estimate of the latest transition time. The .predecessor pointers 320 show a preliminary estimate of the critical path to G, A--C--D--E--G. After the algorithm has visited all downstream logic and popped its stack to G:
step 6: DFS pops its stack back to E. E has no unvisited successors.
step 7: DFS pops its stack back to D. D has an unvisited successor, F.
step 8: Node F is filled in with time 32 and predecessor D.
step 9: When DFS arrives at node G with arrival time 33, it finds the node already labelled, but with a time earlier than the current DFS arrival time. Thus, G is updated with time 33, and G.predecessor is updated to point to node F. Note that pointing G.predecessor from E to F "prunes" from the graph all analysis downstream of E that was computed between steps 5 and 6. The algorithm has proved that E cannot possibly be on the critical path to G nor any node downstream of G. Because G has been relabelled, the nodes downstream of G must be walked again to have their times updated.
The intermediate state after step 9 is shown in FIG. 5d.
step 10: DFS pops its stack back to node F.
step 11: DFS pops its stack back to node D. D has no unvisited successors.
step 12: DFS pops its stack back to node C.
step 13: DFS pops its stack back to node A. The next unvisited successor of A is B.
step 14: B is labelled with time 8 and predecessor A.
step 15: DFS arrives at node D with arrival time 9. The arrival time is earlier than the current time of node D; thus, the algorithm stops probing along this path: all paths downstream of node D through node B are also said to be "pruned." By the same reasoning used in step 9, the algorithm has proved that the critical path to all nodes downstream of D must pass through C, not B.
step 16: DFS pops its stack back to node B.
step 17: DFS pops its stack back to node A. Node A now has no unvisited successors.
Finding no unvisited successors of A, the DFS algorithm is complete. The result of the algorithm is the critical path graph of FIG. 5e. The critical path to any node can be discovered by tracing the .predecessor pointers back from that node; e.g., the critical path to G is seen to be A--C--D--F--G. The critical path graph will be of the form of a forest of trees, each tree rooted at one of the input nodes or interior latches. Paths B--D and E--G have been pruned; no larger path that would have used these paths will be analyzed.
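Tracing the .predecessor pointers can be sketched as a short loop. In the Python fragment below, the dictionary holds the final predecessor of each node as stated in the walk-through above (A is a root and has none); the function name `critical_path` is an assumption for illustration.

```python
# Final ".predecessor" pointers after the DFS of the circuit of FIG. 3.
predecessor = {
    "A": None,
    "B": "A",
    "C": "A",
    "D": "C",
    "E": "D",
    "F": "D",
    "G": "F",
}


def critical_path(node, predecessor):
    """Follow predecessor pointers upstream and return the path root-first."""
    path = []
    while node is not None:
        path.append(node)
        node = predecessor[node]
    path.reverse()
    return path


print("--".join(critical_path("G", predecessor)))  # A--C--D--F--G
```

Since every node has at most one predecessor pointer, the pointers form a forest, and each trace runs back to a single root.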
There may be multiple critical path graphs built for a single circuit, for instance one for a rising clock edge and one for a falling edge. Each node will have at most a single out-edge pointing to the latest-transitioning driver node for the given clock edge (or to one of several equally late-transitioning driver nodes). The critical path graphs superimpose without effect on each other. Without loss of generality, the disclosure will discuss single critical path graphs.
Once the timing verifier has identified the critical path to every node, the designer will redesign parts of the circuit to speed up the logic on the critical path, and then run the timing verifier again. If the designer successfully speeds up a structure on the critical path, subsequent runs of the timing verifier on the altered circuit will very likely produce a different critical path graph.
Pruning is essential to making static analysis practical. A naive DFS walk of a circuit would take time exponential in the number of edges between the nodes of the circuit. Though it is possible to construct artificial examples in which DFS algorithms, even with pruning, exhibit exponential time complexity, in practice pruning reduces the time complexity from exponential to nearly linear. With pruning, a single run of DFS and violation sorting of a full microprocessor design can take about fifteen CPU minutes. Without pruning, such analysis would be infeasible.
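The effect of pruning on running time can be seen on a small artificial circuit. The Python sketch below builds a chain of two-path stages and counts node visits with and without pruning. The circuit, the delays, and the names `build_chain` and `count_visits` are all hypothetical. The slower branch of each stage is deliberately listed first, which is the favorable order for pruning.

```python
def build_chain(stages):
    """Chain of 'diamond' stages: J[i] fans out to U[i] (slow) and V[i] (fast),
    which rejoin at J[i+1]. Delays are arbitrary illustrative values."""
    succ = {}
    for i in range(stages):
        succ[f"J{i}"] = [(f"U{i}", 2), (f"V{i}", 1)]  # slow branch visited first
        succ[f"U{i}"] = [(f"J{i + 1}", 1)]
        succ[f"V{i}"] = [(f"J{i + 1}", 1)]
    return succ


def count_visits(succ, start, prune):
    """Walk the graph depth-first; return (number of node visits, arrival times)."""
    times = {}
    visits = 0

    def walk(name, arrival):
        nonlocal visits
        visits += 1
        known = times.get(name, float("-inf"))
        if prune and arrival <= known:
            return  # cannot be the worst case: abandon this probe
        times[name] = max(arrival, known)
        for nxt, delay in succ.get(name, []):
            walk(nxt, arrival + delay)

    walk(start, 0)
    return visits, times


chain = build_chain(10)
pruned_visits, pruned_times = count_visits(chain, "J0", prune=True)
naive_visits, naive_times = count_visits(chain, "J0", prune=False)
print(pruned_visits, naive_visits)  # 41 versus 4093 for ten stages
```

Both walks arrive at the same worst-case times. In this example the pruned walk grows linearly with the number of stages, while the naive walk roughly doubles with each added stage.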
Static timing verifiers consider clocks as a distinct class of inputs from all other signals. Clocks are treated as independent variables--as the givens of the system. The times of all other signals are stated relative to the clock's phase boundaries.
Some systems have two (or more) subsystems operating at different frequencies, both derived by frequency dividing a single primary clock. For example, the frequency dividers 602 and 604 of FIG. 6 convert a 20 ns symmetric primary clock 600 into a fast 40 ns cycle time divided into four phases of clocks 610-613, and a slow 120 ns cycle time divided into three phases of clocks 620-622. The fast cycle has four phases, .phi..sub.1F, .phi..sub.2F, .phi..sub.3F and .phi..sub.4F, with each of the four phases asserted for 10 ns. The slow cycle has three phases, .phi..sub.1S, .phi..sub.2S and .phi..sub.3S, each phase asserted for 40 ns. The fast cycle might be used in the fast-executing CPU core while the slow cycle might be used in peripheral bus operations.
Known timing verifiers analyze systems relative to a single synchronous clock. FIG. 7 shows a timing diagram in the frame of reference in which these known timing verifiers analyze the circuit of FIG. 8 when the circuit is clocked by the clocks generated in FIG. 6. In this circuit, latch L.sub.1 drives combinatorial logic CL.sub.1, which in turn drives latch L.sub.2. Latches L.sub.1 and L.sub.2 are each clocked by a signal derived by ANDing selected clock pulses of FIG. 6. Input signal A just meets the setup time requirements of latch L.sub.1, which is transparent when Z.sub.1 =.phi..sub.4F .multidot..phi..sub.1S is asserted. Thus, B must be assumed unstable during Z.sub.1. Because of the "single synchronous clock" constraint, the circuit must be analyzed relative to the fast .phi..sub.F clock, as shown in FIG. 7. Node B is unstable 630 during the time that latch L.sub.1 is transparent 632; in the four-phase .phi..sub.F system, this must be modelled as the time that clock .phi..sub.4F is asserted 634. Node C settles 10 ns later 636. In the four-phase system, latch L.sub.2 must be modelled as transparent 638 during the time that clock .phi..sub.1F is asserted 640. In the four-phase system, C settles too late to satisfy the set-up time requirements of latch L.sub.2, and thus the timing verifier reports a timing violation on L.sub.2. Known timing verifiers do not represent the relationship between Z.sub.1 and Z.sub.2, and therefore do not discern the additional four phases' delay between them.
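The difference between the gated clock Z.sub.1 and the fast phase that stands in for it can be illustrated by computing assertion windows over one 120 ns slow cycle. The Python sketch below rests on assumptions not confirmed by the text, since FIG. 6 is not reproduced here: both cycles are taken to start at t = 0, the phases are taken to be asserted in numerical order without overlap, and each phase is taken to be asserted for its full share of the cycle. The function names are hypothetical.

```python
def phase_windows(cycle_ns, n_phases, phase_index, horizon_ns):
    """Assertion windows (start, end) of one phase, repeated up to the horizon.

    Assumes equal-width, non-overlapping phases asserted in order from t = 0.
    """
    width = cycle_ns // n_phases
    return [(start + phase_index * width, start + (phase_index + 1) * width)
            for start in range(0, horizon_ns, cycle_ns)]


def intersect(a, b):
    """Windows during which both signals are asserted (logical AND)."""
    out = []
    for s1, e1 in a:
        for s2, e2 in b:
            start, end = max(s1, s2), min(e1, e2)
            if start < end:
                out.append((start, end))
    return out


HORIZON_NS = 120  # one slow cycle, which spans three fast cycles

phi_4f = phase_windows(40, 4, 3, HORIZON_NS)   # fourth fast phase
phi_1s = phase_windows(120, 3, 0, HORIZON_NS)  # first slow phase
z1 = intersect(phi_4f, phi_1s)                 # Z1 = phi_4F AND phi_1S

print(phi_4f)  # asserted three times per slow cycle
print(z1)      # asserted once per slow cycle
```

Under these assumptions, the fast phase is asserted three times in every slow cycle while Z.sub.1 is asserted only once, which is the information lost when the circuit is analyzed relative to the fast clock alone.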
Known timing verifiers have had facilities by which a user can, within the previous four-phase system, describe particular paths as being "false paths"--i.e., paths that, for reasons known to the user, will never occur in practice. Once the path from L.sub.1 to L.sub.2 has been identified as a false path, the timing verifier can modify its pruning method and analyze the next-most-critical path. However, this introduces a failure-prone manual step. Even with this feature, the timing verifier spuriously reports the path from L.sub.1 to L.sub.2 as a failure. The engineer must analyze the report, and then add this failure to a list of known spurious failures to ignore. He may subsequently modify the circuit in such a way that a previously-reported and now-ignored failure becomes important. But the timing verifier does not discover or communicate the new urgency of the failure.