1. Field of the Invention
This invention is generally pertinent to the field of Static Timing Analysis.
2. Description of the Background Art
Static Timing Analysis is the process of verifying the timing correctness of a digital circuit during one clock cycle, without the need for simulation. A longest path analysis and a shortest path analysis are performed on the combinational logic that spans between a launching register and a capturing register in order to determine, in the worst case, whether a signal arrives during the active pulse of a clock.
For example, consider the path spanning between the Q1 output of flip-flop FF1 and the D2 input of flip-flop FF2, in FIG. 1. Assuming that both flip-flops are rising edge triggered, the data will be launched by Clk1 at time 0, and will captured by Clk2 at time 15. The arrival and required times at terminals Q1 and D2 will be stored in memory as follows: Arr(Q1)=0, Arr(D2)=12, Req(Q1)=3, and Req(D2)=15.
A timing analyzer maintains the arrival time and the required time at each terminal. These are stored for both Rise and Fall transitions as well as Late and Early mode. Assuming a floating point number is used for each of the timing values, there is a need for eight floats at each pin of the design. This number is even higher for designs with multiple clocks, because multiple arrival/required times (one for each source clock) may be needed.
In multiple clock designs, it is important to keep track of the triggering clock edges at the source and target flip-flops of each path. This allows the timer to compute the total time allowed for a path. For example, in FIG. 1, if FF2 were triggered by Clk1 then the allowed time for the path becomes, Req(D2)=30, instead of 15.
In a timing analyzer, the design is represented by an acyclic directed graph. In this timing graph, nodes represent pins in the design and edges represent delays along nets and library cells. Arrival times are propagated forward in the timing graph from primary inputs or flip-flop outputs, to primary outputs or flip-flop inputs. At each intermediate pin in the design, the arrival times are combined by picking the most dominant. This is complicated by the presence of multiple clocks, since arrival times emanating from different clocks cannot be combined, unless the source and target clock relationship is analyzed a priori. This analysis is not trivial and constitutes the core of the present invention.
For all timing analyzers the Applicant is presently aware of, (including IBM""s Einstimer and Cadence""s Pearl(3)), paths starting with different clock edges are analyzed separately. This can be done in two ways. The first approach is to traverse the timing graph several times, one for each clock source edge. The second approach is to traverse the network once and combine the arrival times in several bins according to their clock source edges.
The two approaches constitute a trade-off between performance and memory requirement. The first approach, relinquishes the memory of paths that have been traversed, however it is inefficient since some sub_paths will have to be traversed several times; one for each source triggered by a different clock edge. Another shortcoming of this approach is that it cannot be used in timing driven-tools environment, (such as incremental timing during logic synthesis), where timing values need to be kept in memory at all times in order to be updated. The second approach is efficient, however, the memory required to store the timing values in separate bins could grow significantly.
The technique of the present invention combines the benefits of the two approaches described above. Arrival times are propagated and combined by traversing the network only once, while the memory required is kept to a minimum. Two additional initialization traversals are also needed. This technique will be described in detail hereinafter, however an example is given below in order to illustrate the difference between it (FIG. 3), and the available art (FIG. 2).
Both FIG. 2 and FIG. 3 show the propagation of arrival and required time on a simple design with two timing paths: path1, Q1-A-Y-D3 is triggered by Clk1, and path2, Q2-B-Y-D3 is triggered by Clk2. Clk2 is similar to Clk1 except that it is shifted in time. For the sake of clarity the following simplifying assumptions will be made: 1) The net delays are zero, and the gate delays for both timing segments (A-Y and B-Y) are equal to one time unit; 2) The flip-flops are rising edge triggered; and 3) Arrival/Required times are stored on the nets rather than pins.
FIG. 2 shows the propagation of arrival and required times along the two paths, according to the method of the available art. The propagation of arrival/required times on path1 and path2, are kept separate. Path1 times are with respect to Clk1, and path2 times are with respect to Clk2. At the output of the AND gate (y), the arrival times are not combined and therefore two separate bins are needed. Note that the required time at pin D3 is different for path1 than for path2 because of the different source-to-target clock relationship, as shown in the timing diagram of FIG. 2. Note that the slack is defined as the difference between the required and arrival times, and is not stored in memory.
FIG. 3 shows the propagation of arrival and required times according to the inventive technique. Instead of carrying the timing values of the common sub-path (Y-D3) separately, they are merged into one arrival/required with respect to Clk2 (the common target clock for both paths). The invention provides a step which indicates how to choose, at each pin in the design, the clock edge with respect to which timing values should be represented. In order to combine the arrival times of both paths at pin Y, the arrival time at pin A needs to be adjusted before it can be propagated to pin Y. Similarly, the required time at pin Y needs adjustment before propagation to pin A.
It is clear from this example that the advantage of this technique is the memory saved at pins Y and D3. This is done at the expense of a minor overhead in converting the timing values from one clock base to another.
The advantage of this invention over the prior art is that it combines the benefits of the two approaches referred to above. It provides a technique for reducing the memory required to store the timing values needed to perform a static timing analysis without the overhead of having to traverse the network repetitively for each clock source edge. The inventive technique produces significant savings for designs with multiple clocks, and it works even in the presence of multi-cycle paths. The technique does not however, reduce the memory required for designs in which all the flip-flops are clocked by the same edge of a single clock.