A goal of a statistical static timing analysis (SSTA) is to determine the latest and earliest possible switching time distributions of various signals within a digital circuit. SSTA may generally be performed at the transistor level or at the gate level, using pre-characterized library elements including those at higher levels of abstraction for complex hierarchical chips.
SSTA algorithms operate by first levelizing the logic structure and breaking any loops in order to create a directed acyclic graph (timing graph). Modern designs can often contain millions of placeable objects, with corresponding timing graphs having millions, if not tens of millions, of nodes. For each node, a corresponding arrival time (AT), transition rate (slew), and required arrival time (RAT) are computed for both rising and falling transitions as well as for early and late mode analysis. Each such value may be represented in general as a distribution, for instance, using a first-order canonical form, wherein timing quantities are represented as functions of underlying sources of variation, as described in U.S. Pat. No. 7,428,716 to Visweswariah. An arrival time (AT) distribution represents the latest or earliest time at which a signal can transition due to the entire upstream fan-in cone. The slew distribution represents the transition rate associated with a corresponding AT, and a required arrival time (RAT) distribution represents the latest or earliest time at which a signal must transition due to timing constraints in the entire downstream fan-out cone.
ATs are propagated forward in a levelized manner, starting from the design primary input asserted (i.e., user-specified) arrival times, and ending at either primary output ports or intermediate storage elements. For single fan-in cases,

$$AT_{\text{sink node}} = AT_{\text{source node}} + \text{delay from source to sink}.$$
Whenever multiple signals merge, each fan-in contributes a potential arrival time computed as

$$AT_{\text{sink (potential)}} = AT_{\text{source}} + \text{delay},$$

making it possible for the maximum (late mode) or minimum (early mode) of all potential arrival times to be statistically computed at the sink node. Typically, an exact delay function for an edge in a timing graph is not known, but instead only a range of possible delay functions can be determined between some minimum delay and maximum delay. In this case, maximum delay functions are used to compute late mode arrival times and minimum delay functions are used to compute early mode arrival times.
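The forward propagation described above can be sketched as follows. This is a minimal illustration, not an SSTA implementation: it assumes scalar arrival times and delays in place of distributions, so the statistical maximum and minimum reduce to the ordinary `max` and `min`. The function name, the edge dictionary layout, and the example graph are hypothetical.

```python
from graphlib import TopologicalSorter


def propagate_arrival_times(edges, asserted_ats, late=True):
    """Forward-propagate scalar arrival times through a timing graph.

    edges:        {(source, sink): (min_delay, max_delay)}  (assumed layout)
    asserted_ats: {primary_input: asserted arrival time}
    """
    merge = max if late else min   # late mode keeps the latest potential AT
    which = 1 if late else 0       # late mode uses the maximum delay
    fanin = {}
    for source, sink in edges:
        fanin.setdefault(sink, set()).add(source)
        fanin.setdefault(source, set())
    at = dict(asserted_ats)
    # static_order() yields each node only after all of its fan-in nodes.
    for node in TopologicalSorter(fanin).static_order():
        if node not in at:
            at[node] = merge(at[src] + edges[(src, node)][which]
                             for src in fanin[node])
    return at


# Hypothetical graph: inputs a and b merge at c, which drives d.
AT_EDGES = {("a", "c"): (1, 2), ("b", "c"): (2, 4), ("c", "d"): (1, 1)}
late_at = propagate_arrival_times(AT_EDGES, {"a": 0, "b": 0}, late=True)
early_at = propagate_arrival_times(AT_EDGES, {"a": 0, "b": 0}, late=False)
print(late_at["d"], early_at["d"])  # prints: 5 2
```

In a statistical setting, each scalar would be replaced by a canonical-form distribution and `max`/`min` by their statistical counterparts; the traversal order would stay the same.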
RATs are computed in a backward levelized manner starting from either asserted required arrival times at the design primary output pins, or from tests (e.g., setup or hold constraints) at internal storage devices. For single fan-out cases,

$$RAT_{\text{source node}} = RAT_{\text{sink node}} - \text{delay}.$$
When multiple fan-outs merge (or when a test is present), each fan-out (or test) contributes a prospective RAT, enabling the minimum (late mode) or maximum (early mode) required arrival time to be computed statistically at the source node. When only a range of possible delay functions can be determined, a maximum delay function is used to compute late mode required arrival times and a minimum delay function is used to compute early mode required arrival times.
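The backward propagation of required arrival times can be sketched in the same way. As before, this is a simplified illustration that assumes scalar values instead of distributions; the function name, data layout, and example graph are hypothetical.

```python
from graphlib import TopologicalSorter


def propagate_required_times(edges, asserted_rats, late=True):
    """Backward-propagate scalar required arrival times.

    edges:         {(source, sink): (min_delay, max_delay)}  (assumed layout)
    asserted_rats: {primary_output: asserted required arrival time}
    """
    merge = min if late else max   # late mode keeps the tightest (smallest) RAT
    which = 1 if late else 0       # late mode uses the maximum delay
    fanout = {}
    for source, sink in edges:
        fanout.setdefault(source, set()).add(sink)
        fanout.setdefault(sink, set())
    rat = dict(asserted_rats)
    # Treating fan-out sets as dependencies makes static_order() visit
    # every sink before the sources that drive it.
    for node in TopologicalSorter(fanout).static_order():
        if node not in rat:
            rat[node] = merge(rat[snk] - edges[(node, snk)][which]
                              for snk in fanout[node])
    return rat


# Hypothetical graph: a drives b, which fans out to outputs c and d.
RAT_EDGES = {("a", "b"): (1, 2), ("b", "c"): (1, 3), ("b", "d"): (2, 2)}
late_rat = propagate_required_times(RAT_EDGES, {"c": 10, "d": 8}, late=True)
early_rat = propagate_required_times(RAT_EDGES, {"c": 10, "d": 8}, late=False)
print(late_rat["a"], early_rat["a"])  # prints: 4 8
```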
The difference between the arrival time and required arrival time at a node (i.e., RAT−AT in late mode, and AT−RAT in early mode) is referred to as slack. A positive slack implies that the current arrival time at a given node meets all downstream timing constraints, and a negative slack implies that the arrival time fails at least one such downstream timing constraint. A timing point may include multiple such AT, RAT, and slew values, each denoted with a separate tag, in order to represent data associated with different clock domains (i.e., launched by different clock signals), or for the purpose of distinguishing information for a specific subset of an entire fan-in cone or fan-out cone.
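The slack definition above amounts to a sign convention that depends on the analysis mode. A small sketch, assuming scalar AT and RAT values and hypothetical helper names:

```python
def slack(at, rat, late=True):
    """Slack is RAT - AT in late mode and AT - RAT in early mode."""
    return rat - at if late else at - rat


def failing_nodes(ats, rats, late=True):
    """Return the nodes whose slack is negative, sorted by name."""
    return sorted(node for node in ats
                  if node in rats and slack(ats[node], rats[node], late) < 0)


# Hypothetical values for two timing points.
example_ats = {"x": 5, "y": 9}
example_rats = {"x": 7, "y": 8}
late_failures = failing_nodes(example_ats, example_rats, late=True)
early_failures = failing_nodes(example_ats, example_rats, late=False)
print(late_failures, early_failures)  # prints: ['y'] ['x']
```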
Clock skew generally refers to the difference between ideal and actual arrival times at clock inputs. In the case of a setup test, for example, a data signal may launch slightly later than the ideal clock reference time, and/or the capture clock signal may arrive slightly earlier than the ideal clock reference. In such a circumstance, these differences between actual and ideal clock arrival times (clock skew) will rob from the effective cycle time available for a signal to propagate on a given latch-to-latch path. Similarly, for a hold test, the case where the launching latch clock arrives earlier than ideal and/or the capture latch clock arrives later than ideal can force the need for additional padding on a given latch-to-latch path so as to avoid an early mode race condition.
It is therefore desirable during SSTA to diagnose the impact on the slack of a failing path due to clock skew. This information can then be used by a designer to determine whether additional skew optimization in the clock tree will be useful to close on SSTA, and to determine how to best tune the clock tree arrival time functions in order to minimize clock skew impacts on the slack.
During SSTA, several factors make it difficult to compute the impact of the clock skew on a downstream slack. First, the test in question may receive a common path pessimism removal (CPPR) credit (i.e., an adjustment to the test slack in order to recover excess pessimism) for the physically common portion of the launch and capture paths in question. A proper analysis of clock skew therefore needs to factor out the portion of the capture/launch clock arrival time difference which is credited during CPPR analysis. Second, due to statistical root-sum-square (RSS) treatment of independently random delay along a path, the analysis of a complete launching and capture path pair is required in order to determine by how much the random delay specific to the non-common clock tree impacts downstream slack. Finally, in order to improve the turn-around-time efficiency, for diagnostic purposes, it is desirable to maintain clock skew measurements in-situ, i.e., within steps comprising existing SSTA sign-off flows.
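The second factor can be made concrete with a small numerical sketch. It assumes that each independently random delay is summarized by a single standard deviation and that independent contributions combine by root-sum-square; the function names and values are hypothetical. The point is that the slack impact of the random delay in the non-common clock tree is not its own sigma, but the change it causes in the RSS total of the whole path pair.

```python
import math


def rss(*sigmas):
    """Root-sum-square combination of independent random contributions."""
    return math.sqrt(sum(s * s for s in sigmas))


def random_skew_contribution(clock_sigmas, other_sigmas):
    """Increase in the RSS total caused by the non-common clock sigmas."""
    return rss(*clock_sigmas, *other_sigmas) - rss(*other_sigmas)


# Hypothetical sigmas: 3 units in the non-common clock tree,
# 4 units in the latch-to-latch data path.
contribution = random_skew_contribution([3.0], [4.0])
print(contribution)  # prints: 1.0
```

With these numbers the non-common clock randomness costs 1 unit rather than 3, which is why the clock paths cannot be analyzed in isolation from the rest of the path pair.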
Conventional techniques for computing the clock skew impact on slack fail to account properly for either CPPR credit in the common clock portion, or for the impact of RSS credit in the unique portion of a given launching and capturing path pair, i.e., the pair of launching and capturing paths meeting at a test. One such prior art technique (hereinafter referred to as “prior art technique #1”) simply involves computing clock skew impact on slack as the early versus the late arrival time difference among the launch and capture clock arrival times. This method implicitly assumes zero CPPR credit for the common clock tree and zero RSS credit for the random delay along the non-common clock tree. Therefore, generally, it produces a pessimistic bound on the clock skew impact on the slack.
Another technique (hereinafter referred to as “prior art technique #2”) computes the clock skew impact on slack as the late vs. late (or early vs. early) arrival time difference among launch and capture clock arrival times. The method implicitly assumes full CPPR credit among launch and capture clocks, and therefore produces an optimistic bound on the actual impact of clock skew on the slack.
An example of the shortcomings of prior art techniques can be illustrated by considering a scenario where all random delays are zero (note that even though random delay is zero, early and late mean delays may still be different for the same circuit elements due to factors such as simultaneous switching, IR drop effects, coupling, or other systematic sources of variation).
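Prior art technique #1 computes the clock skew impact on setup test slack as simply the early vs. late difference in launch and capture clock arrival times, namely,

$D^{E}_{\text{COMMON}}$ = Common clock early delay
$D^{L}_{\text{COMMON}}$ = Common clock late delay
$D^{E}_{\text{CAPTURE}}$ = Unique clock capture path mean early delay
$D^{L}_{\text{LAUNCH}}$ = Unique clock launch path mean late delay

$$
\begin{aligned}
\text{Clock skew impact on slack (prior art technique \#1)}
&= \text{EARLY\_AT}_{\text{CAPTURE}} - \text{LATE\_AT}_{\text{LAUNCH}} \\
&= (D^{E}_{\text{COMMON}} + D^{E}_{\text{CAPTURE}}) - (D^{L}_{\text{COMMON}} + D^{L}_{\text{LAUNCH}}) \\
&= (D^{E}_{\text{COMMON}} - D^{L}_{\text{COMMON}}) + (D^{E}_{\text{CAPTURE}} - D^{L}_{\text{LAUNCH}})
\end{aligned}
$$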
Prior art technique #1 produces a pessimistic estimate for clock skew impact on slack by including the early vs. late difference in the common portion of the clock tree (for which CPPR should provide credit to the actual measured test slack).
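Prior art technique #2 computes the clock skew impact on setup test slack as the difference between late vs. late capture/launch arrival times, i.e., with $D^{L}_{\text{CAPTURE}}$ denoting the unique clock capture path mean late delay,

$$
\begin{aligned}
\text{Clock skew impact on slack (prior art technique \#2)}
&= \text{LATE\_AT}_{\text{CAPTURE}} - \text{LATE\_AT}_{\text{LAUNCH}} \\
&= (D^{L}_{\text{COMMON}} + D^{L}_{\text{CAPTURE}}) - (D^{L}_{\text{COMMON}} + D^{L}_{\text{LAUNCH}}) \\
&= D^{L}_{\text{CAPTURE}} - D^{L}_{\text{LAUNCH}}
\end{aligned}
$$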
The aforementioned approach produces an optimistic estimate of clock skew impact on slack by ignoring variation effects (i.e., early vs. late delay differences) within the non-common portion of the clock tree. A practitioner skilled in the art will realize that an optimistic result would also be obtained by comparing early vs. early arrival times in a similar fashion.
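A small numerical sketch may help compare the two prior art estimates in this zero-random-delay scenario. All delay values are hypothetical, and the sign convention is capture arrival time minus launch arrival time, as in the setup test expressions above. The third function is an assumption made for illustration only: it supposes that CPPR credits the entire early vs. late difference on the common path, leaving just the unique-path term.

```python
def skew_technique_1(de_common, dl_common, de_capture, dl_launch):
    """Early capture AT minus late launch AT (no CPPR credit)."""
    return (de_common + de_capture) - (dl_common + dl_launch)


def skew_technique_2(dl_common, dl_capture, dl_launch):
    """Late capture AT minus late launch AT (common path cancels)."""
    return (dl_common + dl_capture) - (dl_common + dl_launch)


def skew_unique_only(de_capture, dl_launch):
    """Assumed reference: early vs. late difference on unique paths only."""
    return de_capture - dl_launch


# Hypothetical mean delays, in arbitrary time units.
t1 = skew_technique_1(de_common=90, dl_common=100, de_capture=45, dl_launch=60)
t2 = skew_technique_2(dl_common=100, dl_capture=50, dl_launch=60)
ref = skew_unique_only(de_capture=45, dl_launch=60)
print(t1, ref, t2)  # prints: -25 -15 -10
```

With these values, technique #1 reports a larger slack penalty than the unique-path reference because it also charges the common-path early vs. late difference, while technique #2 reports a smaller penalty because it ignores the early vs. late difference on the unique capture path.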
Another example of the shortcomings of prior art techniques can be illustrated by considering an example wherein delay includes only an independently random component.
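Applying “prior art technique #1” produces the following result for clock skew impact on setup test slack:

$R^{E}_{\text{COMMON}}$ = Common clock early random delay
$R^{L}_{\text{COMMON}}$ = Common clock late random delay
$R^{E}_{\text{CAPTURE}}$ = Unique clock capture path early random delay
$R^{L}_{\text{LAUNCH}}$ = Unique clock launch path late random delay

$$
\begin{aligned}
\text{Clock skew impact on slack (prior art technique \#1)}
&= \text{EARLY\_AT}_{\text{CAPTURE}} - \text{LATE\_AT}_{\text{LAUNCH}} \\
&= (R^{E}_{\text{COMMON}} + R^{E}_{\text{CAPTURE}}) - (R^{L}_{\text{COMMON}} + R^{L}_{\text{LAUNCH}}) \\
&= (R^{E}_{\text{COMMON}} - R^{L}_{\text{COMMON}}) + (R^{E}_{\text{CAPTURE}} - R^{L}_{\text{LAUNCH}})
\end{aligned}
$$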
The aforementioned method again produces a pessimistic estimate for clock skew impact on slack by failing to include CPPR credit for the common clock path, and also failing to include RSS credit when additional random delay is present in the latch-to-latch data path.
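Applying “prior art technique #2” produces the following estimate for clock skew impact on setup test slack, with $R^{L}_{\text{CAPTURE}}$ denoting the unique clock capture path late random delay:

$$
\begin{aligned}
\text{Clock skew impact on slack (prior art technique \#2)}
&= \text{LATE\_AT}_{\text{CAPTURE}} - \text{LATE\_AT}_{\text{LAUNCH}} \\
&= (R^{L}_{\text{COMMON}} + R^{L}_{\text{CAPTURE}}) - (R^{L}_{\text{COMMON}} + R^{L}_{\text{LAUNCH}}) \\
&= R^{L}_{\text{CAPTURE}} - R^{L}_{\text{LAUNCH}}
\end{aligned}
$$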
The above approach produces an incorrect result because it fails to account for early vs. late random delay differences in the non-common clock paths, and furthermore, it fails to account for RSS credit when additional random delay is present in the latch-to-latch data path. As one skilled in the art will readily observe, an incorrect result would also be obtained by comparing early vs. early arrival times in a similar fashion.
Other prior art approaches address the statistical aspect of clock skew impact on slack. One is described, for instance, in the paper entitled “Statistical Clock Skew Modeling with Data Delay Variations” to Harris and Naffziger et al., published in the IEEE Transactions on Very Large Scale Integration (VLSI) Systems, December 2001. That paper sets forth a method for measuring clock skew wherein a Monte Carlo analysis is performed in order to simulate the impact of skew on circuit performance across a wide range of process parameter settings. However, such approaches are very time consuming due to the need to perform Monte Carlo simulations, and are also not amenable to incremental re-analysis in response to design changes.