For integrated circuits (e.g., VLSI chips) to work properly, the signals traveling along their gates and interconnects must be properly timed, and several factors are known to cause timing variations. For example, variations in manufacturing process parameters (such as variations in interconnect diameter, gate quality, etc.) can cause timing parameters to deviate from their designed values. In low-power applications, lower supply voltages can cause increased susceptibility to noise and increased timing variations. Densely integrated elements and non-ideal on-chip power dissipation can cause “hot spots” on a chip, which can also cause excessive timing variations.
A classical approach to timing analysis is to analyze each signal path in a circuit and determine the worst case timing. However, this approach produces timing predictions that are often too pessimistic and grossly conservative. As a result, statistical timing analysis (STA, also referred to as statistical static timing analysis or SSTA)—which characterizes timing delays as statistical random variables—is often used to obtain more realistic timing predictions. By modeling each individual delay as a random variable, the accumulated delays over each path of the circuit will be represented by a statistical distribution. As a result, circuit designers can design and optimize chips in accordance with acceptable likelihoods rather than worst-case scenarios.
In STA, a circuit is modeled by a directed acyclic graph (DAG) known as a timing graph, wherein each delay source (either a logic gate or an interconnect) is represented as a node. Each node connects to other nodes through input and output edges. Nodes and edges are referred to as delay elements. Each node has a node delay, that is, the delay incurred in the corresponding logic gate or interconnect segment. Similarly, each edge has an edge delay, a signal arrival time that represents the cumulative timing delay up to and including the node that feeds into the edge. Each edge delay has a path history: the set of node delays through which a signal travels before arriving at this edge. Each delay element is then modeled as a random variable, which is characterized by its probability density function (pdf) and cumulative distribution function (cdf). The purpose of STA is then to estimate the edge delay distribution at the output(s) of a circuit based on (known or assumed) internal node delay distributions.
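As a rough illustration of the timing graph just described, the following Python sketch stores node delays and fan-in edges and recovers a path history. The class name `TimingGraph`, its methods, and the (mean, standard deviation) representation of a node delay are all hypothetical choices made for this example, not part of any particular STA tool.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class TimingGraph:
    """Toy timing graph: nodes are delay sources, edges carry arrival times."""
    node_delay: dict = field(default_factory=dict)  # node -> (mean, std) of its delay
    fanin: dict = field(default_factory=dict)       # node -> list of predecessor nodes

    def add_node(self, name, mean, std):
        self.node_delay[name] = (mean, std)
        self.fanin.setdefault(name, [])

    def add_edge(self, src, dst):
        # The output edge of `src` becomes an input edge of `dst`.
        self.fanin[dst].append(src)

    def topological_order(self):
        # TopologicalSorter expects {node: predecessors} and raises
        # CycleError if the graph is not acyclic.
        return list(TopologicalSorter(self.fanin).static_order())

    def path_history(self, node):
        """Nodes whose delays contribute to the output edge delay of `node`."""
        seen, stack = set(), [node]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.fanin[current])
        return seen


# Hypothetical four-node circuit: a and b feed c, which feeds d.
graph = TimingGraph()
for name in "abcd":
    graph.add_node(name, mean=1.0, std=0.1)
graph.add_edge("a", "c")
graph.add_edge("b", "c")
graph.add_edge("c", "d")
print(graph.topological_order())
print(sorted(graph.path_history("d")))
```

Storing predecessors rather than successors is a convenience here: a forward traversal only ever needs the input edges of the node being processed.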
The three primary approaches to STA are Monte Carlo simulation, path-based STA, and block-based STA. As its name implies, Monte Carlo simulation mechanically computes the statistical distribution of edge delays by analyzing all (or most) possible scenarios for the internal node delays. While this generally yields an accurate timing distribution, it is computationally very expensive, and is therefore often impractical to use.
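A minimal sketch of the Monte Carlo approach, assuming a hypothetical three-node circuit in which the outputs of nodes a and b merge at node c, and assuming independent Gaussian node delays with made-up means and standard deviations. The function name `monte_carlo_output_delay` is likewise illustrative only.

```python
import random
import statistics


def monte_carlo_output_delay(num_samples=20000, seed=0):
    """Estimate mean and std of the output arrival time by random sampling."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        # Draw one scenario for every node delay (assumed independent Gaussians).
        a = rng.gauss(1.0, 0.1)
        b = rng.gauss(1.2, 0.2)
        c = rng.gauss(0.5, 0.05)
        # The later of the two input arrivals passes through node c.
        samples.append(max(a, b) + c)
    return statistics.fmean(samples), statistics.stdev(samples)


mean, std = monte_carlo_output_delay()
print(f"estimated output delay: mean={mean:.3f}, std={std:.3f}")
```

Every sample requires a full evaluation of the circuit, which is why the cost becomes prohibitive for large designs and tight accuracy targets.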
Path-based STA attempts to identify some subset of paths (i.e., series of nodes and edges) whose time constraints are statistically critical. Unfortunately, path-based STA has a computational complexity that grows exponentially with the circuit size, and thus it too is difficult to practically apply to many modern circuits.
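To give a feel for why enumerating paths scales poorly, the sketch below counts source-to-sink paths in a hypothetical chain of reconvergent "diamond" stages. The graph construction (`diamond_chain`) and the counting helper (`count_paths`) are assumptions made for illustration; real circuits have less regular structure.

```python
from functools import lru_cache


def diamond_chain(stages):
    """Fan-out map for a chain where each stage splits in two and reconverges."""
    fanout = {}
    for s in range(stages):
        fanout[f"n{s}"] = (f"u{s}", f"v{s}")
        fanout[f"u{s}"] = (f"n{s + 1}",)
        fanout[f"v{s}"] = (f"n{s + 1}",)
    return fanout


def count_paths(fanout, source, sink):
    """Number of distinct source-to-sink paths in a DAG."""
    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == sink:
            return 1
        return sum(paths_from(nxt) for nxt in fanout.get(node, ()))
    return paths_from(source)


for stages in (5, 10, 20):
    nodes = 3 * stages + 1
    paths = count_paths(diamond_chain(stages), "n0", f"n{stages}")
    print(f"{nodes} nodes -> {paths} paths")
```

In this toy structure the node count grows linearly with the number of stages while the path count doubles with every stage.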
Block-based STA, which has largely been developed owing to the shortcomings of Monte Carlo and path-based STA, uses progressive computation: statistical timing analysis is performed block by block in the forward direction in the circuit timing graph without looking back at the path history, by use of only an ADD operation and a MAX operation:
ADD: When an input edge delay X propagates through a node delay Y, the output edge delay will be Z=X+Y.
MAX: When two edge delays X and Y merge in a node, a new edge delay Z=MAX(X,Y) will be formulated before the node delay is added.
Note that the MAX operation can also be modeled as a MIN operation, since MIN(X,Y)=−MAX(−X,−Y). Thus, while a MIN operation can also be relevant in STA, it is often simpler to use only one of the MAX and MIN operators. For the sake of simplicity, throughout this document, the MAX operator will be used, with the understanding that the same results can be adapted to the MIN operator.
With the two operators ADD and MAX, the computational complexity of block-based STA grows linearly (rather than exponentially) with respect to the circuit size, which generally results in manageable computations. The computations are further accelerated by assuming that all timing variables in a circuit follow the Gaussian (normal) distribution: since a linear combination of normally distributed variables is also normally distributed, the correlation relations between the delays along a circuit path are efficiently preserved.
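The forward, block-by-block traversal described above can be sketched as follows. This is only an illustration under stated assumptions: `propagate`, `min_via_max`, and the dictionary layout are hypothetical names, and plain numbers stand in for the random delays so that ADD and MAX reduce to `+` and `max`. A statistical version would pass in operators that act on distributions instead.

```python
from functools import reduce
from graphlib import TopologicalSorter


def propagate(fanin, node_delay, add, max_, zero):
    """Single forward pass over the timing graph.

    Each node is visited once, and only the arrival times on its input
    edges are consulted, never the path history behind them.
    """
    arrival = {}
    for node in TopologicalSorter(fanin).static_order():
        inputs = [arrival[pred] for pred in fanin.get(node, ())]
        merged = reduce(max_, inputs) if inputs else zero  # MAX over merging edges
        arrival[node] = add(merged, node_delay[node])      # ADD the node delay
    return arrival


def min_via_max(x, y, max_=max):
    """MIN expressed through MAX: MIN(X, Y) = -MAX(-X, -Y)."""
    return -max_(-x, -y)


# Hypothetical circuit: a and b merge at c, which feeds d.
fanin = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"]}
node_delay = {"a": 1.0, "b": 1.5, "c": 0.5, "d": 0.25}
arrival = propagate(fanin, node_delay, add=lambda x, y: x + y, max_=max, zero=0.0)
print(arrival["d"])
print(min_via_max(1.0, 1.5))
```

Because the loop body does a bounded amount of work per input edge, the total work is proportional to the number of nodes plus edges, which is the linear growth noted above.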
To illustrate, in the ADD operation ADD(X,Y)=Z, if both input delay elements X and Y are Gaussian random variables, then the delay Z=X+Y will also be a Gaussian random variable whose mean and variance are:
$$\text{Mean:}\quad \mu_Z=\mu_X+\mu_Y \tag{1}$$
$$\text{Variance:}\quad \sigma_Z^2=\sigma_X^2+\sigma_Y^2+2\,\mathrm{cov}(X,Y) \tag{2}$$
where $\mathrm{cov}(X,Y)=E\{(X-\mu_X)(Y-\mu_Y)\}$ is the covariance between X and Y.
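Equations (1) and (2) translate directly into code. The helper name `add_gaussian` and the numeric inputs below are hypothetical; the sketch simply evaluates the two formulas.

```python
def add_gaussian(mu_x, var_x, mu_y, var_y, cov_xy=0.0):
    """Mean and variance of Z = X + Y per Equations (1) and (2)."""
    mu_z = mu_x + mu_y
    var_z = var_x + var_y + 2.0 * cov_xy
    return mu_z, var_z


# Independent delays: the variances simply add.
print(add_gaussian(1.0, 0.04, 0.5, 0.01))
# Positively correlated delays: the covariance term widens the result.
print(add_gaussian(1.0, 0.04, 0.5, 0.01, cov_xy=0.01))
```

Ignoring a positive covariance therefore underestimates the spread of the accumulated delay.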
In contrast, in the MAX operation Z=MAX(X,Y), MAX is a nonlinear operator: even if the input delays X and Y are Gaussian random variables, Z will not (usually) have a Gaussian distribution. However, as shown in C. Clark, “The greatest of a finite set of random variables,” Operations Research, pp. 145-162, March 1961, if X and Y are Gaussian and statistically independent, the first and second moments of the distribution of MAX(X,Y) are defined by:
$$\text{Mean:}\quad \mu_Z=\mu_X Q+\mu_Y(1-Q)+\theta P \tag{3}$$
$$\text{Variance:}\quad \sigma_Z^2=(\mu_X^2+\sigma_X^2)Q+(\mu_Y^2+\sigma_Y^2)(1-Q)+(\mu_X+\mu_Y)\theta P-\mu_Z^2 \tag{4}$$
where $\theta=\sigma_{X-Y}$. P and Q are the pdf and cdf of the standard Gaussian distribution evaluated at $\lambda=\mu_{X-Y}/\sigma_{X-Y}$:
$$P(\lambda)=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{\lambda^2}{2}\right),\qquad Q(\lambda)=\int_{-\infty}^{\lambda}P(x)\,dx \tag{5}$$
It is then possible to define a Gaussian approximation for the non-Gaussian Z=MAX(X,Y). In C. Visweswariah, K. Ravindran, and K. Kalafala, “First-order parameterized block-based statistical timing analysis,” TAU'04, February 2004, Z=MAX(X,Y) is approximated by a Gaussian random variable $\hat{Z}$ which is a linear combination of X, Y, and an additional independent Gaussian random variable Δ:
$$Z=\mathrm{MAX}(X,Y)\approx QX+(1-Q)Y+\Delta=\hat{Z} \tag{6}$$
where Q is defined in the foregoing Equation (5), and is referred to as “tightness.” The purpose of the additional random variable Δ is to ensure that the first and second moments (the mean and the variance) of $\hat{Z}$ match those of Z as specified in the foregoing Equations (3) and (4).
In the foregoing Clark reference, it was shown that if W is a Gaussian random variable, then the cross-covariance between W and Z=MAX(X,Y) can be found analytically as:
$$\mathrm{cov}(W,Z)=Q\,\mathrm{cov}(W,X)+(1-Q)\,\mathrm{cov}(W,Y) \tag{7}$$
Substituting Equation (6), and taking Δ to be independent of W:
$$\mathrm{cov}(W,\hat{Z})=Q\,\mathrm{cov}(W,X)+(1-Q)\,\mathrm{cov}(W,Y)=\mathrm{cov}(W,Z)$$
Hence, a convenient property of the approximator $\hat{Z}$ is that the cross-covariance between Z and another timing variable W is preserved when the non-Gaussian Z=MAX(X,Y) is replaced by the Gaussian random variable $\hat{Z}$. Thus, the use of the Gaussian random variable $\hat{Z}$ as an approximation to the non-Gaussian Z=MAX(X,Y) allows preservation of linearity.
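The moment formulas of Equations (3) through (5) and the tightness-weighted approximation of Equation (6) can be sketched as below for independent Gaussian inputs. The function names are hypothetical. The way the mean and variance of Δ are derived here (by subtracting the contribution of QX+(1−Q)Y from the matched moments, with Δ independent of X and Y) is one plausible reading of the moment-matching requirement, not a formula quoted from the cited references.

```python
import math


def std_normal_pdf(lam):
    """P(lambda) in Equation (5)."""
    return math.exp(-0.5 * lam * lam) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(lam):
    """Q(lambda) in Equation (5), written with the error function."""
    return 0.5 * (1.0 + math.erf(lam / math.sqrt(2.0)))


def clark_max(mu_x, var_x, mu_y, var_y):
    """Mean, variance, and tightness Q of MAX(X, Y) for independent Gaussians."""
    theta = math.sqrt(var_x + var_y)  # std of X - Y; must be nonzero here
    lam = (mu_x - mu_y) / theta
    p, q = std_normal_pdf(lam), std_normal_cdf(lam)
    mean = mu_x * q + mu_y * (1.0 - q) + theta * p                 # Equation (3)
    second = ((mu_x ** 2 + var_x) * q + (mu_y ** 2 + var_y) * (1.0 - q)
              + (mu_x + mu_y) * theta * p)
    return mean, second - mean ** 2, q                              # Equation (4)


def tightness_approximation(mu_x, var_x, mu_y, var_y):
    """Q plus the mean and variance of Delta so that Z_hat matches (3) and (4).

    Assumption: Delta is independent of X and Y. Negative variances are
    not guarded against in this sketch.
    """
    mean_z, var_z, q = clark_max(mu_x, var_x, mu_y, var_y)
    mean_delta = mean_z - q * mu_x - (1.0 - q) * mu_y
    var_delta = var_z - q * q * var_x - (1.0 - q) ** 2 * var_y
    return q, mean_delta, var_delta


print(clark_max(1.0, 0.01, 1.2, 0.04))
print(tightness_approximation(1.0, 0.01, 1.2, 0.04))
```

When one input dominates (Q close to 0 or 1), the approximation collapses toward that input and Δ contributes little.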
Unfortunately, one flaw of block-based STA is that its underlying assumption of a simple linear (additive) combination of sequential path delays is often incorrect. The delays of elements in a circuit can be correlated due to various phenomena, two common ones being known as global variations and path reconvergence. Global variations are effects that impact a number of elements simultaneously, such as inter- or intra-die spatial correlations (gate channel length variations, wire geometry variations, etc.), temperature or supply voltage fluctuations, etc. These generate global correlations between delay elements, wherein all globally correlated elements are simultaneously affected. An example of the effect of global variations is schematically depicted in FIG. 1(a), wherein node delays X, Y, and Z all depend on some influence g.
Path reconvergence occurs where elements share a common element or path along their past path histories owing to path intersections, and this leads to path correlation (local correlation of elements along some section of a path). An example of the effect of path correlation is schematically depicted in FIG. 1(b), wherein edge delays X and Y both depend on node delay p.
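The path correlation described above can be made concrete with a small sketch. It assumes two edge delays X = p + a and Y = p + b that share a node delay p, with p, a, and b independent and Gaussian. The function names and the variances used are hypothetical.

```python
import math
import random
import statistics


def reconvergent_correlation(var_p, var_a, var_b):
    """Correlation of X = p + a and Y = p + b for independent p, a, b.

    The shared term p contributes cov(X, Y) = var(p).
    """
    return var_p / math.sqrt((var_p + var_a) * (var_p + var_b))


def sampled_correlation(std_p, std_a, std_b, num_samples=20000, seed=0):
    """Same quantity estimated by sampling zero-mean delay deviations."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(num_samples):
        p = rng.gauss(0.0, std_p)  # shared node delay deviation
        xs.append(p + rng.gauss(0.0, std_a))
        ys.append(p + rng.gauss(0.0, std_b))
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (num_samples - 1)
    return cov / (statistics.stdev(xs) * statistics.stdev(ys))


print(reconvergent_correlation(1.0, 1.0, 1.0))
print(sampled_correlation(1.0, 1.0, 1.0))
```

The larger the shared portion of the path history relative to the unshared portions, the closer the correlation gets to 1.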
The underlying problem of global and path correlation is that while the output of the MAX operator can be directly approximated by a Gaussian distribution having its first two moments matching those of Equations (3) and (4), this approach fails to retain any correlation information after the MAX operation is performed. In short, the MAX operator destroys correlation information which may be critical to accurate timing prediction. Several approaches have been proposed for dealing with global and path correlation, but the field of timing analysis is lacking in methods for accounting for both of these correlations in an accurate and computationally efficient manner.
One approach to compensating for global variations is to use a canonical timing model (C. Visweswariah, K. Ravindran, and K. Kalafala, “First-order parameterized block-based statistical timing analysis,” TAU'04, February 2004; A. Agarwal, D. Blaauw, and V. Zolotov, “Statistical timing analysis for intra-die process variations with spatial correlations,” ICCAD'03, pp. 900-907, November 2003; H. Chang and S. S. Sapatnekar, “Statistical timing analysis considering spatial correlations using a single PERT-like traversal,” ICCAD'03, pp. 621-625, November 2003). In the canonical timing model, each of the node delays is represented as a first-order (linear) summation of three terms:
$$n_i=\mu_i+\alpha_i R_i+\sum_{j=1}\beta_{i,j}G_j \tag{8}$$
where $n_i$ (i=1, 2, . . . ) is the random variable representing the ith node delay in the timing graph; $\mu_i$ is the expected or nominal value of $n_i$; $R_i$ is the local variation (also called node variation), a zero-mean, unity variance Gaussian random variable representing the localized statistical uncertainties of $n_i$; $G_j$ represents the jth global variation, and is also modeled as a zero-mean, unity variance Gaussian random variable; $\{R_i\}$ and $\{G_j\}$ are additionally assumed to be mutually independent; and the weight parameters $\alpha_i$ (named node sensitivity or local sensitivity) and $\beta_{i,j}$ (named global sensitivity) are deterministic constants, explicitly expressing the amount of dependence of $n_i$ on each of the corresponding independent random variables.
With this canonical representation, the variance of a node delay $n_i$ and its correlation (covariance) with another node delay $n_k$ can be evaluated as:
Variance:
$$\sigma_{n_i}^2=E\{(n_i-\mu_i)^2\}=\alpha_i^2+\sum_j\beta_{i,j}^2 \tag{9}$$
Covariance:
$$\mathrm{cov}(n_i,n_k)=E\{(n_i-\mu_i)(n_k-\mu_k)\}=\sum_j\beta_{i,j}\beta_{k,j} \tag{10}$$
The linear/first-order canonical timing model (8) provides an elegant way to deal with the correlated gate/wire delays arising from global variations, as discussed in the references noted above. The delay computed from the canonical timing model (8) will be Gaussian since it is a linear combination of Gaussian random variables, which may be acceptable for cases when the variation is small and the nonlinear relationship between the gate/wire delay and the global variation sources is not significant.
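A small sketch of the canonical form of Equation (8), with the variance and covariance of Equations (9) and (10) computed from the sensitivities. The class name `CanonicalDelay`, its field names, and the numeric sensitivities are hypothetical. The covariance method assumes two distinct nodes, whose local variations are independent of each other.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalDelay:
    """n_i = mu + alpha * R_i + sum_j beta[j] * G_j, as in Equation (8)."""
    mu: float      # nominal delay
    alpha: float   # local (node) sensitivity
    beta: tuple    # global sensitivities, one per global variation G_j

    def variance(self):
        # Equation (9): R_i and every G_j have unit variance and are independent.
        return self.alpha ** 2 + sum(b * b for b in self.beta)

    def covariance(self, other):
        # Equation (10): only the shared global variations contribute.
        return sum(bi * bk for bi, bk in zip(self.beta, other.beta))

    def correlation(self, other):
        return self.covariance(other) / math.sqrt(self.variance() * other.variance())


# Two hypothetical node delays that share two global variation sources.
n1 = CanonicalDelay(mu=1.0, alpha=0.3, beta=(0.4, 0.0))
n2 = CanonicalDelay(mu=2.0, alpha=0.1, beta=(0.2, 0.5))
print(n1.variance(), n1.covariance(n2), n1.correlation(n2))
```

Keeping the sensitivities explicit means correlations never have to be stored pairwise; they can be recomputed from the coefficient vectors whenever needed.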
However, when the variation becomes larger (e.g., as circuit technology scales down to nanometer levels), the nonlinearity of the gate/wire delay as a function of the global variations becomes increasingly significant, and the delay cannot be accurately approximated by the linear canonical timing model. In these cases, even where the global variations $G_j$, local variations $R_i$, etc. are modeled as Gaussian random variables, the gate/wire delays $n_i$ are generally not Gaussian-distributed random variables. Thus, there is a need for a timing model which more accurately represents the nonlinear relationship between the gate/wire delay and the global variation sources, and will therefore yield more accurate STA for smaller and/or higher-speed deep-sub-micron integrated circuits, where relative magnitudes of global variations are often larger. Given that the trend in circuit fabrication is to ever-increasing speed and ever-decreasing size, there is clearly a pressing need for accurate methods of statistical timing analysis which compensate for both global and path correlation, and which are computationally efficient so that rapid design and testing is feasible.
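As a purely illustrative example of why a nonlinear dependence breaks the Gaussian assumption, suppose (hypothetically) that a delay depends on a single standard Gaussian global variation G through a quadratic term, D = μ + βG + γG². The quadratic form and the function name below are assumptions made for this sketch and are not taken from the text above. The skewness of D follows from the standard Gaussian moments E{G²}=1, E{G⁴}=3, and E{G⁶}=15, and any nonzero skewness rules out a Gaussian distribution.

```python
import math


def quadratic_delay_skewness(beta, gamma):
    """Skewness of D = mu + beta*G + gamma*G**2 for standard Gaussian G.

    Variance:             beta^2 + 2*gamma^2
    Third central moment: 6*beta^2*gamma + 8*gamma^3
    """
    variance = beta ** 2 + 2.0 * gamma ** 2
    third_central = 6.0 * beta ** 2 * gamma + 8.0 * gamma ** 3
    return third_central / variance ** 1.5


# Linear model: symmetric, consistent with a Gaussian delay.
print(quadratic_delay_skewness(beta=1.0, gamma=0.0))
# Growing quadratic sensitivity: increasingly skewed, hence non-Gaussian.
for gamma in (0.05, 0.2, 0.5):
    print(gamma, quadratic_delay_skewness(beta=1.0, gamma=gamma))
```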