1. Field of the Invention
The present invention generally relates to the analysis of failure conditions in complex systems, and more specifically to a method of computing the failure probability for a system having multiple failure modes, particularly electrical circuits such as memory cells.
2. Description of the Related Art
Integrated circuits are used for a wide variety of electronic applications, from simple devices such as wristwatches to the most complex computer systems. Although great care is taken in the design and fabrication of integrated circuits, there is still a small percentage of electrical components that can fail for various reasons including process variations, defective designs or incomplete testing. Even if the percentage of failing components is very small, it may still equate to a significant number of absolute failures when considering components having a very large quantity of circuit elements. For example, an integrated circuit (IC) chip for a state-of-the-art static random-access memory (SRAM) array may have millions of memory cells (bits). Fails are rare in such memory designs but, unlike logic circuitry, a single or a few failing memory cells can lead to significant yield loss.
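As a rough illustration of why a very small per-cell failure rate still matters, the following Python sketch computes the expected number of failing cells and the chance that an array has no failing cell at all. It assumes that cells fail independently and that no redundancy or error correction is available; the function names and the example figures (a 4-million-cell array with a one part-per-million cell failure probability) are hypothetical and are not taken from any particular design.

```python
import math


def expected_failing_cells(p_cell, n_cells):
    """Expected number of failing cells in an array of n_cells cells."""
    return p_cell * n_cells


def array_yield_no_repair(p_cell, n_cells):
    """Probability that every cell passes, assuming independent fails
    and no redundancy or error correction (an illustrative assumption)."""
    # (1 - p)^N computed via log1p for accuracy when p is very small.
    return math.exp(n_cells * math.log1p(-p_cell))


if __name__ == "__main__":
    p = 1e-6          # hypothetical per-cell failure probability
    n = 4_000_000     # hypothetical number of cells in the array
    print(expected_failing_cells(p, n))
    print(array_yield_no_repair(p, n))
```

Under these assumptions the array would be expected to contain about four failing cells, and fewer than two percent of such arrays would be free of fails, which is why redundancy and accurate estimation of very small failure probabilities are both needed.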
Means have been devised to mitigate the effects of component failures, such as the provision of error-correcting circuits or redundant circuits which enable recovery for a limited number of fails. However, with designers aiming for less than one part-per-million fails in memory designs, it is increasingly important to understand the failure mechanisms, and take into consideration the impact of process variation parameters on yield and design considerations. This challenge is becoming more difficult as process technology scales to the deep-submicron regime.
In the case of memory circuits, designers are particularly interested in process variation within the transistors of the memory cells. For example, variability and mismatch between these devices can lead to fails. Traditional sensitivity analysis techniques such as FORM (first order reliability method) are useful in estimating the probability of failure in memory systems when fails are attributed to a single failure mode (single-fail regions), but these techniques become problematic for more complicated systems with multiple failure modes (multi-fail regions).
Convex hull (or convex envelope) analysis uses a set of points in the space defined by the parameters of interest to construct a closed fail boundary for the system. While convex hull analysis is straightforward for two dimensions, it is computationally expensive to construct the envelope in higher dimensions, and further requires additional numerical integration of the variable distribution across the resulting structure.
The inscribed ellipsoid technique computes the dimensions of an ellipsoid of maximum volume which is bounded by failure sample points in the parametric space, but this technique also requires computation of the hull in which the ellipsoid is inscribed. The inscribed ellipsoid technique further requires additional optimizations to be performed after samples are available.
The FORM approach can be used to efficiently calculate the probability of failure Pf for a system having a single failure mode. Failure is defined as a limit state function of the system variables which exceeds a given value. For electrical circuits a failing value may be established using circuit simulation tools. A generalized example of FORM is shown in the graph of FIG. 1A which represents a parametric space based on two threshold voltages VTN1 and VTN2 for respective devices in a memory cell. The center of the graph represents a nominal point of the limit state function corresponding to the mean values (μ1, μ2) of the two voltage thresholds, which are assumed to have Gaussian distributions. In this example the failure mode is associated with a very high VTN1 (around +5σ1) and a very low VTN2 (around −5σ2). FORM computes a failure direction by locating the closest failing point 2 to the nominal point, and calculates a fail boundary as the line normal to this direction which also passes through the closest failing point 2. The hatched portion 4 in the upper right hand corner of FIG. 1A thus represents the single-fail region. The probability of failure according to this model is computed as the integral of the distribution function over this region. For a linear fail region boundary which is orthogonal to the failure direction, the probability of failure can be estimated from the normalized distance to the closest failing point without need for integration. In a variation on this technique known as SORM (second order reliability method), a nonlinear (parabolic) fail boundary is computed to fit the curvature of a set of closest failing points.
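The single-mode FORM estimate described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the variables are taken to be independent and Gaussian, the failing points are assumed to have already been found (for example by circuit simulation), and the function name and the sample values are hypothetical. The failure probability is taken as the standard normal tail beyond the normalized distance β to the closest failing point, i.e., Pf ≈ Φ(−β).

```python
import math


def form_failure_probability(failing_points, means, sigmas):
    """First-order (linear boundary) estimate from a set of failing points.

    Each point is normalized to units of sigma about the nominal point,
    the failing point closest to nominal is selected, and Pf is taken as
    the Gaussian tail beyond that normalized distance.
    Returns (beta, unit failure direction, Pf).
    """
    best_beta, best_z = None, None
    for point in failing_points:
        # Normalize each variable: z = (x - mean) / sigma.
        z = [(x - m) / s for x, m, s in zip(point, means, sigmas)]
        beta = math.sqrt(sum(c * c for c in z))
        if best_beta is None or beta < best_beta:
            best_beta, best_z = beta, z
    direction = [c / best_beta for c in best_z]
    # Standard normal tail probability: Phi(-beta) = 0.5 * erfc(beta / sqrt(2)).
    pf = 0.5 * math.erfc(best_beta / math.sqrt(2.0))
    return best_beta, direction, pf


if __name__ == "__main__":
    # Hypothetical failing points, already expressed in sigma units.
    points = [(3.0, 4.0), (6.0, 1.0), (4.5, 4.5)]
    print(form_failure_probability(points, means=(0.0, 0.0), sigmas=(1.0, 1.0)))
```

In this hypothetical example the closest failing point lies five standard deviations from nominal, so the estimate is the one-sided Gaussian tail at 5σ, on the order of a few parts in ten million.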
The problem with using FORM (or SORM) is that more complicated systems cannot be accurately represented by one simple fail boundary. Circuits such as memory cells can suffer multiple failure modes which would be modeled using multiple failure directions, as illustrated in FIG. 1B. For multiple directions, FORM estimates may have overlapping fail regions 6, 8, and as a result of this overlap it becomes impossible to use direct formulations for the failure probability. Instead, more expensive methods such as Monte Carlo must be employed to integrate over the pass/fail regions. This problem becomes even more difficult as the number of variables increases and also as the number of fail boundaries that represent the system increases. The graph of FIG. 1B is two-dimensional, i.e., two variables, but accurate representation of a circuit may require a significantly higher number of variables. For example, it would be preferable to model at least six different variables for an SRAM cell corresponding to threshold voltages for the six different transistors that comprise the cell.
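The effect of overlapping fail regions can be illustrated with a short Python sketch that compares the sum of the individual FORM estimates against a plain Monte Carlo integration over the union of the fail regions. The sketch assumes independent, normalized Gaussian variables and linear fail boundaries, each given by a unit failure direction and a normalized distance; the function names and the boundary values are hypothetical. The boundaries are deliberately placed close to nominal (1.5σ) so that a modest sample count is sufficient; at distances near 5σ, plain Monte Carlo would need a far larger number of samples to observe any fails.

```python
import math
import random


def gaussian_tail(beta):
    """Standard normal tail probability Phi(-beta)."""
    return 0.5 * math.erfc(beta / math.sqrt(2.0))


def fails(z, boundaries):
    """A sample fails if it lies beyond any one of the linear boundaries."""
    return any(
        sum(zi * ui for zi, ui in zip(z, direction)) > beta
        for direction, beta in boundaries
    )


def monte_carlo_pf(boundaries, n_samples, seed=0):
    """Fraction of normalized Gaussian samples falling in any fail region."""
    rng = random.Random(seed)
    dim = len(boundaries[0][0])
    n_fail = 0
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        if fails(z, boundaries):
            n_fail += 1
    return n_fail / n_samples


if __name__ == "__main__":
    # Two hypothetical failure modes with orthogonal unit directions.
    boundaries = [((1.0, 0.0), 1.5), ((0.0, 1.0), 1.5)]
    summed = sum(gaussian_tail(beta) for _, beta in boundaries)
    sampled = monte_carlo_pf(boundaries, 200_000)
    print("sum of FORM estimates:", summed)
    print("Monte Carlo over union:", sampled)
```

Because samples lying in the overlap of the two regions are counted once by the Monte Carlo integration but twice by the simple sum, the summed FORM estimates overstate the failure probability, and the discrepancy depends on the relative orientation and number of the fail boundaries.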
In light of the foregoing, it would be desirable to devise an improved method for estimating the failure probability in systems having multi-fail regions which does not require excessive computation. It would be further advantageous if the method could easily handle higher dimensions of the parametric space, i.e., many different process variables.