The yield of memory, custom digital, and other types of circuits is important because it directly affects the profitability of the chip on which the circuit in question is formed. Accordingly, it is important for designers to be able to estimate the yield of these circuits prior to their manufacture. As is known in the art, failure rate is simply the complement of yield. That is, the failure rate of a design is the proportion of sampled instances of the design that fail specifications, whereas yield is the proportion that pass, so that yield equals one minus the failure rate.
To estimate the failure rate of an electrical circuit design (ECD), a model of the statistical variation of some variables of the ECD is typically used. That model of variation can include a probability distribution of random variables. For example, each device in the ECD could have an n-dimensional Gaussian distribution describing variation in “n” process variables of that device such as oxide thickness, substrate doping concentration, etc. Then, the model of distribution (probability distribution) for the ECD is merely the union of the devices' distributions. Drawing a random point from the distribution, combined with the ECD's topology and device sizes (length, width, etc.), provides an “instance” of the ECD, the instance being a model of a single chip (die) that might be manufactured (or a block or “cell” within the overall chip design).
The performance of an instance of an ECD is typically estimated via circuit simulation. Its performance can be estimated at various environmental points, for example at different temperatures. The instance of the ECD is “feasible” if the performances at each environmental point meet specifications. The performances, also referred to as performance metrics, can include, e.g., power consumption, read current, etc.
A simple, known way to estimate failure rate for a given ECD uses Monte Carlo sampling with simulation, as shown in FIG. 1. The inputs used at the start 102 include a representation of an ECD, and a probability distribution describing the variations that can affect the ECD. An instance, that is, a combination of an ECD (nominal design) and a particular variation (due to process randomness), can be simulated by a SPICE circuit simulator (SPICE: Simulation Program with Integrated Circuit Emphasis) or the like. In step 104, a number N of instances are drawn from the ECD's probability distribution. Each instance is simulated 106 to determine whether it is “feasible” or not (i.e., whether or not it meets pre-determined specifications on the performance metrics). The run typically stops 108 when all N samples have been simulated. The results are reported to the user display 110, including the failure rate, which is calculated as the ratio of the number of infeasible instances to the total number of instances.
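The Monte Carlo flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `simulate_instance` is a hypothetical stand-in for a SPICE run, the process variables are assumed to be independent standard Gaussians, and the single specification `spec_limit` is invented for the example.

```python
import random


def simulate_instance(variation):
    """Hypothetical stand-in for a circuit simulation (not a real SPICE call).

    Returns a toy performance metric: the sum of the process variables.
    """
    return sum(variation)


def is_feasible(performance, spec_limit):
    """An instance is feasible when its performance meets the specification."""
    return performance <= spec_limit


def monte_carlo_failure_rate(n_samples, n_vars, spec_limit, seed=0):
    """Estimate failure rate as (infeasible instances) / (total instances)."""
    rng = random.Random(seed)
    n_infeasible = 0
    for _ in range(n_samples):
        # Draw one random point from the assumed variation distribution.
        variation = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]
        # Simulate the instance and check it against the specification.
        if not is_feasible(simulate_instance(variation), spec_limit):
            n_infeasible += 1
    return n_infeasible / n_samples
```

For instance, `monte_carlo_failure_rate(20000, 4, 2.0)` estimates the failure rate of the toy metric against a limit of 2.0; in a real flow the cost is dominated by the simulation call, which is why the number of samples matters so much.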
Estimating failure rate according to the Monte Carlo flow of FIG. 1 is relatively inexpensive when the probability of finding an infeasible instance (the failure rate) is within an order of magnitude or two of the probability of finding a feasible instance. For example, if the failure rate (pf) is 0.10, then the yield is 0.90 and it is relatively inexpensive to estimate. To estimate failure rate with decent accuracy, a reasonable rule of thumb is to have enough samples to get about ten failures; more samples will improve the accuracy further. In the example above, pf=0.10 leads to N=10/0.10=100 samples. On a modern CPU with modern simulation software, 100 Monte Carlo samples can typically be simulated in minutes to hours, which is reasonable. 1000 Monte Carlo samples can often be reasonable, and 10,000 samples for certain fast-simulating circuits can also be reasonable.
However, if failing instances are much rarer, far more Monte Carlo samples are needed to estimate the failure rate using the approach shown in FIG. 1. For example, if pf=1e-6 (1 in a million), then one would need about N=10/1e-6=10 million Monte Carlo samples. In this case, simulating the ECD can be too computationally intensive for modern machines to obtain results in a reasonable time frame (e.g., hours). If, for a given circuit, pf=1e-9 (1 in a billion), then one would need about N=10/1e-9=10 billion Monte Carlo samples. Simulating such a huge number of samples would clearly be unreasonable with respect to the required time frame.
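The rule of thumb of roughly ten observed failures translates directly into a sample count. The helper below is a small sketch of that arithmetic; the function name and the `target_failures` parameter are illustrative only.

```python
def samples_needed(pf, target_failures=10):
    """Monte Carlo samples needed to expect about `target_failures` failures."""
    # round() absorbs floating-point error in the division for tiny pf values.
    return round(target_failures / pf)


for pf in (0.10, 1e-6, 1e-9):
    print(f"pf={pf:g}: about {samples_needed(pf):,} samples")
```

The required count grows inversely with pf, so each additional order of magnitude of rarity multiplies the simulation budget by ten.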
Such low pf values are actually common in certain types of modern circuits. One example is memory circuits, where bitcells are repeated millions or billions of times (Mbit or Gbit memories) on a single chip. Each bitcell should therefore preferably be extremely reliable (have a tiny pf) so that the overall memory has reasonable yield. Support circuitry such as sense amps, which is also repeated often, likewise needs to be very reliable. Further, digital electronics have so many digital standard cells that each cell should preferably be extremely reliable so that the overall circuit has decent yield.
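To see why a repeated cell needs a tiny pf, consider a simplified model in which cell failures are independent and the array passes only if every cell passes. Both are assumptions made here for illustration (real memories may use redundancy or repair); under them, the array yield is (1 - pf) raised to the number of cells.

```python
import math


def array_yield(pf_cell, n_cells):
    """Yield of an array of independent cells with no redundancy (assumed model)."""
    # log1p keeps precision when pf_cell is very small.
    return math.exp(n_cells * math.log1p(-pf_cell))


for pf_cell in (1e-6, 1e-9):
    print(f"pf_cell={pf_cell:g}, 1 Mbit array yield: {array_yield(pf_cell, 10**6):.4f}")
```

Under this model, a bitcell failure rate of 1e-6 leaves a one-million-cell array with a yield of only about 37%, while a failure rate of 1e-9 brings it to about 99.9%.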
Since simulating 10 million or 10 billion Monte Carlo samples is unreasonably expensive, other approaches to estimate failure rate have been explored.
One approach is to draw a smaller number of Monte Carlo samples (10,000 to 1 million), simulate them, construct a model of the tail of the distribution, and then extrapolate the tail to find where it crosses the feasibility boundary (the pass/fail boundary for a particular performance metric). Unfortunately, this is very computationally expensive, and the extrapolation can be quite inaccurate.
Another approach is to construct an analytical model of the ECD, and to either draw a huge number of samples from that model, or derive the failure rate by analytically integrating the model. Unfortunately, this also can be very inaccurate. Further, this approach requires time-consuming tedious manual labor that must be repeated for every different circuit schematic, and possibly revised with every new manufacturing process node.
Another set of approaches is to use classification or regression models. The core idea is that models can evaluate a sample's feasibility far faster than simulation. One such approach (A. Singhee et al, “Method and apparatus for sampling and predicting rare events in complex electronic devices, circuits and systems”, U.S. patent application 20090248387 filed Mar. 28, 2008) draws Monte Carlo samples from the distribution, and uses a feasible/infeasible classifier in place of simulation when it has confidence in its prediction of feasibility. Another approach (J. Wang, S. Yaldiz, X. Li and L. Pileggi, “SRAM Parametric Failure Analysis,” Proc. ACM/IEEE Design Automation Conference, June 2009) adaptively builds a piecewise-linear model; it starts with a linear regression model and, at each iteration, chooses a higher-probability random point with known modeling error or uncertainty, simulates, and adds another “fold” to the model. A further approach (C. Gu and J. Roychowdhury, “An efficient, fully nonlinear, variability-aware non-Monte-Carlo yield estimation procedure with applications to SRAM cells and ring oscillators,” Proc. 2008 Asia and South Pacific Design Automation Conference, 2008, pp. 754-761) is similar to the previous, but uses a classification model rather than regression model. The general problem of model-based approaches is that one should be able to trust the model; if the model is inaccurate, then the results will be inaccurate. These approaches have only been demonstrated on tiny problems of just 6-12 variables; having a reliable model on 50 or 150 or more variables is far more difficult.
An additional approach uses Markov Chain Monte Carlo (MCMC). This approach is derived from the famous Metropolis-Hastings algorithm (N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, E. Teller, “Equations of State Calculations by Fast Computing Machines,” Journal of Chemical Physics 21 (6), 1953, pp. 1087-1092). In the MCMC approach for statistical sampling (Y. Kanoria, S. Mitra and A. Montanari, “Statistical Static Timing Analysis using Markov Chain Monte Carlo”, Proc. Design Automation and Test in Europe, March 2010), the sampling distribution is adaptively tilted towards the rare infeasible events, and each subsequent sample in the “chain” of samples is accepted or rejected stochastically based on a threshold. Unfortunately, a stable “well-mixed” chain of MCMC samples is difficult to achieve reliably in practice, especially for non-experts in MCMC (i.e., tool users).
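The stochastic accept/reject step can be illustrated with a generic random-walk Metropolis sampler. This is a textbook sketch, not the method of the cited work: the one-dimensional target (a standard Gaussian restricted to an assumed infeasible region x > 3), the step size, and all names are chosen only for illustration.

```python
import math
import random


def metropolis_chain(log_density, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis: returns a chain of samples from exp(log_density)."""
    rng = random.Random(seed)
    x, log_p = x0, log_density(x0)
    chain = []
    for _ in range(n_steps):
        candidate = x + rng.gauss(0.0, step_size)
        log_p_candidate = log_density(candidate)
        # Accept with probability min(1, p(candidate) / p(x)).
        accept_prob = math.exp(min(0.0, log_p_candidate - log_p))
        if rng.random() < accept_prob:
            x, log_p = candidate, log_p_candidate
        chain.append(x)  # a rejected candidate repeats the current state
    return chain


def tail_log_density(x):
    """Assumed target: standard Gaussian restricted to the region x > 3."""
    return -0.5 * x * x if x > 3.0 else -math.inf


chain = metropolis_chain(tail_log_density, x0=3.5, n_steps=5000, step_size=0.5)
```

Every sample in `chain` lies in the rare region, but how well the chain mixes depends on choices such as `step_size` and the starting point, which is the practical difficulty noted above.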
Another set of approaches uses importance sampling. A representative example is: R. Joshi et al., “System and Computer Program for Efficient Cell Failure Rate Estimation in Cell Arrays,” U.S. Patent Application Publication No. 2008/0195325, filed Apr. 16, 2008. In importance sampling, the distribution is shifted towards rare infeasible samples, just as in MCMC. But unlike MCMC, importance sampling uses every sample. When estimating failure rate, it gives a weight to each sample according to the ratio of its density on the true distribution to its density on the sampling distribution. In the most promising importance sampling approaches for circuit analysis, “centers” are computed and subsequently used in importance sampling, where the centers are the means of Gaussian distributions. In the work by R. Joshi et al., centers are computed by drawing samples from a uniform distribution in the range of [−6, +6] standard deviations for each process parameter, and keeping the first 30 infeasible samples. The approach (M. Qazi, M. Tikekar, L. Dolecek, D. Shah, and A. Chandrakasan, “Loop Flattening & Spherical Sampling: Highly Efficient Model Reduction Techniques for SRAM Yield Analysis,” Proc. Design Automation and Test in Europe, March 2010) chooses centers via a spherical sampling technique. Both of these works were demonstrated on tiny problems of just 6-12 variables. Unfortunately, they work poorly in larger numbers of dimensions (random variables), because the chosen centers are too improbable; the weights are therefore too small to affect the failure rate estimate, causing the estimate to be far too optimistic, e.g., by reporting a pf of 1e-200 when it should be around 1e-8. In real-world circuit yield-estimation problems, there can be 100 or 1000 or more random variables. As such, importance sampling cannot be considered a reasonable approach to estimating failure rate.
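The weighting idea can be shown with a minimal sketch of mean-shifted Gaussian importance sampling. It assumes independent standard Gaussian process variables and a single, already chosen center; the function names and the one-variable failure region in the usage example are illustrative and are not taken from the cited works.

```python
import math
import random


def importance_sampling_pf(is_infeasible, center, n_samples, seed=0):
    """Estimate pf by sampling from N(center, I) instead of the true N(0, I)."""
    rng = random.Random(seed)
    total_weight = 0.0
    for _ in range(n_samples):
        x = [rng.gauss(c, 1.0) for c in center]
        if is_infeasible(x):
            # Weight = true density / sampling density
            #        = exp(sum(c*c/2 - c*x_i)) for unit-variance Gaussians.
            log_w = sum(0.5 * c * c - c * xi for c, xi in zip(center, x))
            total_weight += math.exp(log_w)
    # Feasible samples contribute zero, but every sample counts in the average.
    return total_weight / n_samples


# Toy usage: failure when the single process variable exceeds 4 sigma.
pf_estimate = importance_sampling_pf(lambda x: x[0] > 4.0, [4.0], 20000)
pf_exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # Gaussian tail probability
```

In this one-variable toy case a few thousand samples recover a failure rate of about 3e-5. The weight formula also makes the high-dimensional weakness visible: each additional shifted variable multiplies another factor into the weight, so an improbable center drives the weights towards zero.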
Another disadvantage of importance sampling systems is the lack of transparency to a designer using such a tool: it is difficult for the designer to assess the nature of the altered distribution, and whether the distribution samples adequately along the feasibility boundary of highest probability.
Therefore, improvements in estimating failure rates in ECDs are desirable.