Designers of electrical circuits routinely face the task of optimizing the performance of analog, mixed-signal and custom digital circuits, hereinafter referred to generally as electrical circuit designs (ECDs). In optimizing such ECDs, the designers aim to set device sizes on the ECDs so as to obtain optimum performance for one or more performance metrics of the ECDs. In design-for-yield optimization, the designers aim to set the device sizes such that the maximum possible percentage of manufactured chips meets all performance specifications.
In general, a given ECD will have multiple goals in the form of a plurality of constraints and objectives. A simplifying assumption that can sometimes be made to optimize such ECDs is that a measure of the quality of a given ECD can be reduced to a single cost function, which generally depends on the design variables and performance metrics of the ECD, and which can be optimized by any suitable method. Arriving at a value of the cost function for a given design point usually requires multiple circuit simulations of the ECD.
One way for designers to choose circuit device sizes to minimize the related cost function is to use a software-based optimization tool 20 shown at FIG. 1. The optimization tool 20 includes a problem setup module 22 that includes particulars of the ECD to be optimized. The problem setup module 22 is connected to an optimizer (not shown), and is also connected to a simulation module 26. The particulars of the ECD can include a netlist of the ECD, performance metrics, design variables, process variables and environmental variables of the ECD. The problem setup module 22 also defines the steps to be followed to measure the performance metrics as a function of the ECD's several variables. The problem setup module 22 is in fact where the problem to be studied by the system 20 is set up.
The performance metrics can be a function of these various variables. The design variables can include, e.g., widths and lengths of devices of the ECD. The process variables can be related to random variations in the ECD manufacturing. The environmental variables can include, e.g., temperature, load conditions and power supply. The problem setup module 22 can also include further information about design variables, such as minimum and maximum values; additional environmental variables, such as a set of environmental points to be used as “corners” with which to simulate the ECD; and random variables, which can be in the form of a probability density function from which random sample values can be drawn.
As will be understood by the skilled worker, the procedure to be followed to measure the performance metrics can be in the form of circuit testbenches that are combined with the netlist to form an ultimate netlist. The ultimate netlist can be simulated by a simulation module 26, which is in communication with the problem setup module 22. The simulation module 26 can include, for example, one or more circuit simulators such as, for example, SPICE simulators. The simulation module 26 calculates waveforms for a plurality of candidate designs of the ECD. The waveforms are then processed to determine characteristic values of the ECD. As will be understood by the skilled worker, the plurality of candidate designs is a series of ECDs differing slightly from each other in the value of one or more of their variables.
The problem setup, as defined by the problem setup module 22, is acquired by the simulation module 26, which uses one or more simulators to simulate data for multiple candidate designs of the ECD. The simulation data is stored in a database 28 from where it can be accessed by a processor module 30, which can include, for example, a sampler or a characterizer. The processor module 30 is also in communication with the problem setup module 22. Given the problem setup, the sampler and/or characterizer can perform “sampling” and/or “characterization” respectively, of the ECD in question, by processing the simulation data, to produce characteristic data of the ECD. Based on this characteristic data, the processor module 30 can calculate one or more characteristic values of the ECD. During the course of a sampling or characterization, the database 28 can be populated with the characteristic data provided by the processor module 30 and with the one or more characteristic values of the ECD.
One of the characteristic values is that of the single cost function for a given ECD simulation. Other characteristic values can include, for example, a yield estimate for a given design point, histograms for each performance metric, 2D and 3D scatter plots with variables that include performance metrics, design variables, environmental variables, and process variables. A characteristic value can also represent, amongst others, the relative impact of design variables on yield, the relative impact of design variables on a given performance metric, the relative impact of all design variables vs. all process variables vs. all environmental variables, tradeoffs of yield vs. performances, and yield value for a sweep of a design variable.
The system 20 also includes a display module 32 and a user input module 34 that are used by the designer to set up the optimization problem, invoke an optimization run, and monitor progress and results that get reported back via the database 28. The processor module 30 selects the ECD's candidate designs that have the lowest single cost function values and displays these to the designer within finite time (e.g., overnight) and computer resource constraints (e.g., 5 CPUs available).
Optimization is a challenge due to the nature of the particular design problem. Computing, simulating or measuring the value of the single cost function of a single ECD candidate design can take minutes or more. Therefore only a limited number of candidate designs can actually be examined given the resources at hand. The single cost function is usually a blackbox, which means that it is possibly non-convex, non-differentiable and non-continuous. This precludes the use of optimization algorithms that rely on convexity, differentiability or continuity. That is, it is not possible to use algorithms that exploit such simplifying assumptions.
Similar optimization problems exist in many fields other than electrical circuit design. In fact, they exist in almost all engineering fields that have parameterizable design problems where simplifying assumptions cannot be made and that have means of estimating a design's cost functions such as with, e.g., a dynamical systems simulator. Such fields include, for example, automotive design, airfoil design, chemical process design and robotics design. As will be understood by the skilled worker, computing a cost function at a given design point, in virtually any technical field, can actually include physical processes, such as running a physical robot or automatically performing laboratory experiments according to design points and measuring the results.
As is known in the art, a locally optimal design is one that has lower cost than all its immediate neighbors in design space, whereas a globally optimal design is one for which no other design in the whole design space has a lower cost function. As such, a blackbox optimization problem can be further classified into global or local optimization, depending on whether a globally optimal solution is desired (or at least targeted) or, a locally optimal solution is sufficient. As is also known, a convex mapping is one in which there is only one locally optimal design in the whole design space, and therefore it is also the globally optimal design. A nonconvex mapping means that there is more than one locally optimal design in the design space. Over the years, multiple global blackbox search algorithms have been developed, such as, for example, simulated annealing, evolutionary algorithms, tabu search, branch & bound, iterated local search and particle swarm optimization. Local search algorithms are also numerous and include, for example, Newton-method derivatives, gradient descent and derivative-free pattern search methods.
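The distinction between locally and globally optimal designs can be illustrated with a minimal sketch, assuming a one-variable design space sampled on a uniform grid so that each design has at most two immediate neighbors. The function names and the cost values are hypothetical and serve only as an illustration.

```python
def local_minima(costs):
    """Indices whose cost is strictly lower than that of every immediate neighbor."""
    found = []
    for i, c in enumerate(costs):
        left_ok = i == 0 or c < costs[i - 1]
        right_ok = i == len(costs) - 1 or c < costs[i + 1]
        if left_ok and right_ok:
            found.append(i)
    return found


def global_minimum(costs):
    """Index of the lowest cost over the whole (sampled) design space."""
    return min(range(len(costs)), key=lambda i: costs[i])


# Hypothetical costs along a one-variable design space.
costs = [5.0, 3.0, 4.0, 6.0, 2.0, 1.0, 7.0]
locally_optimal = local_minima(costs)     # two local optima: a nonconvex mapping
globally_optimal = global_minimum(costs)  # exactly one of them is also global
```

In this sketch the mapping is nonconvex because more than one locally optimal design exists; a local search started near the first one would not reach the globally optimal design.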
The challenge in designing optimization algorithms for such problems is to make the most use out of every cost function evaluation that has been made as the optimization progresses through its iterations. One type of optimization algorithm that does this is the model-building optimization (MBO) algorithm. An MBO algorithm typically builds models based on candidate designs, also referred to as design points, and on their respective cost function values as they become available at various iterations of the optimization algorithm. That is, MBO algorithms use the set of {design point, cost} tuples, together with regression-style modeling approaches, to help choose the next candidate point(s).
A very notable MBO algorithm with a single overall cost function is the Efficient Global Optimization (EGO) algorithm: D. Jones, M. Schonlau, and W. Welch, “Efficient Global Optimization”, J. Global Optimization, vol. 13, pp. 455-492, 1998. More recently, variants with multiple cost functions have emerged too, e.g. J. Knowles, “ParEGO: a hybrid algorithm with on-line landscape approximation for expensive multiobjective optimization problems”, IEEE Transactions on Evolutionary Computation, No. 1, February 2006, pp. 50-66.
An example of pseudocode for a single cost function MBO algorithm can be written as:
(1) Generate initial set of sample vectors X={x1, x2, . . . }
(2) Measure scalar cost yi for each vector xi, e.g. by simulation, to get y={y1, y2, . . . }
(3) Build a surrogate model using the X=>y training data, which will be used as the surrogate cost function for candidate x's in the subsequent step
(4) Via an inner optimization, choose a new sample point xnew by optimizing across the space of x according to an infill criterion
(5) Measure ynew=cost of xnew
(6) Add xnew to X, and ynew to y
(7) If a termination criterion is met, stop; else go to (3)
The pseudocode of the above algorithm is represented graphically at FIG. 2 where reference numerals 40, 42, 44, 46, 48, 50 and 52 identify steps (1) through (7), respectively, of the above pseudocode.
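Steps (1) through (7) can be sketched in Python as follows. This is a minimal sketch for a minimization problem, not the EGO algorithm itself: the function `mbo_loop`, the nearest-neighbor surrogate, the grid-based inner search, the quadratic `example_cost` and all parameter values are assumptions introduced here purely for illustration.

```python
def mbo_loop(true_cost, initial_points, build_surrogate, choose_next, max_evaluations):
    """Skeleton of steps (1)-(7) for minimizing a scalar cost."""
    xs = list(initial_points)                  # (1) initial sample vectors
    ys = [true_cost(x) for x in xs]            # (2) measure their costs
    while len(xs) < max_evaluations:           # (7) stop on an evaluation budget
        model = build_surrogate(xs, ys)        # (3) fit surrogate on X => y
        x_new = choose_next(model, xs)         # (4) inner optimization
        y_new = true_cost(x_new)               # (5) measure the new point
        xs.append(x_new)                       # (6) grow the training data
        ys.append(y_new)
    best = min(range(len(ys)), key=lambda i: ys[i])
    return xs[best], ys[best], xs, ys


# Hypothetical plug-ins for a one-variable problem (assumptions, not from the source).
def nearest_neighbor_surrogate(xs, ys):
    points = list(zip(xs, ys))                 # snapshot of the training data

    def predict(x):
        nx, ny = min(points, key=lambda p: abs(x - p[0]))
        return ny, abs(x - nx)                 # (estimated cost, uncertainty)

    return predict


def grid_infill_search(model, xs, lower=0.0, upper=4.0, steps=400):
    def infill(x):
        cost, uncertainty = model(x)
        return cost - uncertainty              # lower is better when minimizing

    grid = [lower + (upper - lower) * k / steps for k in range(steps + 1)]
    return min(grid, key=infill)


def example_cost(x):
    return (x - 2.5) ** 2                      # stand-in for an expensive simulation


best_x, best_y, xs, ys = mbo_loop(
    example_cost, [0.0, 2.0, 4.0],
    nearest_neighbor_surrogate, grid_infill_search,
    max_evaluations=15,
)
```

The surrogate and inner optimizer are passed in as callables because, as discussed further on, these are exactly the choices that determine how well such a loop performs.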
Step (4), shown at reference numeral 46, is the core of the algorithm and must be such that repeated iterations will make the search converge towards a globally optimal solution and, ideally, make continual improvements in the cost function along the way. This can be challenging. If step (4)'s inner optimizer were to blindly minimize the surrogate cost function, it would zero in on the model's perceived-good regions, and ignore other regions that are “blind” to the current model but could be potentially far better. Accordingly, there must be some balance of exploration (learning more about unknown regions) and exploitation (taking further advantage of known good regions). Uncertainty is a formal way for the model to discern blind spots vs. regions that are well-understood by the model. An infill criterion is a function that balances exploration vs. exploitation, i.e., that balances the surrogate's uncertainty with the surrogate's estimate of cost.
FIG. 3 illustrates the behavior of the algorithm shown at FIG. 2 for a one-variable design, the variable being X. FIG. 3 shows the candidate design cost for ten candidate designs as a function of X. These candidate designs 61, also referred to as training data, are shown as diamonds. In this example, the algorithm of FIG. 2 aims to maximize the cost as a function of X. FIG. 3 illustrates the state of an optimization after proceeding through the above steps (1)-(4) the first time. At step (1), the algorithm generated 10 sample vectors in x-space. At step (2), it measured the true scalar cost for each vector. The {x value, cost function} tuples are illustrated by the diamonds. These form the training data for the surrogate regression model. At step (3), a surrogate model is built, which is represented by the line 60 that goes through all the training points, having just some curvature near the middle X region. At step (4), a new value of x must be chosen that maximizes the infill criterion, which combines the model's estimate of the cost function with its estimate of uncertainty. The plot 62, which looks like a mix of large and small pyramids with edges at the training data 61, illustrates the infill criterion, which, in this case, is a weighted sum of the cost function and uncertainty. Uncertainty for this example is taken as being the scaled distance to the nearest training point. Therefore, at X values which have a training point, there is an uncertainty of zero and the infill criterion value is equal to the objective function value. As the X values move away from the training samples, the value of uncertainty grows progressively higher. Step (4)'s inner optimization maximizes the infill criterion, and finds the corresponding value “X_guess” of about 1.5.
FIG. 4 shows the next phase of the search, covering steps (5)-(7) and the next round of steps (3)-(4). At step (5), the true scalar cost for X_guess is measured, to get the corresponding cost y_guess. At step (6), the new X_guess and y_guess are added to the existing X=>y training data; the single new diamond point 63 is shown at FIG. 4. At step (7), assuming that no stopping criterion has been met, the algorithm loops back to step (3), which builds a new surrogate model. It is to be noted that the surrogate model of cost 64 at FIG. 4 is substantially different from the model 60 at FIG. 3, particularly around x=1.5. Initially, the surrogate model 60 went gradually from about y=0.0 to y=0.4 as x went from 0 to 3.5; now, at about x=1.5, the model 64 goes to y=0.6 during its transition from y=0.0 to y=0.4. The model 64, as shown at FIG. 4, reflects the true underlying data more accurately at x=1.5, whereas the model 60, as shown at FIG. 3, effectively had a “blind spot” at x=1.5. The MBO algorithm was able to simulate at about x=1.5 to uncover the truth at this blind spot. Had it ignored uncertainty and only gone for the estimate of the cost, the algorithm would have made an X_guess somewhere in x>2, where the estimated cost is maximized, and would never have found the improved designs in the region about x=1.5. To summarize this example of a prior art optimization approach, the new infill function has zero uncertainty at x=1.5, and the best point for the new X_guess is now in a different region of x, near x=2.5.
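The infill criterion of this one-variable example can be sketched as follows, assuming a piecewise-linear surrogate through the training points and an uncertainty equal to the scaled distance to the nearest training point. The training values below are hypothetical and are not read from FIG. 3; they are merely chosen to leave a gap in the training data around x=1.5.

```python
def interpolate(train, x):
    """Piecewise-linear surrogate passing through all (x, y) training points."""
    pts = sorted(train)
    if x <= pts[0][0]:
        return pts[0][1]
    if x >= pts[-1][0]:
        return pts[-1][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def uncertainty(train, x, scale=1.0):
    """Scaled distance to the nearest training point; zero at a training point."""
    return scale * min(abs(x - tx) for tx, _ in train)


def infill(train, x, w_cost=1.0, w_uncertainty=1.0):
    """Weighted sum of estimated cost and uncertainty (to be maximized here)."""
    return w_cost * interpolate(train, x) + w_uncertainty * uncertainty(train, x)


# Hypothetical training data with a gap between x=1.0 and x=2.0.
train = [(0.0, 0.0), (0.5, 0.02), (1.0, 0.05), (2.0, 0.3),
         (2.5, 0.35), (3.0, 0.38), (3.5, 0.4)]

# Inner optimization by exhaustive search over a fine grid of x values.
grid = [k / 100 for k in range(0, 351)]
x_guess = max(grid, key=lambda x: infill(train, x))
```

With these assumed values the inner search lands in the middle of the gap, at x=1.5, even though the surrogate's own cost estimate is highest at the right-hand edge. At every training point the infill value equals the training cost, which produces the pyramid-like shape described above.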
Generally, an MBO algorithm requires choices of: (a) a surrogate model which can output uncertainty, (b) an “infill criterion” which robustly balances exploitation vs. exploration, and (c) an inner optimization algorithm. The difficulty in making these choices is examined below.
The first choice that must be made at step (4) above is the choice of a regressor. As noted, the regressor, also referred to as a regression model, must be able to output cost and uncertainty in cost. To meet these constraints, the EGO algorithm referred to above uses a Gaussian process model, which is also known as a kriging model or a DACE (Design and Analysis of Computer Experiments) model. There are other options for choice of regression models, but they must all have some means of providing the uncertainty of their estimate. Alternatively, one can use a means independent of the model, such as making uncertainty proportional to the distance to the closest point (which has its own disadvantages, such as blindness to the curvature of a mapping for a given region). The regressor must also possess other properties to make it effective. The regressor should be able to handle a small (e.g., 10) or large (e.g., 10,000) number of training samples; handle any given number of input dimensions, i.e., the number of design variables, which can be any number (e.g., 1 or 10 or 100 or 1000 or even more); capture nonlinearities such as discontinuities and nonconvex mappings; be reliable enough to be within an automated loop; be built fast enough to be within the loop; and be simulated fast enough to be within the loop. Meeting all these criteria plus supplying an uncertainty estimate can be a big challenge for regressors. Most notably, the Gaussian process model in the EGO approach is known to have very poor scaling properties, doing poorly for more than 10 or 15 input dimensions and for more than about 100 training samples.
The second choice that must be made at step (4) above is that of the infill criterion, which is a function that combines the measure of uncertainty (exploration) with the surrogate cost function (exploitation) into a single inner optimization cost function. The EGO algorithm maximizes “expected improvement in cost function.” Other options have been proposed in the literature. One such notable option is to minimize the “lower confidence bound” (LCB), which is essentially a weighted sum of cost and negative uncertainty, that is: minimize [w_cost*surrogate_cost(x) − w_uncertainty*uncertainty(x)], where w_cost and w_uncertainty are the weights attributed to the surrogate cost and the uncertainty, respectively.
The third choice that must be made at step (4) above is the choice of inner optimization algorithm, i.e., the algorithm which traverses the space of possible X's to maximize/minimize the infill criterion. The algorithm will typically aim to be as global as possible, and will have access to a large number of “surrogate cost function evaluations”. However, since such evaluations each take non-negligible computer time, the algorithm choice makes a difference in that it only has a limited number of evaluations to get the best-possible solution. The EGO algorithm itself uses branch-and-bound for an exact guess, but that can get very expensive with a larger number of dimensions. Other algorithm choices can include, for example, iterated local search, evolutionary algorithms and tabu search.
Compared to many other algorithms, MBO algorithms, such as the EGO algorithm, have been demonstrated to be highly efficient in the optimization of certain classes of problems. The class of problems in which they excel has lower dimensionality (e.g., 1-10 variables in X space) and is smoother, such that the Gaussian process model fits it better.
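The effect of the weights in a weighted-sum criterion of this kind can be seen in a small sketch. The two candidate designs and their surrogate cost and uncertainty values are hypothetical; the function `lcb` simply evaluates the bracketed expression given above.

```python
def lcb(surrogate_cost, uncertainty, w_cost=1.0, w_uncertainty=1.0):
    """Weighted sum of cost and negative uncertainty; lower is better."""
    return w_cost * surrogate_cost - w_uncertainty * uncertainty


# Hypothetical candidates: (label, surrogate cost, uncertainty).
candidates = [
    ("well_explored", 1.0, 0.1),   # low estimated cost, model is confident
    ("unexplored", 3.0, 2.5),      # higher estimated cost, model is unsure
]


def pick(w_uncertainty):
    """Label of the candidate that minimizes the criterion for a given weight."""
    best = min(candidates, key=lambda c: lcb(c[1], c[2], w_uncertainty=w_uncertainty))
    return best[0]


pure_exploitation = pick(0.0)   # uncertainty ignored
with_exploration = pick(1.0)    # uncertainty weighted equally with cost
```

With the uncertainty weight at zero the well-explored candidate wins; with equal weights the unexplored candidate wins. The selected design therefore depends directly on how the two weights are set.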
The termination criteria of step (7) above for the algorithm are flexible. They can include such criteria as: stop if number of designs explored is greater than a pre-determined threshold; stop if overall runtime is greater than a pre-determined maximum runtime; stop if best improvement over last few designs is below a pre-determined improvement threshold; stop if the maximum uncertainty in the whole design space is less than a pre-determined uncertainty threshold; and so on.
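The listed criteria can be combined into a single check, sketched below for a minimization problem. The function name, its argument names and the default thresholds are assumptions chosen for illustration; in practice each threshold would be set by the designer.

```python
def should_stop(costs, elapsed_s, max_uncertainty, *,
                max_designs=200, max_runtime_s=3600.0,
                window=5, min_improvement=1e-3,
                uncertainty_threshold=1e-2):
    """Return the name of the first criterion that is met, or None to continue.

    costs: true costs in the order the designs were evaluated (lower is better).
    elapsed_s: overall runtime so far, in seconds.
    max_uncertainty: largest surrogate uncertainty found over the design space.
    """
    if len(costs) > max_designs:
        return "design budget exceeded"
    if elapsed_s > max_runtime_s:
        return "runtime exceeded"
    if len(costs) > window:
        best_before = min(costs[:-window])   # best cost before the last few designs
        best_now = min(costs)
        if best_before - best_now < min_improvement:
            return "improvement stalled"
    if max_uncertainty < uncertainty_threshold:
        return "design space well understood"
    return None
```

Returning the name of the criterion, rather than a bare boolean, lets the reason for stopping be reported back to the designer.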
MBO algorithms have another limit in that the number of surrogate cost evaluations at step (4) is constrained. While the computational cost of evaluating surrogate cost is much cheaper than true cost, it is certainly not free. This means that computational cost of step (4) can be quite large, because there are potentially many evaluations of surrogate cost. Furthermore, the model itself takes time to build at step (3). A reasonable rule of thumb is to keep the time for steps (3) and (4) roughly less than or equal to the time for a true cost evaluation. In other words, the order of magnitude of time to build model plus the inner optimization time must be less than or equal to time to compute/measure the true cost. Since it's in orders of magnitude, it also means that inner optimization time must be less than or equal to time to compute/measure true cost. By reasonably assuming that the dominant component of inner optimization time is the evaluation of surrogate cost and that the time for evaluation of surrogate uncertainty comes for free when we evaluate surrogate cost, the inner optimization time is given by (number of inner designs) * (time to evaluate surrogate cost). Therefore, (number of inner designs allowed)=(time to compute true cost)/(time to evaluate surrogate cost). That ratio is how much faster the surrogate cost function is compared to the true cost function. It's typically on the order of 100 to 100,000, which means that the inner optimization algorithm can evaluate 100 to 100,000 design candidates.
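The budget relation above amounts to a single division, shown here with hypothetical timings expressed in integer milliseconds so that the arithmetic is exact. The one-minute true cost evaluation and the 6 ms surrogate evaluation are illustrative assumptions only.

```python
def inner_design_budget(true_cost_ms, surrogate_cost_ms):
    """(number of inner designs allowed) = (true cost time) / (surrogate cost time)."""
    return true_cost_ms // surrogate_cost_ms


# Hypothetical timings: one circuit simulation takes 60 s, one surrogate call 6 ms.
budget = inner_design_budget(true_cost_ms=60_000, surrogate_cost_ms=6)
```

Under these assumed timings the inner optimization may evaluate 10,000 candidate designs per iteration, which sits inside the 100 to 100,000 range quoted above.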
The convergence rate and results returned by MBO algorithms are highly sensitive to the choice of infill criterion, both directly and indirectly. Directly, because some criteria correlate better than others with progress towards a global optimum. Indirectly, because they cause the structure of the inner cost function to be vastly different, affecting the searchability of the function, which is critical when there is a limited budget of 100 to 100,000 design candidates. For example, the “expected improvement” criterion used by the EGO algorithm turns out to have vast plateaus in the space, punctuated by tall spikes. Much search effort can be expended wandering through the plateaus until a tall spike is found, and as soon as any spike is found, the inner optimizer may end up quickly diverting all its search effort to that spike, ignoring other, possibly far taller, spikes. This effect gets even worse with higher dimensionality. The LCB infill criterion is a weighted sum of cost and uncertainty, which means that both cost and uncertainty must be scaled to be in approximately the same range, which can sometimes be a challenge. Even if that is solved, a larger challenge is how to choose the weight for cost vs. uncertainty, because that involves making an exact choice for how much exploration is desired vs. how much exploitation.
The current EGO algorithm and other MBO variants have not been demonstrated on more than about 15 design variables, because they cannot effectively choose the next design point(s) without requiring overwhelming computational effort compared to the effort for estimating true cost. This is a significant disadvantage for the applicability of such algorithms to larger problems, which may have 25, 50, 100 or even 1000 design variables; such problems are common in electrical circuit design and in other fields. Specific issues that cause this disadvantage include the inner optimization's need to balance exploration with exploitation with respect to an infill criterion; the inner optimization algorithm's need to be efficient enough to explore and get reasonable results in a limited number of samples; and the regression's need to meet scalability and speed goals yet still provide an estimate of uncertainty.
Therefore, it is desirable to provide a method and system for multi-parameter design optimization that can effectively handle a large number of design variables.