1. Field of the Invention
The present invention relates to the field of electronic design automation (EDA). More specifically, the present invention relates to techniques for cell placement and other optimizations used in the design and fabrication of integrated circuit devices.
2. Related Art
An electronic design automation (EDA) system is a computer software system used for designing integrated circuit (IC) devices. The EDA system typically receives one or more high level behavioral descriptions of an IC device (e.g., in HDL languages like VHDL, Verilog, etc.), as represented by HDL 12 of FIG. 1, and translates this high level design language description into netlists of various levels of abstraction. At a higher level of abstraction, a generic netlist is typically produced based on technology independent primitives. The generic netlist can be translated into a lower level technology-specific netlist based on a technology-specific library that has gate-specific models for timing and power estimation. A netlist describes the IC design and is composed of nodes (elements) and edges, e.g., connections between nodes, and can be represented using a directed cyclic graph structure having nodes which are connected to each other with signal lines. A single node can have multiple fan-ins and multiple fan-outs. The netlist is typically stored in computer readable media within the EDA system and processed and verified using many well known techniques. One result is a physical device layout in mask form which can be used to directly implement structures in silicon to realize the physical IC device, step 28 of FIG. 1.
The rapid growth of the complexity of modern electronic circuits has forced electronic circuit designers to rely upon computer programs to assist or automate most steps of the design process. Typical circuits today contain hundreds of thousands or millions of individual pieces or "cells." Such a design is much too large for a circuit designer, or even an engineering team of designers, to manage effectively by hand.
FIG. 1 shows a typical system 10 of computer programs used to automate the design of electronic circuits. Within system 10, the designer first produces a high-level description 12 of the circuit in a hardware description language such as Verilog or VHDL. Then this high-level description 12 is converted into a netlist 16a using a computer implemented synthesis process 14 such as the "Design Compiler" by Synopsys of Mountain View, Calif. A netlist 16a is a description of the electronic circuit which specifies what cells compose the circuit and which pins of which cells are to be connected together using wires ("nets"). Importantly, the netlist 16a does not specify where on a circuit board or silicon chip the cells are placed or where the wires run which connect them together. Determining this geometric information is the function of an automatic placement process 18 and an automatic routing process 22, both of which are shown in FIG. 1 and are typically computer programs.
Next, the designer supplies the netlist 16a to the computer implemented automatic cell placement process 18 of FIG. 1. The automatic placement computer program 18 finds a location for each cell on a circuit board or silicon chip. The locations are specified, typically, in two dimensional spatial coordinates, e.g., (x, y) coordinates, on the circuit board or silicon chip. The locations are typically selected to optimize certain objectives such as wire length, wire routability, circuit speed, circuit power consumption, and/or other criteria, subject to the condition that the cells are spread evenly over the circuit board or silicon chip and that the cells do not overlap with each other. The output of the automatic cell placement process 18 includes a data structure 20 including the (x, y) position for each cell of the IC design. In some cases, the netlist 16a is modified and a new netlist 16b is generated. In other cases, the netlist 16b is the same as netlist 16a.
Next, the designer supplies the netlist 16a and the cell location data structure 20, generated by the placement program 18, to a computer implemented automatic wire routing process 22. This computer program 22 generates wire geometry within data structure 24. The wire geometry data structure 24 and cell placement data structure 20 together are used to make the final geometric database needed for fabrication of the circuit as shown by process 28.
FIG. 2 illustrates the subprocesses of the automatic placement process 18 in more detail. Typically, placement is done in 2 steps: first a coarse placement process 30, then a detailed placement process 34. The coarse placement process 30 finds approximate cell locations which optimize the desired metrics and spreads cells evenly across the silicon chip or circuit board. In the output data structure 32, some cells still overlap and no cells are in legal site locations, so the coarse placement needs to be legalized before the circuit can be fabricated. The detailed placement process 34 takes as input the data structure 32 output by the coarse placement process 30 and generates the detailed placement 20, discussed with respect to FIG. 1, in which no cells overlap and all cells are located on legal sites.
FIG. 3A shows an example coarse placement 40a including the positions of cells A-I. The detailed placer process 34 changes cell locations by a small amount in order to make the placement manufacturable. Generally, it is best to move cells as small a distance as possible to avoid disturbing the coarse placement result. Some detailed placer programs also attempt to further optimize the metrics used to drive coarse placement. An example detailed placement 40b is shown in FIG. 3B, illustrating the new locations of cells A-I. As shown, the cells A-I in FIG. 3B are more closely aligned along the horizontal rows.
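The legalization idea described above can be illustrated with a minimal Python sketch. It is not the detailed placer of any particular tool: the function name `legalize`, the uniform `row_height` and `site_width` grid, and the greedy left-to-right packing are all assumptions made for illustration, and the right-hand chip boundary is ignored.

```python
import math

def legalize(cells, row_height, site_width):
    """Greedy legalization sketch (hypothetical helper, not from the source).

    cells: dict mapping cell name -> (x, y, width) from a coarse placement.
    Returns a dict mapping cell name -> (x, y) snapped to rows and sites,
    with no overlap between cells that share a row.
    """
    # Assign every cell to the nearest row.
    rows = {}
    for name, (x, y, width) in cells.items():
        row = max(0, round(y / row_height))
        rows.setdefault(row, []).append((x, name, width))

    legal = {}
    for row, members in rows.items():
        next_free_site = 0
        # Visit cells left to right so each one moves as little as possible.
        for x, name, width in sorted(members):
            wanted_site = max(0, round(x / site_width))
            site = max(wanted_site, next_free_site)  # push right on overlap
            legal[name] = (site * site_width, row * row_height)
            next_free_site = site + math.ceil(width / site_width)
    return legal

coarse = {"A": (0.3, 0.2, 2.0), "B": (1.1, 0.4, 2.0), "C": (0.9, 1.8, 1.0)}
print(legalize(coarse, row_height=2.0, site_width=1.0))
# {'A': (0.0, 0.0), 'B': (2.0, 0.0), 'C': (1.0, 2.0)}
```

In this toy input, cells A and B overlap in the coarse placement, so B is pushed right to the first free site while A and C only snap to the grid.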
Prior art methods of finding a coarse placement are described below. The early methods of performing coarse placement used some variety of simulated annealing as described in a journal article entitled, "Optimization by Simulated Annealing," by S. Kirkpatrick, C. Gelatt, M. Vecchi, which appeared in "Science," May 13, 1983, Volume 220, Number 4598, on pages 671-680. Simulated annealing is a general method of finding good solutions to complex combinatorial optimization problems with a wide variety of objective functions. Simulated annealing works by proposing random small changes to the current placement and accepting the changes with probability as shown by equation (1) below:
$$
\text{probability of acceptance} =
\begin{cases}
1, & \text{if } \Delta\text{objective} \le 0 \\
e^{-(\Delta\text{objective}/\text{temperature})}, & \text{if } \Delta\text{objective} > 0
\end{cases}
\tag{1}
$$
In operation, changes that are not accepted are rejected, and undone. A control parameter xe2x80x9ctemperaturexe2x80x9d is used to control the number of acceptances. Temperature starts high and decreases toward 0 as the optimization proceeds. The optimization concludes when the temperature reaches 0.
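As a minimal sketch, the acceptance rule of equation (1) can be written in Python as follows. The function names are hypothetical, and the treatment of a temperature of exactly 0 (worsening changes are never accepted) is an assumption, since equation (1) is undefined at that point.

```python
import math
import random

def acceptance_probability(delta_objective, temperature):
    """Equation (1): improvements are always accepted; a worsening change
    is accepted with probability exp(-delta_objective / temperature)."""
    if delta_objective <= 0:
        return 1.0
    if temperature <= 0:
        return 0.0  # assumption: at zero temperature nothing worse is accepted
    return math.exp(-delta_objective / temperature)

def accept_move(delta_objective, temperature, rng=random):
    """Draw a random number to decide whether a proposed change is kept."""
    return rng.random() < acceptance_probability(delta_objective, temperature)

print(acceptance_probability(-3.0, 10.0))  # 1.0
print(acceptance_probability(2.0, 2.0))    # exp(-1), about 0.3679
```

At a high temperature almost every change is accepted, and as the temperature falls the rule approaches pure greedy descent.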
Simulated annealing has many advantages. Simulated annealing based placement algorithms can combine both coarse and detailed placement in a single step. It can be shown (e.g., in a conference paper entitled, "Convergence of the Annealing Algorithm," by M. Lundy and A. Mees, which appeared in "Proceedings of the Simulated Annealing Workshop," 1984) that simulated annealing converges to an optimal solution with probability 1, if run for a long enough amount of time. Simulated annealing can optimize a wide variety of objective functions.
However, simulated annealing has more recently lost favor as a method of coarse placement because of the following disadvantages. First, for circuits larger than a few thousand cells, simulated annealing does not achieve a good result unless run for a prohibitively long amount of time. Circuits today have hundreds of thousands or millions of cells. Second, for various technical reasons, simulated annealing is not able to optimize circuit timing very effectively. As circuit sizes have grown and geometries have decreased in size, circuit timing has become progressively more important in determining a good placement. Simulated annealing does continue to be a competitive method for detailed placement.
The shortcomings of simulated annealing for coarse placement have motivated the development of quadratic-partition algorithms for coarse placement as described in a journal article entitled, "GORDIAN: VLSI Placement by Quadratic Programming and Slicing Optimization," by Jurgen Kleinhans, Georg Sigl, Frank Johannes, Kurt Antreich, which appeared in "IEEE Transactions on Computer-Aided Design," Vol. 10, No. 3, March 1991 on pages 356-365 and a conference paper entitled, "PROUD: A Fast Sea-Of-Gates Placement Algorithm," by Ren-Song Tsay, Ernest Kuh, Chi-Ping Hsu, which appeared in "Proceedings of 25th ACM/IEEE Design Automation Conference," 1988, paper 22.3, on pages 318-322.
The below pseudo code shows an outline of a quadratic-partition coarse placement program.
This prior art process is based on the observation that the problem of placing cells to minimize sum-of-squared wire lengths can be solved quickly and exactly for large problems using standard techniques. Partitioning is used to guarantee that the cells are spread evenly across the silicon chip or circuit board.
Equation (2) gives a formula for the sum-of-squared wire length of a cell placement.

$$
\text{total\_wire\_len} = \sum_{(i,j)} \text{weight}_{ij} \left[ \left( \text{cell}_i x - \text{cell}_j x \right)^2 + \left( \text{cell}_i y - \text{cell}_j y \right)^2 \right] \tag{2}
$$
where cell i and cell j have connected pins
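For illustration, equation (2) can be evaluated with a few lines of Python. The data layout (a dictionary of positions and a list of weighted connections) and the function name are assumptions chosen for this sketch.

```python
def total_squared_wire_length(positions, connections):
    """Evaluate equation (2).

    positions: dict mapping cell name -> (x, y).
    connections: list of (cell_i, cell_j, weight) for cells with connected pins.
    """
    total = 0.0
    for cell_i, cell_j, weight in connections:
        xi, yi = positions[cell_i]
        xj, yj = positions[cell_j]
        total += weight * ((xi - xj) ** 2 + (yi - yj) ** 2)
    return total

positions = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (3.0, 0.0)}
connections = [("A", "B", 1.0), ("B", "C", 2.0)]
print(total_squared_wire_length(positions, connections))  # 57.0
```

Here the A-B connection contributes 1.0 × (9 + 16) = 25 and the B-C connection contributes 2.0 × (0 + 16) = 32.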
Equation (2) for total wire length is a quadratic form with a positive-semi-definite Hessian. Therefore, the total squared wire length can be minimized directly, or differentiated with respect to cell locations, and then solved as a system of linear equations using standard techniques. There is a requirement that some cells be in fixed locations (for example, the circuit's input/output ports can be fixed on the periphery of the silicon chip); otherwise, the optimal placement is to place all cells in exactly the same location and have all wires zero length.
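The following Python sketch shows how setting the derivative of the squared wire length to zero yields a linear system, here for a single coordinate axis (the x and y problems are independent because equation (2) has no cross terms). The helper names and the plain Gaussian elimination are assumptions for illustration; a production placer would use a sparse solver.

```python
def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting on a small dense system."""
    n = len(rhs)
    aug = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: some cells need fixed locations")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x

def quadratic_placement_1d(movable, fixed, connections):
    """Minimize sum of weight * (a - b)^2 over the movable coordinates.

    movable: list of movable cell names; fixed: dict name -> coordinate;
    connections: list of (cell_a, cell_b, weight).
    """
    index = {name: k for k, name in enumerate(movable)}
    n = len(movable)
    matrix = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for cell_a, cell_b, weight in connections:
        for this, other in ((cell_a, cell_b), (cell_b, cell_a)):
            if this in index:
                i = index[this]
                matrix[i][i] += weight
                if other in index:
                    matrix[i][index[other]] -= weight
                else:
                    rhs[i] += weight * fixed[other]
    return dict(zip(movable, solve_linear_system(matrix, rhs)))

nets = [("pad_left", "m1", 1.0), ("m1", "m2", 1.0), ("m2", "pad_right", 1.0)]
print(quadratic_placement_1d(["m1", "m2"], {"pad_left": 0.0, "pad_right": 3.0}, nets))
# {'m1': 1.0, 'm2': 2.0}
```

With the two pads fixed at 0 and 3, the chain of cells spaces itself evenly. If the pad connections are removed, the matrix becomes singular, which corresponds to the degenerate case in which every position shared by all cells gives zero wire length.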
However, minimizing total squared wire length does not spread cell area evenly across the chip. Partitioning accomplishes this. FIGS. 4A-4C illustrate an example of using partitioning to spread cell area evenly across a rectangle. FIG. 4A shows one solution within chip boundary 42 to the minimum sum-of-squared wire length problem for cells A-D. Cell area is heavily skewed toward the left. FIG. 4B shows the addition of a vertical partition line 43 which divides cell area into regions 42a and 42b. The cell areas in 42a and 42b are made equal by placement of the partition line 43 although the partition line 43 does not divide the rectangle 42 in half. In FIG. 4C, the partition line 43 has been assumed to divide the rectangle into equal halves and the cells A-D moved accordingly. Each half-rectangle 42a and 42b is also a rectangle and can be partitioned again. The selection of the position of the partition line (e.g., always moved one half of the pertinent distance) can be very arbitrary.
In the quadratic partitioning process (as represented by the algorithm outlined in the pseudo code above), all cells start out in a rectangle which is the entire silicon chip or circuit board. This rectangle is partitioned into 2 rectangles. Each of these 2 rectangles gets partitioned into 2 rectangles, etc., until each rectangle contains a small number of cells (e.g., 20 cells). Usually, the process alternates between vertical and horizontal partition lines for each iteration of the main loop.
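A minimal Python sketch of the recursive partitioning step is given here. It only illustrates the area-balanced split and the alternation between vertical and horizontal partition lines. Any re-solving of the wire length problem inside each region is omitted, and the function name, the tuple layout, and the rule that each half-rectangle is exactly half of its parent are assumptions of this sketch.

```python
def partition(cells, region, max_cells=20, vertical=True):
    """Recursively split cells into regions holding at most max_cells cells.

    cells: list of (name, x, y, area); region: (x_min, y_min, x_max, y_max).
    Returns a list of (region, [cell names]) leaves.
    """
    if max_cells < 1:
        raise ValueError("max_cells must be at least 1")
    if len(cells) <= max_cells:
        return [(region, [cell[0] for cell in cells])]

    # A vertical partition line orders cells by x, a horizontal one by y.
    axis = 1 if vertical else 2
    ordered = sorted(cells, key=lambda cell: cell[axis])

    # Choose the split so that each side holds about half of the cell area.
    half_area = sum(cell[3] for cell in ordered) / 2.0
    running, split = 0.0, 0
    for k, cell in enumerate(ordered):
        running += cell[3]
        split = k + 1
        if running >= half_area:
            break
    split = min(split, len(ordered) - 1)  # keep both sides non-empty

    # Assumption: the partition line is moved to the middle of the rectangle.
    x_min, y_min, x_max, y_max = region
    if vertical:
        mid = (x_min + x_max) / 2.0
        first, second = (x_min, y_min, mid, y_max), (mid, y_min, x_max, y_max)
    else:
        mid = (y_min + y_max) / 2.0
        first, second = (x_min, y_min, x_max, mid), (x_min, mid, x_max, y_max)

    return (partition(ordered[:split], first, max_cells, not vertical)
            + partition(ordered[split:], second, max_cells, not vertical))

cells = [("A", 1.0, 1.0, 1.0), ("B", 1.5, 3.0, 1.0),
         ("C", 2.0, 2.0, 1.0), ("D", 9.0, 2.0, 1.0)]
for leaf_region, names in partition(cells, (0, 0, 10, 4), max_cells=1):
    print(leaf_region, names)
```

In this toy input three of the four cells sit near the left edge, yet the first vertical split assigns two cells to each half of the chip, which is how partitioning spreads cell area.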
Quadratic partitioning placement runs relatively quickly on large circuits but its formulation is very inflexible. It can only optimize weighted sum-of-squared wire length. However, this metric is really the wrong objective. Namely, what is important in cell placement is wire routability, circuit timing, power consumption, etc. Quadratic wire length is a poor approximation for these metrics. It is possible to approximate these metrics somewhat better by changing wire weights to emphasize wires that seem to be more important to make short. For example, wires on the critical timing path can have higher weight in order to improve the timing of the critical path. However, this second approach is still a poor approximation to the true objective function that placement should optimize. Also, the selection of the partition line position is arbitrary and unfortunately can introduce non-optimal artifacts in the placement along the partition lines.
Accordingly, what is needed is a more effective coarse placement process. What is further needed is a coarse placement process that optimizes cell placement while emphasizing wire routability and circuit timing rather than emphasizing a quadratic minimization relationship. What is needed is a coarse placement process that addresses the problems of the prior art methods as discussed above. In view of the above needs, the present invention provides a novel coarse cell placement system for increasing the efficiency of an IC design process to thereby provide a faster, more cost effective and more accurate IC design process by producing chips with better wire routability, better timing and better power consumption. These and other advantages of the present invention not specifically mentioned above will become clear within discussions of the present invention presented herein.
A computer implemented process is described herein for automatic creation of integrated circuit (IC) geometry using a computer. In particular, the present invention includes a method to generate coarse or approximate placement of cells on a 2-dimensional silicon chip or circuit board. The coarse placer can also be used to automatically size cells, insert and size buffers, and aid in timing driven structuring of the placed circuit. The coarse placer can be used in conjunction with other automatic design tools such as a detailed placer and an automatic wire router.
The present invention includes a process that can be implemented as a computer program that uses general unconstrained non-linear optimization techniques to find a coarse placement of cells on a circuit board or silicon chip. A master objective function (MOF) is defined which evaluates the goodness of a particular cell placement. A non-linear optimization process finds an assignment of values to the function variables which minimizes the MOF. The MOF is chosen so that values of variables which minimize the MOF correspond to a good coarse placement.
In particular, the MOF is a weighted sum of functions which evaluate various metrics. An important metric for consideration is the density metric, which measures how well spread out the cells are in the placement. Other component functions are wire-length, which measures total linear wire-length, delay, which measures circuit timing, and power, which measures circuit power consumption. The barrier metric penalizes placements with cells outside the allowed placement region.
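The weighted-sum structure of the MOF can be sketched in Python as follows. Only two toy components are shown, and both are assumptions made for illustration: a Manhattan-distance wire-length term and a quadratic barrier penalty. The density, delay, and power terms would be added to the list in the same way, and none of the function names or weights come from the source.

```python
def linear_wire_length(placement, connections):
    """Toy wire-length metric: Manhattan distance between connected cells."""
    return sum(abs(placement[a][0] - placement[b][0])
               + abs(placement[a][1] - placement[b][1])
               for a, b in connections)

def barrier_penalty(placement, region):
    """Toy barrier metric: squared distance of each cell outside the region."""
    x_min, y_min, x_max, y_max = region
    total = 0.0
    for x, y in placement.values():
        dx = max(x_min - x, 0.0, x - x_max)
        dy = max(y_min - y, 0.0, y - y_max)
        total += dx * dx + dy * dy
    return total

def master_objective(placement, weighted_components):
    """MOF as a weighted sum; weighted_components is a list of
    (weight, function) pairs, each function mapping a placement to a number."""
    return sum(weight * component(placement)
               for weight, component in weighted_components)

placement = {"A": (1.0, 1.0), "B": (4.0, 5.0), "C": (12.0, 2.0)}
connections = [("A", "B"), ("B", "C")]
region = (0.0, 0.0, 10.0, 10.0)
components = [
    (1.0, lambda p: linear_wire_length(p, connections)),
    (10.0, lambda p: barrier_penalty(p, region)),
]
print(master_objective(placement, components))  # 58.0
```

The wire-length term contributes 7 + 11 = 18, and cell C lies 2 units outside the region, so the barrier term contributes 10.0 × 4 = 40.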
The present invention implements the MOF as a computer program subroutine in the preferred embodiment. A conjugate-gradient process utilizes both the MOF and its gradient to determine a next cell placement. In the preferred embodiment, the gradient of the MOF is also implemented as a computer program subroutine. The gradient is the vector of partial derivatives of the MOF with respect to all variables. The non-linear optimization process calls the MOF and gradient function subroutines and uses the results to minimize the MOF. A smoothing variable, alpha, is used to alter the MOF through multiple passes of the conjugate-gradient process where alpha is altered on each pass until the process terminates or convergence is reached.
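The interplay between the smoothing variable alpha and the conjugate-gradient passes can be sketched in Python as follows. Several choices are assumptions for illustration and are not taken from the text above: the objective is a toy one-cell linear wire length, the absolute value is smoothed as sqrt(d^2 + alpha^2), alpha follows the schedule 1.0, 0.1, 0.01, and the optimizer is a Polak-Ribiere nonlinear conjugate gradient with a backtracking line search.

```python
import math

def smoothed_wire_length(x, pads, alpha):
    # Assumption: |d| is replaced by the smooth function sqrt(d^2 + alpha^2).
    return sum(math.sqrt((x[0] - p) ** 2 + alpha ** 2) for p in pads)

def smoothed_gradient(x, pads, alpha):
    return [sum((x[0] - p) / math.sqrt((x[0] - p) ** 2 + alpha ** 2)
                for p in pads)]

def conjugate_gradient(f, grad, x, iterations=200, tol=1e-6):
    """Nonlinear conjugate gradient (Polak-Ribiere, restart when needed)."""
    g = grad(x)
    d = [-gi for gi in g]
    for _ in range(iterations):
        g_norm2 = sum(gi * gi for gi in g)
        if g_norm2 < tol * tol:
            break
        slope = sum(gi * di for gi, di in zip(g, d))
        if slope >= 0:  # not a descent direction: restart along -gradient
            d = [-gi for gi in g]
            slope = -g_norm2
        step, fx = 1.0, f(x)
        while step > 1e-12:  # backtracking (Armijo) line search
            trial = [xi + step * di for xi, di in zip(x, d)]
            if f(trial) <= fx + 1e-4 * step * slope:
                break
            step *= 0.5
        else:
            break  # no acceptable step found: stop
        x = trial
        g_new = grad(x)
        beta = max(0.0, sum(gn * (gn - go) for gn, go in zip(g_new, g)) / g_norm2)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x

def place_with_smoothing(pads, start, alphas=(1.0, 0.1, 0.01)):
    """Run one conjugate-gradient pass per alpha, reusing the previous result."""
    x, history = [start], []
    for alpha in alphas:
        x = conjugate_gradient(lambda v: smoothed_wire_length(v, pads, alpha),
                               lambda v: smoothed_gradient(v, pads, alpha), x)
        history.append(x[0])
    return history

# One movable cell wired to fixed pads at 0, 1 and 10 on a single axis.
print(place_with_smoothing(pads=[0.0, 1.0, 10.0], start=5.0))
```

For this toy problem the unsmoothed linear wire length is minimized at the median pad position, 1.0. The first pass, with a large alpha, settles somewhat to the right of that point, and later passes with smaller alpha move the cell toward 1.0 as the smoothed objective approaches the true one.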
In one implementation, the variables of the optimization are the (x and y) coordinates of all of the cells to represent 2-dimensional placement. The result is a placement of cells. In other embodiments, adding variables to represent other parameters of the circuit implementation combine additional optimizations with placement. One such additional variable within the present invention is cell size. Adding a variable for each cell size gives simultaneous placement and cell sizing. Adding a variable to each wire branch for buffer area gives simultaneous placement and buffer insertion. Adding a variable to each wire branch for buffer tree depth gives simultaneous placement and buffer tree balancing. Timing-driven structuring of fanout-free-trees can be modeled by adding a variable to each input of the fanout-free-tree to represent the depth of that input of the tree.
The present invention has the following advantages over the prior art methods. First, the present invention can solve large placement problems (e.g., 200,000 cells) in reasonable computer run time (e.g., 4 hours). Second, the present invention achieves better quality placements because it much more accurately models the metrics to be optimized than the prior art methods. Third, the present invention achieves better overall quality IC chip design because the present invention can simultaneously optimize placement, sizing, buffering, and timing-driven-structuring, and make appropriate tradeoffs between these different and often conflicting methods of circuit improvement. Lastly, the present invention achieves better quality because the present invention requires no partitioning step which typically introduces placement artifacts into prior art designs. The density function, described further below, advantageously ensures that cells are spread out evenly across the chip.