1. Field of the Invention
The present invention generally relates to the art of microelectronic integrated circuits, and more specifically to a system for placement of cells on integrated circuit chips.
2. Description of the Related Art
Microelectronic integrated circuits consist of a large number of electronic components which are fabricated by layering several different materials on a silicon base or wafer. The design of an integrated circuit transforms a circuit description into a geometric description which is known as a layout. A layout consists of a set of planar geometric shapes in the various layers of the silicon chip.
The process of converting the specifications of an electrical circuit into a layout is called the physical design. Physical design requires arranging elements, wires, and predefined cells on a fixed area, and the process can be tedious, time consuming, and prone to many errors due to tight tolerance requirements and the minuteness of the individual components.
Currently, the minimum geometric feature size of a component is on the order of 0.5 microns. Feature size may be reduced to 0.1 micron within several years. This small feature size allows fabrication of as many as 10 million transistors or 1 million gates of logic on a 25 millimeter by 25 millimeter chip. This trend of decreasing feature size and increasing transistor count is expected to continue, with even smaller feature geometries and more circuit elements on an integrated circuit. Larger chip sizes will allow far greater numbers of circuit elements.
Due to the large number of components and the exacting details required by the fabrication process, physical design is not practical without the aid of computers. As a result, most phases of physical design extensively use Computer Aided Design (CAD) tools, and many phases have already been partially or fully automated. Automation of the physical design process has increased the level of integration, reduced turn around time and enhanced chip performance.
The object of physical chip design is to determine an optimal arrangement of devices in a plane and to find an efficient interconnection or routing scheme between the devices to obtain the desired functionality. Since space on the chip surface is at a premium, algorithms must use the space very efficiently to lower costs and improve yield. The arrangement of individual cells in an integrated circuit chip is known as a cell placement.
Each microelectronic circuit device or cell includes a plurality of pins or terminals, each of which is connected to pins of other cells by a respective electrical interconnect wire network or net. A goal of the optimization process is to determine a cell placement such that all of the required interconnects can be made, and the total wirelength and interconnect congestion are minimized.
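By way of illustration only, the total wirelength of a placement can be estimated as sketched below. The half-perimeter bounding-box estimate used here is an assumption made for the example and is not prescribed by the foregoing description; the names `total_wirelength`, `placement` and `nets` are likewise hypothetical.

```python
from typing import Dict, List, Tuple

Position = Tuple[float, float]


def total_wirelength(placement: Dict[str, Position], nets: List[List[str]]) -> float:
    """Sum, over all nets, of the half perimeter of the box enclosing the net's cells.

    The half-perimeter estimate is an assumed stand-in for the routed length.
    """
    total = 0.0
    for net in nets:
        xs = [placement[cell][0] for cell in net]
        ys = [placement[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


# Hypothetical three-cell placement with three nets.
placement = {"a": (0, 0), "b": (3, 4), "c": (1, 1)}
nets = [["a", "b"], ["a", "b", "c"], ["c"]]
print(total_wirelength(placement, nets))
```

A lower value returned by such a function would correspond to a more desirable placement under this particular measure.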
Prior art methods for achieving this goal comprise generating one or more initial placements, modifying the placements using optimization methodologies such as simulated evolution (also known as the genetic algorithm), force directed placement or simulated annealing, described hereinbelow, and comparing the resulting placements using a cost criterion.
Depending on the input, placement algorithms are classified into two major groups, constructive placement and iterative improvement methods. The input to the constructive placement algorithms consists of a set of blocks along with the netlist. The algorithm provides locations for the blocks. Iterative improvement algorithms start with an initial placement. These algorithms modify the initial placement in search of a better placement. The algorithms are applied in a recursive or an iterative manner until no further improvement is possible, or the solution is considered to be satisfactory based on a predetermined criterion.
Iterative algorithms can be divided into three general classifications: simulated annealing, simulated evolution and force directed placement. The simulated annealing algorithm simulates the annealing process that is used to temper metals. Simulated evolution simulates the biological process of evolution, while the force directed placement simulates a system of bodies attached by springs.
Assuming that a number N of cells are to be optimally arranged and routed on an integrated circuit chip, the number of different ways that the cells can be arranged on the chip, or the number of permutations, is equal to N! (N factorial). In the following description, each arrangement of cells will be referred to as a placement. In a practical integrated circuit chip, the number of cells can be hundreds of thousands or millions. Thus, the number of possible placements is extremely large.
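To give a sense of scale, the number of decimal digits in N factorial can be computed without forming the full integer. The following is a minimal sketch; the function name `placement_count_digits` is hypothetical, and the use of the log-gamma function is simply one convenient way to evaluate the logarithm of a factorial.

```python
import math


def placement_count_digits(n_cells: int) -> int:
    """Number of decimal digits in n_cells!, using ln(n!) = lgamma(n + 1)."""
    return math.floor(math.lgamma(n_cells + 1) / math.log(10)) + 1


for n in (10, 1_000, 100_000):
    print(n, "cells ->", placement_count_digits(n), "digits in the placement count")
```

Even for a modest number of cells the count of placements has far too many digits for exhaustive enumeration to be feasible.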
Iterative algorithms function by generating large numbers of possible placements and comparing them in accordance with some criterion, which is generally referred to as fitness. The fitness of a placement can be measured in a number of different ways, for example, overall chip size. A small size is associated with a high fitness and vice versa. Another measure of fitness is the total wire length of the integrated circuit. A high total wire length indicates low fitness and vice versa.
The relative desirability of various placement configurations can alternatively be expressed in terms of cost, which can be considered as the inverse of fitness, with high cost corresponding to low fitness and vice versa.
a. Simulated Annealing
Basic simulated annealing per se is well known in the art and has been successfully used in many phases of VLSI physical design such as circuit partitioning. Simulated annealing is used in placement as an iterative improvement algorithm. Given a placement configuration, a change to that configuration is made by moving a component or interchanging locations of two components. Such interchange can be alternatively expressed as transposition or swapping.
In the case of a simple pairwise interchange algorithm, it is possible that a configuration achieved has a cost higher than that of the optimum, but no single interchange can cause further cost reduction. In such a situation, the algorithm is trapped at a local optimum and cannot proceed further. This happens quite often when the algorithm is used in practical applications. Simulated annealing helps to avoid becoming trapped at a local optimum by occasionally accepting moves that result in a cost increase.
In simulated annealing, all moves that result in a decrease in cost are accepted. Moves that result in an increase in cost are accepted with a probability that decreases over time as the iterations proceed. The analogy to the actual annealing process is heightened with the use of a parameter called temperature T. This parameter controls the probability of accepting moves that result in increased cost.
More of such moves are accepted at higher values of temperature than at lower values. The algorithm starts with a very high value of temperature that gradually decreases so that moves that increase cost have a progressively lower probability of being accepted. Finally, the temperature reduces to a very low value which requires that only moves that reduce costs are to be accepted. In this way, the algorithm converges to an optimal or near optimal configuration.
In each stage, the placement is shuffled randomly to get a new placement. This random shuffling could be achieved by moving a cell to a random location, a transposition of two cells, or any other move that can change the wire length or other cost criterion. After the shuffle, the change in cost is evaluated. If there is a decrease in cost, the configuration is accepted. Otherwise, the new configuration is accepted with a probability that depends on the temperature.
The temperature is then lowered using some function which, for example, could be exponential in nature. The process is stopped when the temperature is dropped to a certain level. A number of variations and improvements on the basic simulated annealing algorithm have been developed. An example is described in an article entitled "Timberwolf 3.2 A New Standard Cell Placement and Global Routing Package" by Carl Sechen, et al., IEEE 23rd Design Automation Conference paper 26.1, pages 432 to 439.
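The basic annealing loop described above can be sketched as follows. This is an illustrative sketch only and is not the Timberwolf implementation. It assumes a single row of slots, two-pin nets, an acceptance probability of exp(-delta/T) for cost-increasing moves, and a geometric cooling schedule; none of these specifics is stated in the foregoing description, and all names and parameter values are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple


def cost(order: Sequence[int], nets: Sequence[Tuple[int, int]]) -> int:
    """Total wire length when cell order[i] occupies slot i of a single row."""
    slot = {cell: i for i, cell in enumerate(order)}
    return sum(abs(slot[a] - slot[b]) for a, b in nets)


def anneal(order: Sequence[int], nets: Sequence[Tuple[int, int]],
           t_start: float = 10.0, t_end: float = 0.01, alpha: float = 0.95,
           moves_per_t: int = 50, seed: int = 0) -> Tuple[List[int], int]:
    """Return the best placement seen and its cost."""
    rng = random.Random(seed)
    current = list(order)
    current_cost = cost(current, nets)
    best, best_cost = list(current), current_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            # Shuffle step: transpose two randomly chosen cells.
            i, j = rng.sample(range(len(current)), 2)
            current[i], current[j] = current[j], current[i]
            delta = cost(current, nets) - current_cost
            # Accept every cost decrease; accept an increase with a
            # probability that shrinks as the temperature falls (assumed form).
            if delta < 0 or rng.random() < math.exp(-delta / t):
                current_cost += delta
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
            else:
                current[i], current[j] = current[j], current[i]  # reject: undo the swap
        t *= alpha  # assumed geometric (exponential) cooling
    return best, best_cost


# Hypothetical example: eight cells connected in a chain, initially scrambled.
nets = [(k, k + 1) for k in range(7)]
start = [3, 7, 0, 5, 1, 6, 2, 4]
best, best_cost = anneal(start, nets)
print(cost(start, nets), "->", best_cost, best)
```

Because the sketch records the best placement encountered, the returned cost is never higher than that of the starting placement, even though individual accepted moves may increase the cost.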
b. Simulated Evolution
Simulated evolution, which is also known as the genetic algorithm, is analogous to the natural process of mutation of species as they evolve to better adapt to their environment. The algorithm starts with an initial set of placement configurations which is called the population. The initial placements can be generated randomly. Each individual in the population represents a feasible placement, that is, a candidate solution to the optimization problem, and is actually represented by a string of symbols.
The symbols used in the solution string are called genes. A solution string made up of genes is called a chromosome. A schema is a set of genes that make up a partial solution. The simulated evolution or genetic algorithm is iterated, and each iteration is called a generation. During each iteration, the individual placements of the population are evaluated on the basis of fitness or cost. Two individual placements among the population are selected as parents, with probabilities based on their fitness. A better fitness for an individual placement increases the probability that the placement will be chosen.
The genetic operators, called crossover, mutation and inversion, which are analogous to their counterparts in the evolution process, are applied to the parents to combine genes from each parent to generate a new individual called the offspring or child. The offspring are evaluated, and a new generation is formed by including some of the parents and the offspring on the basis of their fitness in a manner such that the size of the population remains the same. As the tendency is to select high fitness individuals to generate offspring, and the weak individuals are deleted, the next generation tends to have individuals that have good fitness.
The fitness of the entire population improves with successive generations. Consequently, overall placement quality improves over iterations. At the same time, some low fitness individual cell placements are reproduced from previous generations to maintain diversity even though the probability of doing so is quite low. In this way, it is assured that the algorithm does not lock into a local optimum.
The first main operator of the genetic algorithm is crossover, which generates offspring by combining schemata of two individuals at a time. Combining schemata entails choosing a random cut point and generating the offspring by combining the left segment of one parent with the right segment of the other. However, after doing so, some cells may be duplicated while other cells are deleted. This problem will be described in detail below.
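The duplication and deletion of cells produced by a single cut point can be seen in the following minimal sketch. The chromosome encoding (a list in which position i holds the cell assigned to slot i) and all names are assumptions made for the example.

```python
from typing import List


def one_point_crossover(parent_a: List[str], parent_b: List[str], cut: int) -> List[str]:
    """Join the left segment of parent_a to the right segment of parent_b."""
    return parent_a[:cut] + parent_b[cut:]


# Hypothetical parents: two different orderings of the same five cells.
parent_a = ["c0", "c1", "c2", "c3", "c4"]
parent_b = ["c3", "c4", "c0", "c1", "c2"]

child = one_point_crossover(parent_a, parent_b, cut=2)
duplicated = sorted({c for c in child if child.count(c) > 1})
missing = sorted(set(parent_a) - set(child))

print("child:     ", child)
print("duplicated:", duplicated)
print("missing:   ", missing)
```

In this example the offspring contains cells c0 and c1 twice and omits c3 and c4, so it is not a valid placement until it is repaired.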
The amount of crossover is controlled by the crossover rate, which is defined as the ratio of the number of offspring produced by crossing in each generation to the population size. Crossover attempts to create offspring with fitness higher than either parent by combining the best genes from each.
Mutation creates incremental random changes. The most commonly used mutation is pairwise interchange or transposition. This is the process by which new genes that did not exist in the original generation, or have been lost, can be generated.
The mutation rate is defined as the ratio of the number of offspring produced by mutation in each generation to the population size. It must be carefully chosen because while it can introduce more useful genes, most mutations are harmful and reduce fitness. The primary application of mutation is to pull the algorithm out of local optima.
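A minimal sketch of mutation by pairwise interchange, with the number of mutated offspring set by the mutation rate, is given below. The encoding of a placement as a list of cell identifiers, the rounding of the offspring count, and the function names are assumptions made for the example.

```python
import random
from typing import List


def mutate(chromosome: List[int], rng: random.Random) -> List[int]:
    """Return a copy of the chromosome with two randomly chosen genes interchanged."""
    i, j = rng.sample(range(len(chromosome)), 2)
    child = list(chromosome)
    child[i], child[j] = child[j], child[i]
    return child


def mutation_offspring(population: List[List[int]], mutation_rate: float,
                       rng: random.Random) -> List[List[int]]:
    """Produce round(mutation_rate * population size) offspring by mutation."""
    count = round(mutation_rate * len(population))
    parents = rng.sample(population, count)
    return [mutate(parent, rng) for parent in parents]


# Hypothetical population of four placements of four cells.
population = [[0, 1, 2, 3], [3, 2, 1, 0], [1, 0, 3, 2], [2, 3, 0, 1]]
print(mutation_offspring(population, mutation_rate=0.5, rng=random.Random(1)))
```

Unlike the single cut point crossover, a pairwise interchange always yields a valid placement, since every cell still appears exactly once.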
Inversion is an operator that changes the representation of a placement without actually changing the placement itself so that an offspring is more likely to inherit certain schemata from one parent.
After the offspring are generated, individual placements for the next generation are chosen based on some criteria. Numerous selection criteria are available, such as total chip size and wire length as described above. In competitive selection, all the parents and offspring compete with each other, and the fittest placements are selected so that the population remains constant. In random selection, the placements for the next generation are randomly selected so that the population remains constant.
The latter criterion is often advantageous because, when only the fittest individuals are selected, the population tends to converge to individuals that share the same genes, and the search may then fail to converge to an optimum. However, if the individuals are chosen randomly there is no way to gain improvement from an older generation to a new generation. Stochastic selection combines both methods by choosing individuals with probabilities based on the fitness of each individual.
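The three selection schemes can be contrasted in the following sketch. It is illustrative only: fitness is assumed to be the reciprocal of wire length, stochastic selection is assumed to draw with replacement in proportion to fitness, and all names are hypothetical.

```python
import random
from typing import Callable, List, Tuple

Individual = Tuple[str, float]  # (label, wire length); a hypothetical encoding


def fitness(individual: Individual) -> float:
    """Assumed fitness: the reciprocal of wire length, so shorter is fitter."""
    return 1.0 / individual[1]


def competitive_selection(pool: List[Individual], size: int) -> List[Individual]:
    """Keep the fittest individuals among parents and offspring."""
    return sorted(pool, key=fitness, reverse=True)[:size]


def random_selection(pool: List[Individual], size: int, rng: random.Random) -> List[Individual]:
    """Keep a uniformly random subset, ignoring fitness."""
    return rng.sample(pool, size)


def stochastic_selection(pool: List[Individual], size: int, rng: random.Random) -> List[Individual]:
    """Draw individuals with probability proportional to fitness (with replacement)."""
    weights = [fitness(ind) for ind in pool]
    return rng.choices(pool, weights=weights, k=size)


pool = [("p1", 40.0), ("p2", 25.0), ("o1", 30.0), ("o2", 55.0)]
rng = random.Random(0)
print(competitive_selection(pool, 2))
print(random_selection(pool, 2, rng))
print(stochastic_selection(pool, 2, rng))
```

In each case the number of individuals returned equals the requested population size, so the population remains constant from one generation to the next.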
c. Force Directed Placement
Force directed placement exploits the similarity between the placement problem and the classical mechanics problem of a system of bodies attached to springs. In this method, the blocks connected to each other by nets are supposed to exert attractive forces on each other. The magnitude of this force is directly proportional to the distance between the blocks. Additional proportionality is achieved by connecting more "springs" between blocks that "talk" to each other more (volume, frequency, etc.) and fewer "springs" where less extensive communication occurs between each block.
According to Hooke's Law, the force exerted due to the stretching of the springs is proportional to the distance between the bodies connected to the spring. If the bodies are allowed to move freely, they would move in the direction of the force until the system achieved equilibrium. The same idea is used for placing the cells. The final configuration of the placement of cells is the one in which the system achieves a solution that is closest to actual equilibrium.
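The equilibrium condition can be illustrated with the following sketch, in which each movable cell is repeatedly moved to the point where the spring forces acting on it sum to zero, namely the spring-weighted average of the positions of its neighbors. The repeated-sweep relaxation scheme, the presence of fixed cells to anchor the system, and all names are assumptions made for the example; the sketch also ignores cell overlap.

```python
from typing import Dict, Set, Tuple

Position = Tuple[float, float]


def relax(positions: Dict[str, Position], fixed: Set[str],
          springs: Dict[Tuple[str, str], float], sweeps: int = 100) -> Dict[str, Position]:
    """Move each movable cell to its zero-force point, repeating for several sweeps."""
    pos = dict(positions)
    for _ in range(sweeps):
        for cell in pos:
            if cell in fixed:
                continue
            sx = sy = sw = 0.0
            for (u, v), k in springs.items():
                if cell == u:
                    other = v
                elif cell == v:
                    other = u
                else:
                    continue
                ox, oy = pos[other]
                sx += k * ox
                sy += k * oy
                sw += k
            if sw > 0:
                # Sum of k * (other - cell) is zero at the weighted average.
                pos[cell] = (sx / sw, sy / sw)
    return pos


# Hypothetical example: two movable cells strung between two fixed pads.
positions = {"pad0": (0.0, 0.0), "pad1": (6.0, 0.0), "a": (5.0, 3.0), "b": (1.0, -2.0)}
springs = {("pad0", "a"): 1.0, ("a", "b"): 1.0, ("b", "pad1"): 1.0}
print(relax(positions, fixed={"pad0", "pad1"}, springs=springs))
```

With equal spring weights the two movable cells settle at evenly spaced points on the line between the pads; a larger weight on any spring would pull the connected cells closer together.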
The problem of cell placement is compounded by external requirements specific to each individual integrated circuit chip. In conventional chip design, the positions of certain "unmovable" cells (external interconnect terminals or pads, large "megacells" etc.) are fixed a priori by the designer. Given those fixed positions, the rest of the cells are then placed on the chip. Since the unmovable cells and pads are located or placed before the placement for the rest of the cells of chip has been decided on, it is unlikely that the chosen positions will be optimal.
In this manner, a number of regions, which may have different sizes and shapes, are defined on the chip for placement of the rest of the cells.
It is desirable to assign individual microelectronic devices or cells to the regions, or "partition" the placement such that the total interconnect wirelength is minimized. However, methodologies for accomplishing this goal efficiently have not been proposed heretofore.
The general partitioning methodology is to hierarchically partition a large circuit into a group of smaller subcircuits until each subcircuit is small enough to be designed efficiently. Because the quality of the design may suffer due to the partitioning, the partitioning of a circuit requires care and precision.
One of the most common objectives of partitioning is to minimize the cutsize which is defined as a number of nets crossing a cut. Also the number of partitions often appears as a constraint with upper and lower bounds. At chip level, the number of partitions is determined, in part, by the capability of the placement algorithm.
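The cutsize of a given bipartition can be computed as in the following minimal sketch, in which a net is counted as crossing the cut when its cells do not all lie on the same side. The data layout and names are hypothetical.

```python
from typing import Dict, List


def cutsize(nets: List[List[str]], side: Dict[str, int]) -> int:
    """Count the nets whose cells are assigned to more than one partition."""
    return sum(1 for net in nets if len({side[cell] for cell in net}) > 1)


# Hypothetical four-cell circuit split into partitions 0 and 1.
nets = [["a", "b"], ["b", "c"], ["c", "d"], ["a", "d"]]
side = {"a": 0, "b": 0, "c": 1, "d": 1}
print(cutsize(nets, side))
```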
The prior art accomplishes partitioning by means of a series of "bipartitioning" problems, in which a decision is made to assign a component to one of two regions. Each component is hierarchically bipartitioned until the desired number of components is achieved.
Numerous alternate methodologies for cell placement and assignment are known in the art. These include quadratic optimization as disclosed in an article entitled "GORDIAN: VLSI Placement by Quadratic Programming and Slicing Optimization", by J. Kleinhans et al, IEEE Trans. on CAD, 1991, pp. 356-365, and simulated annealing as described in an article entitled "A Loosely Coupled Parallel Algorithm for Standard Cell Placement", by W. Sun and C. Sechen, Proceedings of IEEE/ACM IC-CAD Conference, 1994, pp. 137-144.
These prior art methods cannot simultaneously solve the partitioning problem and the problem of placing partitions on the chip, and thus the applicability of such methods to physical design automation systems for integrated circuit chip design is limited.
More specifically, prior art methods do not provide any metric for specifying distances between cells based on netlist connections. An initial placement must be performed to establish physical locations for cells and thereby distances therebetween.
Also, prior art methods fix cells in clusters at the beginning of optimization, and do not provide any means for allowing cells to move between clusters as optimization proceeds. This can create areas of high routing congestion, which cannot be readily eliminated because cell movements between clusters which could relieve the congestion are not allowed.
In summary, the problem inherent in these prior cell placement methods is that repeated iterations generally do not tend to converge to a satisfactory relatively uniform overall cell placement for large numbers of cells. The aforementioned methods can take several days to place a large number of cells, and repeating these methods with different parameters or different initial arrangements may not necessarily provide improvements to cell placement. Typical use of these methods involves running a chosen method until a particular parameter, for example wire length, meets a certain criterion or the method fails to meet this criterion for a predetermined number of runs. The results are inherently non-optimal for other placement fitness measurements, since the method has been optimized based only on a single parameter. Further, results of these placement techniques frequently cannot be wired properly, or alternatively, the design does not meet timing requirements. For example, with respect to simulated annealing, setting the temperature to different values may, under certain circumstances, improve placement, but efficient and uniform placement of the cells is not guaranteed.