1. Field of the Invention
The present invention generally relates to the art of microelectronic integrated circuit layout, and more specifically to the art of placement of cells on integrated circuit chips.
2. Description of Related Art
Microelectronic integrated circuits (IC) consist of a large number of electronic components which are fabricated by layering several different materials on a silicon base or wafer. The design of an integrated circuit transforms a circuit description into a geometric description which is known as a layout. A layout consists of a set of planar geometric shapes in the various layers of the silicon chip.
The process of converting the specifications of an electrical circuit into a layout is called physical design. Physical design requires arranging elements, wires, and predefined cells on a fixed area. The process can be tedious, time consuming, and prone to many errors due to tight tolerance requirements and the minuteness of the individual components, or cells.
Currently, the minimum geometric feature size of a component is on the order of 0.5 microns. Feature size may be reduced to 0.1 micron within the next several years. The current small feature size allows fabrication of as many as 10 million transistors or approximately 1 million gates of logic on a 25 millimeter by 25 millimeter chip. This feature-size-decrease/transistor-increase trend is expected to continue, with even smaller feature geometries and more circuit elements on an integrated circuit. Larger chip sizes will allow far greater numbers of circuit elements.
Due to the large number of components and the exacting details required by the fabrication process, physical design is not practical without the aid of computers. As a result, most phases of physical design make extensive use of Computer Aided Design (CAD) tools. Automation of the physical design process has increased the level of integration, reduced turnaround time and enhanced chip performance.
The object of physical chip design is to determine an optimal arrangement of devices in a plane and to find an efficient interconnection or routing scheme between the devices that results in the desired functionality. Since space on the chip surface is at a premium, algorithms must use the space very efficiently to lower costs and improve yield. The arrangement of individual cells in an integrated circuit chip is known as a cell placement.
Each microelectronic circuit device or cell includes a plurality of pins or terminals, each of which is connected to pins of other cells by a respective electrical interconnection wire network, or net. A purpose of the optimization process used in the physical design stage is to determine a cell placement such that all of the required interconnections can be made, but total wirelength and interconnection congestion are minimized.
Typical methods for achieving this goal include generating one or more initial placements and modifying the placement or placements using optimization methodologies such as simulated annealing, genetic algorithms (i.e. simulated evolution), and force directed placement. Each of these techniques involves iterative applications of the respective algorithm to arrive at an estimate of the optimal arrangement of the cells.
Depending on the input, placement algorithms are classified into two major groups, constructive placement algorithms and iterative improvement algorithms. The input to the constructive placement algorithms consists of a set of blocks along with the netlist. The algorithm provides locations for the blocks. Iterative improvement algorithms start with an initial placement. These algorithms modify the initial placement in search of a better placement. The algorithms are applied in a recursive or an iterative manner until no further improvement is possible, or the solution is considered to be satisfactory based on certain predetermined criteria.
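The iterative improvement loop described above can be illustrated with a minimal sketch. It is not one of the algorithms named in this description: it uses a simple greedy pairwise swap as a stand-in, and the function names `wire_length` and `improve`, the bounding-box length estimate, and the four-cell example are all assumptions made for illustration.

```python
from itertools import combinations


def wire_length(netlist, placement):
    """Sum, over all nets, of the half-perimeter of the box enclosing the net's cells."""
    total = 0
    for cells in netlist:
        xs = [placement[c][0] for c in cells]
        ys = [placement[c][1] for c in cells]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def improve(netlist, placement):
    """Greedy stand-in for an iterative improvement algorithm.

    Starts from an initial placement and keeps any pairwise swap that
    strictly shortens the wiring, until a full pass finds no improvement.
    """
    placement = dict(placement)  # leave the caller's initial placement untouched
    improved = True
    while improved:
        improved = False
        best = wire_length(netlist, placement)
        for a, b in combinations(sorted(placement), 2):
            placement[a], placement[b] = placement[b], placement[a]
            trial = wire_length(netlist, placement)
            if trial < best:
                best = trial
                improved = True
            else:
                # The swap did not help, so undo it.
                placement[a], placement[b] = placement[b], placement[a]
    return placement


# Hypothetical example: four cells in a row, two nets of two cells each.
netlist = [("a", "b"), ("c", "d")]
initial = {"a": (0, 0), "c": (1, 0), "b": (2, 0), "d": (3, 0)}
final = improve(netlist, initial)
print(wire_length(netlist, initial), wire_length(netlist, final))
```

Because every accepted swap strictly reduces the wire length and the number of arrangements is finite, the loop terminates. The practical algorithms discussed in this description differ mainly in how they choose and accept moves.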
Iterative algorithms function by generating large numbers of possible placements and comparing them in accordance with some criterion, which is generally referred to as fitness. The fitness of a placement can be measured in a number of different ways, for example, by overall chip size. A small size is associated with a high fitness and a large size is associated with a low fitness. Another measure of fitness is the total wire length of the integrated circuit. A high total wire length indicates low fitness, whereas a low total wire length indicates high fitness. One cell placement optimization system is described in U.S. patent application Ser. No. 08/672,725. Applicants hereby incorporate the specification, including the drawings, of said application herein as though set forth in full.
The relative desirability of various placement configurations can alternatively be expressed in terms of cost. Cost can be considered as the inverse of fitness, with high cost corresponding to low fitness and, similarly, low cost corresponding to high fitness.
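As a hedged illustration of the relationship between cost and fitness, the sketch below takes total wire length as the cost. The half-perimeter bounding-box estimate of net length, the particular mapping `1 / (1 + cost)` from cost to fitness, and all names and coordinates are assumptions chosen for the example, not definitions taken from this description.

```python
from typing import Dict, List, Tuple

Placement = Dict[str, Tuple[float, float]]  # cell name -> (x, y) location
Netlist = Dict[str, List[str]]              # net name -> cells connected by the net


def net_length(cells: List[str], placement: Placement) -> float:
    """Half-perimeter of the bounding box of a net's cells (an assumed estimate)."""
    xs = [placement[c][0] for c in cells]
    ys = [placement[c][1] for c in cells]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def cost(netlist: Netlist, placement: Placement) -> float:
    """Total estimated wire length: a higher value means a worse placement."""
    return sum(net_length(cells, placement) for cells in netlist.values())


def fitness(netlist: Netlist, placement: Placement) -> float:
    """One possible inverse of cost; the +1 only avoids division by zero."""
    return 1.0 / (1.0 + cost(netlist, placement))


# Hypothetical netlist and two candidate placements of the same three cells.
netlist = {"n1": ["a", "b"], "n2": ["b", "c"]}
spread = {"a": (0, 0), "b": (4, 3), "c": (8, 0)}
compact = {"a": (0, 0), "b": (1, 0), "c": (1, 1)}

print(cost(netlist, spread), cost(netlist, compact))
print(fitness(netlist, spread) < fitness(netlist, compact))
```

The compact placement has the lower cost and therefore the higher fitness, so either measure ranks the two placements the same way.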
Iterative algorithms can be divided into three general classifications: simulated annealing, simulated evolution and force directed placement. The simulated annealing algorithm simulates the annealing process that is used to temper metals. Simulated evolution simulates the biological process of evolution, while the force directed placement simulates a system of bodies attached by springs.
Assuming that a number N of cells are to be optimally arranged and routed on an integrated circuit chip, the number of different ways that the cells can be arranged on the chip, or the number of permutations, is equal to N! (N factorial). In the following description, each arrangement of cells will be referred to as a placement. In a practical integrated circuit chip, the number of cells can be hundreds of thousands or millions. Thus, the number of possible placements is extremely large.
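To give a sense of how quickly N! grows, the following sketch reports the number of decimal digits in N! without constructing the full integer. The helper name `placement_count_digits` is hypothetical; the calculation relies on the identity ln(N!) = lgamma(N + 1).

```python
import math


def placement_count_digits(n_cells: int) -> int:
    """Number of decimal digits in n_cells! (the count of possible placements)."""
    # lgamma(n + 1) equals ln(n!), so dividing by ln(10) gives log10(n!).
    log10_factorial = math.lgamma(n_cells + 1) / math.log(10)
    return math.floor(log10_factorial) + 1


print(math.factorial(10))                 # exact count for ten cells
print(placement_count_digits(10))         # digits in 10!
print(placement_count_digits(100))        # digits in 100!
print(placement_count_digits(1_000_000))  # digits in the count for a million cells
```

For a million cells the count of placements is a number with more than five million digits, which is why exhaustive comparison of placements is out of the question.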
Because of the large number of possible placements, even computerized implementations of the placement algorithms discussed above can take many days. In addition, the placement algorithm may need to be repeated with different parameters or different initial arrangements to improve the results.
To reduce the time required to place the cells optimally, multiple processors have been used. In such implementations, the processors operate simultaneously in different regions of the chip to place the cells on the integrated circuit chip. However, such prior efforts to reduce the cell placement time by parallel processing of placement methods have been impeded by crossover net conflicts, delays arising from inter-processor communication requirements, and uneven distribution of work among the multiple processors.
Referring to FIG. 1, a prior art technique of parallelizing cell placement algorithms is illustrated by the flowchart 10. The prior art methods have parallelized cell placement by first preplacing the cells on the chip 12 and dividing the chip into regions 14, each of which is assigned to a processor 16. The same cell placement algorithm is simultaneously executed by the multiple processors, each processor placing the cells located in its assigned regions of the chip 18. Each of the processors controls the cells located in its assigned regions. Then, each of the multiple processors analyzes the placement of each of the cells located within its assigned regions to improve the overall placement of the cells 18. Several problems arise from the prior art technique.
The problems associated with the prior art parallelization technique can be illustrated using FIG. 2. FIG. 2 illustrates a grossly simplified IC with four nets 7, 9, 11, and 13 and four regions 8a, 8b, 8c, and 8d, each of which has been assigned to a processor.
The first problem is the crossover net problem. A crossover net is a net whose cells lie in regions assigned to different processors. If the regions are divided such that crossover nets are created, then the effectiveness of the parallel processing technique is reduced. This is because none of the processors which share a crossover net can accurately calculate the position of the net (which is always the basis for the decision about the cell move), since another processor may move its cell during the calculation. Naturally, as the number of processors increases, the number of crossover nets increases, aggravating the problem. A large number of crossover nets can be fatal for the convergence of cell placement algorithms. For example, in FIG. 2, nets 9, 11 and 13 are the crossover nets. Some cells of net 9 are processed by the processor assigned to region 8a while others are processed by the processor assigned to region 8c. Likewise, the cells of nets 11 and 13 are placed by the processors assigned to regions 8a and 8b, and 8b and 8d, respectively.
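The crossover nets of the FIG. 2 example can be identified mechanically, as in the sketch below. The cell names c1 through c8 are hypothetical, as is the choice to place net 7 entirely inside region 8c, since the description does not say which region contains it. Only the region pairs for nets 9, 11 and 13 follow the text.

```python
from typing import Dict, List, Set


def crossover_nets(netlist: Dict[str, List[str]],
                   region_of: Dict[str, str]) -> Dict[str, Set[str]]:
    """Return each net whose cells span more than one region, with those regions."""
    spans = {net: {region_of[cell] for cell in cells}
             for net, cells in netlist.items()}
    return {net: regions for net, regions in spans.items() if len(regions) > 1}


# Hypothetical cells; each region is assumed to belong to one processor.
region_of = {
    "c1": "8a", "c2": "8a",
    "c4": "8b", "c5": "8b",
    "c3": "8c", "c7": "8c", "c8": "8c",
    "c6": "8d",
}
netlist = {
    "net7": ["c7", "c8"],   # assumed to lie wholly within region 8c
    "net9": ["c1", "c3"],   # spans regions 8a and 8c
    "net11": ["c2", "c4"],  # spans regions 8a and 8b
    "net13": ["c5", "c6"],  # spans regions 8b and 8d
}

for net, regions in sorted(crossover_nets(netlist, region_of).items()):
    print(net, sorted(regions))
```

Every net returned by `crossover_nets` is one whose length a single processor cannot evaluate reliably while another processor is free to move cells on the same net.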
Second, cell movements from one region (or processor) to another create communications overhead which may negate the advantages of the multiple-processor cell placement technique. Each time a cell is moved from one region to another, the processor moving the cell out of its assigned region must communicate with the processor receiving the cell into its assigned region. The communication requirement complicates the implementation of cell placement algorithms and slows down both of the communicating processors. As the number of processors, the number of cells, or the number of required cell moves increases, the communication overhead increases. In particular, the performance of the parallel processing technique is especially poor if the spring density levelization method is used as the cell placement algorithm, because that algorithm tends to make global cell moves.
Third, to minimize crossover nets and communications overheads, the prior art parallelization techniques typically require a "good" preplacement of the cells on the chip. That is, in order to operate effectively, the prior art methods require the nets to be within a single region and the cells of the nets to be "close" to each other. The best way to achieve this is to increase the region size and decrease the number of processors running in parallel. However, the increase in the region size and the decrease in the number of parallel processors defeat the purpose of parallelizing the cell placement algorithm. Moreover, even with such preplacement of cells, there are generally still many crossover nets.
In order to avoid the problems associated with crossover nets, regions have to be made larger. Use of large regions has the disadvantage that it limits the number of processors that can be used. In fact, if the entire integrated circuit chip is defined as one region, and only one processor is assigned to place the cells of the chip, then there would be no crossover net problems or communications overhead; but there would also be no parallel processing, and the cell placement would become a sequential process.
Finally, the prior art technique of assigning regions of the IC to each of the multiple processors leads to the problem of unbalanced work load. Because the regions may contain varying numbers of nets, cells, or cells requiring further movements, it is difficult to assign regions to the processors so as to give an equal amount of work to each of the processors. Consequently, some processors finish the placement of the cells of their assigned regions more quickly than other processors, reducing the effectiveness of parallelization of the placement algorithm.
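The effect of an unbalanced work load can be quantified with a rough sketch. It assumes, purely for illustration, that a processor's work is proportional to the number of cells in its region and that the parallel run ends only when the slowest processor finishes. The function name `parallel_efficiency` and the cell counts are hypothetical.

```python
from typing import Dict


def parallel_efficiency(cells_per_region: Dict[str, int]) -> float:
    """Average load divided by the largest load; 1.0 means perfectly balanced.

    Assumes work is proportional to cell count and that the run finishes
    only when the most heavily loaded processor finishes.
    """
    loads = list(cells_per_region.values())
    return (sum(loads) / len(loads)) / max(loads)


# Hypothetical cell counts for four regions, one processor per region.
unbalanced = {"8a": 400, "8b": 100, "8c": 100, "8d": 200}
balanced = {"8a": 200, "8b": 200, "8c": 200, "8d": 200}

print(parallel_efficiency(unbalanced))
print(parallel_efficiency(balanced))
```

Under these assumptions the unbalanced split keeps the four processors busy only half of the time on average, because three of them sit idle while the processor for region 8a finishes.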
In summary, because of the ever-increasing number of cells on an integrated circuit chip (currently millions of cells on a chip), and the resulting increase in the number of possible placements of the cells on the chip, a computer is used to find an optimal layout of the cells on the chip. Even with the aid of computers, existing methods can take several days to place a large number of cells, and these methods may need to be repeated with different parameters or different initial arrangements. To decrease the time required to place the cells on an integrated circuit chip, multiple processors have been used to perform the placement of the cells. However, the use of multiple processors has led to crossover net conflicts, inter-processor communication problems, cell preplacement requirements, and uneven distribution of work, negating the advantages of using the multiple processors.