1. Field of the Invention
The present invention generally relates to the fabrication and design of semiconductor chips and integrated circuits, more specifically to a method of designing the physical layout (placement) of logic cells in an integrated circuit and the wiring (routing) of those cells, and particularly to the use of placement algorithms in designing circuit layouts.
2. Description of the Related Art
Integrated circuits are used for a wide variety of electronic applications, from simple devices such as wristwatches, to the most complex computer systems. A microelectronic integrated circuit (IC) chip can generally be thought of as a collection of logic cells with electrical interconnections between the cells, formed on a semiconductor substrate (e.g., silicon). An IC may include a very large number of cells and require complicated connections between the cells. A cell is a group of one or more circuit elements such as transistors, capacitors, resistors, inductors, and other basic circuit elements grouped to perform a logic function. Cell types include, for example, core cells, scan cells and input/output (I/O) cells. Each of the cells of an IC may have one or more pins, each of which in turn may be connected to one or more other pins of the IC by wires. The wires connecting the pins of the IC are also formed on the surface of the chip. For more complex designs, there are typically at least four distinct layers of conducting media available for routing, such as a polysilicon layer and three metal layers (metal-1, metal-2, and metal-3). The polysilicon layer, metal-1, metal-2, and metal-3 are all used for vertical and/or horizontal routing.
An IC chip is fabricated by first conceiving the logical circuit description, and then converting that logical description into a physical description, or geometric layout. This process is usually carried out using a “netlist,” which is a record of all of the nets, or interconnections, between the cell pins. A layout typically consists of a set of planar geometric shapes in several layers. The layout is then checked to ensure that it meets all of the design requirements, particularly timing requirements. The result is a set of design files known as an intermediate form that describes the layout. The design files are then converted into pattern generator files that are used to produce patterns called masks by an optical or electron beam pattern generator. During fabrication, these masks are used to pattern a silicon wafer using a sequence of photolithographic steps. The component formation requires very exacting details about geometric patterns and separation between them. The process of converting the specifications of an electrical circuit into a layout is called the physical design.
The present invention is directed to an improved method for designing the physical layout (placement) and wiring (routing) of cells. Cell placement in semiconductor fabrication involves a determination of where particular cells should optimally (or near-optimally) be located on the surface of an integrated circuit device. Due to the large number of components and the details required by the fabrication process for very large scale integrated (VLSI) devices, physical design is not practical without the aid of computers. As a result, most phases of physical design extensively use computer-aided design (CAD) tools, and many phases have already been partially or fully automated. Automation of the physical design process has increased the level of integration, reduced turnaround time and enhanced chip performance. Several different programming languages have been created for electronic design automation (EDA), including Verilog, VHDL and TDML.
Placement algorithms are typically based on a simulated annealing, top-down cut-based partitioning, or analytical paradigm (or some combination thereof). Recent years have seen the emergence of several new academic placement tools, especially in the top-down partitioning and analytical domains. The advent of multilevel partitioning as a fast and extremely effective algorithm for min-cut partitioning has helped spawn a new generation of top-down cut-based placers. A placer in this class partitions the cells into either two (bisection) or four (quadrisection) regions of the chip, then recursively partitions each region until a global coarse placement is achieved.
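The recursive structure of a top-down placer can be illustrated with a minimal Python sketch. It is not a min-cut placer: as a stated simplification, each bin is split at the median of the cells' current coordinates, alternating between vertical and horizontal cuts, whereas a real cut-based placer would choose the two halves by min-cut partitioning of the netlist. The function name `recursive_bisection`, its parameters, and the cell data are hypothetical and chosen only for illustration.

```python
def recursive_bisection(cells, region, max_cells=1, vertical=True):
    """Recursively split `region` until each bin holds at most `max_cells` cells.

    cells:  dict mapping a cell name to its current (x, y) location (illustrative input).
    region: (x_min, y_min, x_max, y_max) bounding box of the current bin.
    Returns a list of (region, [cell names]) leaf bins.

    Assumption: the split is a median split on current coordinates, standing in
    for the min-cut partitioning step of a real cut-based placer.
    """
    if max_cells < 1:
        raise ValueError("max_cells must be at least 1")
    axis = 0 if vertical else 1
    # Sort by the coordinate along the cut axis; the name breaks ties deterministically.
    names = sorted(cells, key=lambda n: (cells[n][axis], n))
    if len(names) <= max_cells:
        return [(region, names)]
    half = len(names) // 2
    x0, y0, x1, y1 = region
    if vertical:
        mid = (x0 + x1) / 2.0
        first_region, second_region = (x0, y0, mid, y1), (mid, y0, x1, y1)
    else:
        mid = (y0 + y1) / 2.0
        first_region, second_region = (x0, y0, x1, mid), (x0, mid, x1, y1)
    first = {n: cells[n] for n in names[:half]}
    second = {n: cells[n] for n in names[half:]}
    # Alternate the cut direction at each level of the recursion.
    return (recursive_bisection(first, first_region, max_cells, not vertical)
            + recursive_bisection(second, second_region, max_cells, not vertical))


# Seven illustrative cells on an 8 x 8 placement region.
example_cells = {
    "a": (1.0, 1.0), "b": (2.5, 6.0), "c": (3.0, 3.5), "d": (4.5, 7.0),
    "e": (5.0, 2.0), "f": (6.5, 5.0), "g": (7.0, 1.5),
}
leaf_bins = recursive_bisection(example_cells, (0.0, 0.0, 8.0, 8.0))
```

With `max_cells=1`, the recursion stops once every bin contains a single cell, which corresponds to the coarse global placement described above.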
FIG. 1 illustrates a typical placement process according to the prior art. First, a plurality of the logic cells 2 are placed using the entire available region of the IC 4 as shown in the first layout of FIG. 1. After initial placement, the chip is partitioned, in this case, via quadrisection, to create four new regions. At the beginning of the partitioning phase some cells may overlap the partition boundaries as seen in the second layout of FIG. 1. The cell locations are then readjusted to assign each cell to a given region as shown in the final layout of FIG. 1. The process then repeats iteratively for each region, until the number of cells in a given region (bin) reaches some preassigned value, e.g., one. While FIG. 1 illustrates the placement of only seven cells, the number of cells in a typical IC can be in the hundreds of thousands, and there may be dozens of iterations of placement and partitioning.
Analytical placers may allow cells to temporarily overlap in a design. Legalization is achieved by removing overlaps, either via partitioning or by introducing additional forces and/or constraints to generate a new optimization problem. The classic analytical placers, PROUD and GORDIAN, both iteratively use bipartitioning techniques to remove overlaps. Eisenmann's force-based placer uses additional forces besides the well-known wire length dependent forces to reduce cell overlaps and to consider the placement area.
Analytical placers optimally solve a relaxed placement formulation, such as minimizing total quadratic wire length. Quadratic placers thus attempt to minimize the sum of squared wire-lengths of a design according to the formula:
$$\Phi(x,y)=\sum\left[(x_i-x_j)^2+(y_i-y_j)^2\right]$$
in both the horizontal and vertical directions. Since x and y are independent of each other, they can be solved for separately. It can be shown that this optimization is equivalent to minimizing Φ(x) according to the formula:
$$\Phi(x)=\tfrac{1}{2}x^{T}Ax-b^{T}x+c$$
where A is a matrix, x and b are vectors, and c is a scalar constant. The y component is solved analogously. Setting the derivative of this function to zero obtains the minimum value:
$$\frac{d\Phi(x)}{dx}=0.$$
Using the equivalent function, this last equation simplifies to the linear system
$$Ax=b.$$
The solution to this linear system determines the initial locations of objects in the given placement region. This linear system can be solved using various numerical optimization techniques. Two popular techniques are known as conjugate gradient (CG) and successive over-relaxation (SOR). The PROUD placer uses the SOR technique, while the GORDIAN placer employs the CG algorithm. In general, CG is known to be more computationally efficient than SOR with a better convergence rate, but CG takes more central processing unit (CPU) time per iteration.
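As an illustration of the linear system Ax=b, the following Python sketch assembles A and b for the x-coordinates of a small design and solves the system with a textbook conjugate gradient iteration. The sketch rests on several assumptions that are not taken from any particular placer: every connection is a two-pin connection of unit weight, movable cells are numbered from zero, fixed pins (such as I/O pads) are identified by string names, and the helper names `build_system` and `conjugate_gradient` are hypothetical. The common factor of 2 that arises from differentiating the squared terms is cancelled from both sides of the system.

```python
def build_system(num_movable, edges, fixed):
    """Assemble A and b so that A x = b is the zero-derivative condition.

    num_movable: number of movable cells, indexed 0 .. num_movable - 1.
    edges:       list of (u, v) two-pin connections of unit weight (assumption);
                 each endpoint is a movable index or a key of `fixed`.
    fixed:       dict mapping a fixed pin name to its x-coordinate.
    """
    A = [[0.0] * num_movable for _ in range(num_movable)]
    b = [0.0] * num_movable
    for u, v in edges:
        for p, q in ((u, v), (v, u)):
            if p in fixed:
                continue              # fixed pins contribute no unknown
            A[p][p] += 1.0            # diagonal: number of connections of cell p
            if q in fixed:
                b[p] += fixed[q]      # fixed neighbor moves to the right-hand side
            else:
                A[p][q] -= 1.0        # movable neighbor couples the two unknowns
    return A, b


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A, starting from x = 0."""
    n = len(b)
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    x = [0.0] * n
    r = list(b)                       # residual b - A x for the initial x = 0
    p = list(r)                       # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Ap = [dot(row, p) for row in A]
        alpha = rs / dot(p, Ap)       # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Three movable cells chained between two fixed pads at x = 0 and x = 4.
A, b = build_system(3, [("L", 0), (0, 1), (1, 2), (2, "R")], {"L": 0.0, "R": 4.0})
x = conjugate_gradient(A, b)
```

For this chain the minimizer spaces the three cells evenly between the pads, at approximately x = 1, 2 and 3. The y-coordinates would be obtained by the same procedure applied to the y-coordinates of the fixed pins.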
As device technology enters the new deep sub-micron (DSM) era, the role of placement has become more important, and more difficult. The complexity of IC designs in the DSM realm has been growing significantly mainly due to reduced device sizes. It is estimated that the number of transistors per chip will be over 1.6 billion by the year 2016. The current maximum number of objects readily handled by existing placement tools is in the range of millions. While these existing placement tools could conceivably be used to find acceptable solutions with more than 10 million objects, it would likely take an unbearably long time to arrive at those solutions. Thus, current placement tools lack the scalability necessary to handle the ever-increasing number of objects in IC designs. Unfortunately, performance (i.e., quality assurance) and scalability contradict each other. Obtaining higher quality placement solutions requires more CPU time.
One approach to simplifying placement is to group objects into clusters, effectively reducing the overall number of objects which can then be placed with less computation time. In one form of clustering referred to as edge coarsening (EC), objects are visited in a random order and, for each object, all unmatched adjacent objects are considered and the one that is connected with the largest “weight” is matched for clustering. With EC, a hyperedge of k pins has a weight of 1/(k−1). In a modified EC approach known as first choice (FC), the cluster size is limited by discontinuing coarsening when the new coarsened number of objects reaches a certain threshold. FC also allows an object to be clustered multiple times during a single iteration while EC forces each object to be clustered with another unmatched object during the current iteration. This difference results in more balanced clustering with EC. Another form of clustering transforms a given hypergraph into a graph (wherein every net has only 2 pins) by decomposing each k-pin hyperedge into a “clique” with an edge weight of 1/(k−1). The clustering algorithm then ranks edges according to a connectivity-based metric using a heap. The algorithm proceeds by clustering the highest ranking edge if its size does not exceed a certain size limit, and then updating the circuit netlist and the heap. The area of a cluster can be included in the objective function for cluster size balancing. Fine grain clustering (i.e., clusters of small sizes) can be used to improve placement runtime.
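A single pass of edge coarsening can be sketched in Python as follows. The sketch assumes that the connection weight between two objects is the sum of 1/(k−1) over all hyperedges that contain both of them, which is one common reading of the "largest weight" criterion and is not confirmed by the description above. The function name `edge_coarsening`, the `seed` parameter that fixes the random visiting order, and the example netlist are hypothetical.

```python
import random


def edge_coarsening(num_objects, hyperedges, seed=0):
    """Perform one edge coarsening pass and return the resulting clusters.

    num_objects: objects are numbered 0 .. num_objects - 1.
    hyperedges:  list of nets, each a list of distinct object indices.
    seed:        fixes the random visiting order (illustrative parameter).
    Assumption: the weight between two objects is the sum of 1/(k-1) over the
    k-pin hyperedges they share.
    """
    weight = {u: {} for u in range(num_objects)}
    for net in hyperedges:
        k = len(net)
        if k < 2:
            continue                  # a one-pin net connects nothing
        w = 1.0 / (k - 1)
        for u in net:
            for v in net:
                if u != v:
                    weight[u][v] = weight[u].get(v, 0.0) + w

    order = list(range(num_objects))
    random.Random(seed).shuffle(order)    # objects are visited in random order

    matched = set()
    clusters = []
    for u in order:
        if u in matched:
            continue
        # Consider only adjacent objects that are still unmatched in this pass.
        candidates = [(w, -v) for v, w in weight[u].items() if v not in matched]
        matched.add(u)
        if candidates:
            _, neg_v = max(candidates)    # largest weight; ties go to the lowest index
            matched.add(-neg_v)
            clusters.append((u, -neg_v))
        else:
            clusters.append((u,))         # no unmatched neighbor: stays a singleton
    return clusters


# Two 2-pin nets plus one 4-pin net: weight(0,1) = weight(2,3) = 1 + 1/3,
# while every other pair has weight 1/3.
example_nets = [[0, 1], [2, 3], [0, 1, 2, 3]]
example_clusters = edge_coarsening(4, example_nets)
```

Because each object is matched at most once per pass, an object visited late may find its best neighbor already taken.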
There are, however, several disadvantages to clustering. Hypergraph-to-graph transformation (decomposing a hyperedge into a clique) causes a discrepancy in the edge weights once any two objects that belong to the same hyperedge are clustered, and leads to an unreasonably large heap in heap-based implementations. Pass-based clustering methods such as EC that do not allow object revisiting lead to suboptimal choices since they are likely to forbid an object from clustering to its best neighbor. Finally, non-heap based implementations such as FC lead to suboptimal clustering choices since a clustered pair of objects might not be the best overall grouping. It would, therefore, be desirable to devise an improved method of clustering VLSI circuits to provide more scalable placement algorithms. It would be further advantageous if the new clustering technique could achieve better runtime characteristics while minimizing or reducing any degradation in the quality of the solutions.