1. Field of the Invention
The present invention relates in general to computer software for automatically generating an integrated circuit layout.
2. Description of Related Art
An IC designer usually begins the IC design process by producing a register transfer language (RTL) "netlist" describing the circuit only in terms of the logic it carries out. For example, a high level netlist may describe a cell connected to three nodes A, B and C using the equation C=A*B. This equation indicates only that the cell generates an output signal at node C that is the logical AND of signals appearing at nodes A and B. To test the logic of the circuit described by the netlist, the designer supplies the netlist and a "testbench" file as inputs to a circuit simulator. The circuit simulator then simulates the behavior of the circuit described by the netlist in response to a set of input signals described by the testbench file and produces output data describing the time-dependent behavior of signals at various nodes of the circuit. Since at this level the netlist models only the circuit logic, the simulator is only concerned with simulating circuit logic and does not attempt to simulate circuit timing.
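By way of illustration only, the logic-only simulation described above can be sketched as follows. The function names and the stimulus values are hypothetical and are not taken from any particular simulator; the sketch assumes a zero-delay model in which the output changes at the same instant as the inputs.

```python
# Hypothetical zero-delay logic simulation of the cell C = A*B (logical AND).

def and_cell(a: int, b: int) -> int:
    """Boolean model of the cell: output C is the AND of inputs A and B."""
    return a & b

# Stand-in for a testbench file: (time, A, B) stimulus tuples (illustrative values).
testbench = [(0, 0, 0), (10, 0, 1), (20, 1, 0), (30, 1, 1)]

def simulate(stimulus):
    """Return (time, C) samples; no circuit timing is modeled at this level."""
    return [(t, and_cell(a, b)) for t, a, b in stimulus]

trace = simulate(testbench)
for t, c in trace:
    print(f"t={t:>2}  C={c}")
```

The trace records only logic values at node C, which mirrors the point that a simulator working at this level verifies logic but not timing.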
Having used the simulator to verify the logic of the circuit, the designer typically uses a synthesis tool to create a "gate level" netlist that models the circuit as a set of interconnected circuit components (cells), wherein each cell is described by an entry in a cell library. IC components described as library cells may range from individual transistors and small components formed by several transistors, such as logic gates, up to very large components such as computer processors and memories. The cell library describes not only the logic performed by the cells, but also the time-dependent behavior of the cells. The boolean models of cell behavior are replaced with mathematical models that more accurately reflect the time-dependent behavior of the cells. For example, instead of modeling an AND gate by a simple boolean function C=A*B, a gate level netlist will model the AND gate with a mathematical expression having time as a variable and which describes the gate's input and output signals as analog voltages that change in magnitude over time in response to changes in input signal magnitudes. This more detailed netlist model of the circuit enables the circuit simulator to more accurately verify not only the circuit's logic but also the time-dependent behavior of the circuit. Thus by supplying a gate level netlist as input to a simulator, the designer can use the simulator to determine not only whether the AND gate carries out the required AND function, but also whether it does so quickly enough to meet various timing constraints for the circuit. However, since the design at this point does not accurately model signal routing paths between the cells, the simulator output does not accurately take into account signal path delays between the cells.
After using the simulator to verify the time-dependent behavior of the circuit described by the gate level netlist, the circuit designer employs an automated placement and routing (PandR) tool to convert the gate level netlist into an IC layout describing how and where each cell is to be formed in the IC substrate and describing the signal routing paths within the IC that are to interconnect the cells. A typical placement and routing tool uses an algorithm which iteratively moves cells about on the substrate looking for a placement solution in which all cells fit within the substrate area allocated for the placement, that allows room for the routing paths needed to properly interconnect the cells and that satisfies various timing constraints on the circuit.
Once a PandR tool has created an IC layout satisfying all constraints, the designer may use a conventional netlist compiler to convert the layout back into another "layout level" netlist that accurately models the time-dependent behavior not only of the cells forming the IC but also of the routing structures that interconnect the cells. The designer may then again use a circuit simulator and other tools to verify the behavior of the circuit before sending the completed IC layout to an IC manufacturer.
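The difference between a gate level and a layout level timing estimate can be illustrated with a minimal sketch. All cell names, delay values and the timing limit below are hypothetical, and a simple sum of fixed delays stands in for the far more detailed models a cell library and netlist compiler would supply.

```python
# Hypothetical delays in nanoseconds; none of these numbers come from the text.
cell_delay = {"U1": 0.20, "U2": 0.35, "U3": 0.15}
wire_delay = {("U1", "U2"): 0.10, ("U2", "U3"): 0.25}
path = ["U1", "U2", "U3"]          # a signal path through three cells
max_delay = 0.9                    # assumed timing constraint on the path

def gate_level_delay(path):
    """Gate level estimate: cell delays only, routing is not yet known."""
    return sum(cell_delay[c] for c in path)

def layout_level_delay(path):
    """Layout level estimate: cell delays plus delays of the routed wires."""
    wires = sum(wire_delay[(a, b)] for a, b in zip(path, path[1:]))
    return gate_level_delay(path) + wires

print(gate_level_delay(path) <= max_delay)    # constraint met before routing
print(layout_level_delay(path) <= max_delay)  # may fail once wires are counted
```

With these assumed numbers the path passes at the gate level but fails once routing delays are included, which is why verification is repeated after layout.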
Placement and Routing Tools
As illustrated in FIG. 1, a placement and routing tool 10 converts a gate level netlist design of an integrated circuit into an IC layout satisfying various timing and spatial constraints supplied as input to the tool. A cell library 12 tells PandR tool 10 how to lay out each cell referenced by the netlist, and PandR tool 10 determines an appropriate position within an IC substrate for each cell. PandR tool 10 also designs the routing structures that interconnect the cells. A netlist compiler 13 may then convert the IC layout back into a layout level netlist for use by simulation and verification tools 15.
One way PandR tool 10 could determine an appropriate cell layout would be to randomly choose cell placements until it finds a placement that permits the cells to be appropriately interconnected in a manner that satisfies the various timing and other constraints. However, for large ICs, it can take too long for a PandR tool to find a suitable placement by randomly generating and testing various cell placements to see if they can be appropriately routed. Various algorithms have therefore been developed that reduce the amount of time a PandR tool needs to find an acceptable cell placement.
FIG. 2 is a flow chart illustrating the process carried out by a typical PandR tool when generating an IC layout. The PandR tool follows a widely used placement and routing procedure that employs the well-known "min-cut" algorithm (steps 40-43) for generating a cell placement in an IC substrate. The basic approach of the min-cut algorithm is to progressively divide the substrate area into smaller and smaller partitions and to allocate cells to each partition after each division in an attempt to minimize the number of connections between cells that must pass between partitions. This system helps to minimize the lengths of signal paths between cells by attempting to position highly interconnected cells near one another. Keeping signal paths short improves the chance that the PandR tool will be able to establish suitable routing paths between the cells because the routing paths require less space. Also, since short paths have low signal path delays, keeping signal paths short improves the chances that the IC layout will satisfy various timing constraints.
FIG. 3 is a pictorial illustration of the min-cut process. Although ICs typically have thousands or millions of cells, for simplicity the example of FIG. 3 assumes the IC design includes only 26 cells A-Z that are to be placed within a substrate area 14. The first step of the process is to divide the substrate into two partitions 16 and 18 and randomly assign cells A-Z to the two partitions, thereby creating an initial "seed partitioning" 20. The placement algorithm then tries to optimize the manner in which cells are allocated to the two partitions 16 and 18 by moving cells from partition to partition, trying to find a placement that minimizes the number of cell-to-cell connections that cross between the two partitions. For large ICs it would take too long to try all possible placements, so in many systems each cell is moved only once between partitions.
After attempting to optimize the placement of cells between the two initial partitions 16 and 18, the algorithm divides partition 16 into two partitions 21 and 22 and divides partition 18 into two partitions 23 and 24. It then tries to minimize the number of connections that cross partition lines between partitions 21 and 22 by moving cells between partitions 21 and 22. The system will also try to minimize the number of connections crossing partition lines between partitions 23 and 24 by moving cells between them. Since partitions 21 and 22 divide partition 16, the system is free to move any cell from partition 22 to partition 21. However, it will not try moving any cell from partition 22 to partition 24 since partitions 22 and 24 are not derived from the same parent partition.
After optimizing the cell placement within partitions 21-24, the system divides each partition 21-24 in half to produce a set of eight partitions 31-38 and repeats the optimization process. Note that the system may move cell A from partition 31 to partition 32 because partitions 31 and 32 are derived from the same parent partition 21. However, the system is not free to move cells from partition 31 to partition 33 because the two partitions have different parent partitions. The iterative process of dividing and optimization continues until the number of cells per partition falls below a predetermined limit.
Steps 40-43 of FIG. 2 depict the min-cut process illustrated in FIG. 3. The PandR tool establishes the seed partition at step 40, optimizes the partition at step 41, and then (step 42) determines whether the number of cells per partition has fallen below the predetermined lower limit. If not, the system partitions the substrate again (step 43) and repeats the optimization step 41. The tool iteratively repeats steps 41-43 until partitions reach their lower size limit at step 42.
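Steps 40-43 can be sketched in a few lines. The sketch below is an illustration under stated assumptions rather than the algorithm of any actual PandR tool: nets are modeled as two-pin connections, cells are exchanged in pairs so that partitions stay the same size, and all function names are hypothetical.

```python
import random

def cut_size(left, right, nets):
    """Count two-pin nets that have one end in each partition."""
    l, r = set(left), set(right)
    return sum(1 for a, b in nets if (a in l and b in r) or (a in r and b in l))

def optimize(left, right, nets):
    """Step 41 (sketch): swap cell pairs while the cut shrinks.

    Swapping keeps the partitions the same size (an assumption of this sketch),
    and each cell is moved at most once.
    """
    left, right, moved = list(left), list(right), set()
    while True:
        best_cut, best_pair = cut_size(left, right, nets), None
        for i, a in enumerate(left):
            for j, b in enumerate(right):
                if a in moved or b in moved:
                    continue
                left[i], right[j] = b, a          # try the swap
                trial = cut_size(left, right, nets)
                left[i], right[j] = a, b          # undo it
                if trial < best_cut:
                    best_cut, best_pair = trial, (i, j)
        if best_pair is None:
            return left, right
        i, j = best_pair
        moved.update((left[i], right[j]))
        left[i], right[j] = right[j], left[i]

def min_cut_place(cells, nets, limit, rng):
    """Steps 40-43 (sketch): bisect until a partition holds at most `limit` cells."""
    if len(cells) <= max(limit, 1):               # step 42: size limit reached
        return [sorted(cells)]
    seed = list(cells)
    rng.shuffle(seed)                             # step 40: random seed partition
    half = len(seed) // 2
    left, right = optimize(seed[:half], seed[half:], nets)
    # Step 43: divide each partition again; cells never cross to a non-sibling.
    return min_cut_place(left, nets, limit, rng) + min_cut_place(right, nets, limit, rng)

# Two fully interconnected groups of four cells joined by a single net.
group1, group2 = list("ABCD"), list("EFGH")
nets = [(a, b) for g in (group1, group2) for i, a in enumerate(g) for b in g[i + 1:]]
nets.append(("D", "E"))
result = min_cut_place(group1 + group2, nets, limit=4, rng=random.Random(0))
print(result)
```

In this small example the optimization pass should place each highly interconnected group in its own partition, leaving only the single linking net crossing the partition line. Each recursive call considers only the cells of one parent partition, which reflects the restriction that cells move only between partitions derived from the same parent.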
After using the min-cut algorithm to place the cells, the PandR tool tries to lay out signal paths for interconnecting the cells (step 44). If the PandR tool is able to successfully lay out all necessary signal paths (step 45) based on the layout developed at steps 40-43, then the layout is analyzed (step 46) to determine whether it meets all timing and other constraints. If all constraints are satisfied (step 47) the placement and routing process ends. However, if a successful routing plan could not be developed (step 45), or if the IC layout does not satisfy all timing and other constraints (step 47), then the process starts over again at step 40 by choosing another seed partition. Since the IC layout to be routed at step 44 is a direct result of the seed partition randomly selected at step 40, different seed partitions selected at step 40 are likely to result in different IC placement and routing plans. Thus the PandR algorithm searches for an acceptable IC layout by randomly choosing a succession of seed partitions and testing whether each seed partition results in a placement that can be successfully routed and which meets various circuit timing and other criteria. While the min-cut algorithm randomly chooses seed partitions, the iterative partitioning and optimization process increases the likelihood that the randomly chosen seed partition will result in an acceptable layout. The min-cut approach will typically find a suitable layout more quickly than a system that randomly chooses placement plans to be routed. However, it still can be time-consuming, particularly when the IC includes a large number of cells.
Clustering
In general the more cells an IC includes, the longer it takes a placement algorithm to produce a placement plan based on a seed partition. Hence if a designer can reduce the number of cells the algorithm must place, he or she can reduce the time the placement algorithm needs to generate each placement alternative.
FIG. 4 illustrates a PandR process employing an improved min-cut placement algorithm described by the paper entitled "Multilevel Circuit Partitioning" by Alpert et al., published in 1997 by the Design Automation Conference. The paper describes an improved min-cut algorithm requiring less time to produce a placement plan from a seed partition. The algorithm first organizes the IC's cells into a set of "clusters" (step 50), wherein each cluster includes one or more cells. Each cluster including more than one cell is then redefined at step 50 as a single cell having a particular area and shape based on the size and shape of its constituent cells. The algorithm is biased toward grouping cells that are highly interconnected with one another into the same cluster, and is also biased toward grouping the smallest cells into clusters.
The number of clusters is selected to be only about 10% smaller than the number of cells, so each pass of the clustering process carried out at step 50 reduces the number of cells by only about 10%. If the number of cells is not less than a predetermined limit (step 51) then the algorithm repeats the clustering process (step 50) to further reduce the number of cells. The algorithm continues to loop through steps 50 and 51 until the number of cells falls below the threshold level.
FIG. 5 graphically illustrates the clustering process. A group of cells A-Z are initially assigned ("matched") to a set of clusters 70. A modified circuit design is then "induced" by redefining clusters 70 as a new, smaller set of cells A′-N′. The matching process is then repeated to produce a new set of clusters 72, and the circuit design is again modified to redefine clusters 72 as a set of cells A″-G″.
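The repeated matching and inducing of steps 50 and 51 can be sketched as shown below. This is a simplified illustration, not the method of the cited paper: it assumes two-pin nets, merges cells only in pairs, ranks pairs purely by connection count (ignoring the bias toward small cells), and uses hypothetical function names.

```python
import math
from collections import Counter

def cluster_once(cells, nets, fraction=0.10):
    """One pass of step 50 (sketch): merge the most heavily connected pairs."""
    weight = Counter(frozenset(n) for n in nets if n[0] != n[1])
    merges = max(1, math.ceil(fraction * len(cells)))   # about a 10% reduction
    mapping, used = {}, set()
    for pair, _ in weight.most_common():
        a, b = sorted(pair)
        if a in used or b in used:
            continue                                    # each cell joins one cluster
        mapping[a] = mapping[b] = a + "+" + b           # "match" the pair
        used.update((a, b))
        if len(mapping) // 2 >= merges:
            break
    # "Induce" the modified design: clusters become cells, internal nets vanish.
    new_cells = sorted({mapping.get(c, c) for c in cells})
    new_nets = [(mapping.get(a, a), mapping.get(b, b)) for a, b in nets]
    new_nets = [(a, b) for a, b in new_nets if a != b]
    return new_cells, new_nets, mapping

def cluster_until(cells, nets, threshold):
    """Loop through steps 50 and 51 until the cell count falls below `threshold`."""
    history = []                                        # kept for later unclustering
    while len(cells) >= threshold:
        new_cells, new_nets, mapping = cluster_once(cells, nets)
        if not mapping:
            break                                       # nothing left to merge
        history.append(mapping)
        cells, nets = new_cells, new_nets
    return cells, nets, history

cells = list("ABCDEF")
nets = [("A", "B")] * 3 + [("C", "D")] * 2 + [("E", "F"), ("B", "C"), ("D", "E")]
clustered, induced_nets, history = cluster_until(cells, nets, threshold=4)
print(clustered, len(history))
```

With only six cells a single merge per pass removes more than 10% of the cells; the 10% figure is meaningful only for designs of realistic size. The `history` list records each level of clustering so that it could be rolled back one level at a time.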
Referring again to FIG. 4, when the number of cells falls below the threshold level at step 51, the algorithm creates a seed partition (step 56) and then optimizes placement of cells between the partitions (step 57) as in a conventional min-cut algorithm. The PandR system can carry out the optimization process (step 57) relatively quickly because the "clustered" IC design has substantially fewer cells than the original (non-clustered) IC design. After optimizing the seed placement at step 57, the system determines (step 58) whether the design is flat (i.e. unclustered). Since at this point the design is still clustered, the algorithm moves to step 59 which rolls back the last iterative clustering performed at step 50, thereby slightly increasing the number of cells in the design. The current partitions are then divided to form new partitions (step 60) and the cell placement within the new partitions is then optimized at step 57. The design is again processed at step 59 to remove the clustering produced by the second-to-last iteration of step 50. The current partitions are then divided once again at step 60 and optimized at step 57. Thus with each pass through steps 57-60, the algorithm not only divides the substrate into smaller partitions as in conventional min-cut algorithms, it also rolls back the clustering carried out at step 50 by one level, thereby increasing the number of cells in the design with each pass.
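The roll-back of step 59 can be illustrated in isolation. The sketch below assumes, purely for illustration, that each clustering level is recorded as a dictionary mapping a constituent cell to the cluster that absorbed it, and that a placement is a dictionary mapping each cell to a partition number; neither data structure is taken from the cited paper.

```python
def uncluster(assignment, mapping):
    """Step 59 (sketch): undo one clustering level.

    `mapping` maps each constituent cell to the cluster that absorbed it;
    every constituent inherits the partition its cluster was placed in.
    """
    members = {}
    for cell, cluster in mapping.items():
        members.setdefault(cluster, []).append(cell)
    expanded = {}
    for cell, partition in assignment.items():
        for member in members.get(cell, [cell]):   # unclustered cells pass through
            expanded[member] = partition
    return expanded

# Two recorded clustering levels, oldest first (hypothetical example data).
history = [{"A": "A+B", "B": "A+B"}, {"C": "C+D", "D": "C+D"}]
assignment = {"A+B": 0, "C+D": 1, "E": 1}          # placement of the clustered design

for mapping in reversed(history):                  # roll back the last level first
    assignment = uncluster(assignment, mapping)
    # Steps 60 and 57 would then divide the partitions and re-optimize.

print(assignment)
```

Once the history is exhausted the design is flat again, which corresponds to the exit condition tested at step 58.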
The placement process ends at step 58 when the design has returned to its original flat, unclustered, state. A routing plan is then generated (step 62), and if the routing plan successfully links all of the cells (step 63), the layout is analyzed to determine whether it satisfies all constraints (step 64). If not, or if the system is unable to successfully route the layout (step 63), then the placement and routing process (steps 56-66) is repeated.
While the clustering and unclustering steps 50 and 59 require processing time, cell clustering substantially reduces the time needed for each pass through the optimization step 57 and the processing time saved at step 57 more than offsets the processing time required to perform steps 50 and 59. Thus the clustering process improves the speed with which the PandR tool is able to generate layouts.
As mentioned above, the system is biased toward including cells that are highly interconnected with one another into the same cluster at step 50. This is beneficial because it anticipates what the min-cut placement process tries to do: keep highly interconnected cells close together. Clustering only the most highly interconnected cells together therefore maximizes the likelihood that cells would end up in the same partition after each pass of optimization step 57 regardless of whether they had been grouped into clusters. Thus while clustering cells increases the speed of the min-cut placement process, it does not significantly affect its outcome.
An IC designer often creates RTL and gate level netlists that are hierarchical in nature, grouping various cells into modules which may themselves be grouped into progressively higher level modules. For example a computer processor module may include many submodules such as registers, instruction decoders, cache memories and the like, which in turn may be formed by lower level modules or individual cells. RTL and gate level netlists for large IC designs can have many hierarchical levels.
However, conventional layout tools ignore the hierarchical nature of netlists. Since they are interested only in placing and routing individual cells, they typically "flatten" the netlist to a single level, so that it describes the design only in terms of a collection of interconnected cells without any reference to a module hierarchy.
Thus when assigning cells to clusters at step 50, the algorithm of FIG. 4 determines which cells are highly interconnected simply by counting the number of connections between the cells as indicated by the netlist. The fact that the two cells may or may not be a part of the same module is irrelevant to the decision. Of course the system will frequently group cells of the same module into the same cluster because cells forming the same module tend to be highly interconnected with one another. However cells belonging to different modules can often be highly interconnected, such as for example cells forming module input/output terminals. Thus the algorithm of FIG. 4 can also often group cells of different modules into the same cluster. This is not problematic in the context of the layout system of FIG. 4 where the notion of modular hierarchy is irrelevant to the layout. However grouping cells of different modules into the same cluster prior to generating an IC layout can be a problem when the layout tool does take into account the hierarchical nature of the design.
Design Partitioning
As ICs become progressively larger, computers carrying out the automated placement and routing phase of the design process require progressively longer amounts of time to lay out ICs. As mentioned above, one way to reduce processing time when laying out an IC is to employ clustering. Another way a designer can reduce the time required to lay out an IC is to divide the circuit design into two or more partitions and to separately lay out each partition. (Note that in this context the word "partition" applies to a portion of the IC design, whereas in the context of the above-described min-cut placement process, the word "partition" applies to a portion of the substrate area in which cells of an IC are placed.)
Since the time required to lay out an IC increases geometrically with the number of cells forming the IC, it can be much faster for a PandR tool to successively lay out M partitions of an IC having N cells each than to lay out the entire IC having M*N cells. Further speed improvements can be had by using separate layout tools to concurrently lay out the partitions.
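The benefit can be made concrete with a small calculation. The cost model below, in which layout time grows as the square of the cell count, is purely an assumption chosen for illustration; the text states only that layout time grows faster than the number of cells.

```python
def layout_time(n_cells, exponent=2.0):
    """Assumed cost model: time grows as n_cells ** exponent (arbitrary units)."""
    return n_cells ** exponent

M, N = 4, 1000                      # M partitions of N cells each (example values)

whole = layout_time(M * N)          # one layout run over all M*N cells
sequential = M * layout_time(N)     # M partition layouts, one after another
concurrent = layout_time(N)         # M partition layouts run on separate tools

print(whole / sequential)           # speedup from partitioning alone
print(whole / concurrent)           # speedup with concurrent layout tools
```

Under this assumed model the successive partition layouts are M times faster than the single flat layout, and concurrent layout is M squared times faster, before accounting for the cost of assembling the partitions.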
However this approach is problematic because the designer may have difficulty accurately estimating an appropriate size, shape and position of the substrate area allocated to each partition and may have difficulty allocating timing constraints for the partitions. When a designer imposes a timing constraint on an IC design, the constraint typically specifies that a signal path formed by a set of cells connected between two circuit nodes A and B may have a signal path delay no greater than some maximum limit. The placement and routing tool tries to lay out the IC so that it satisfies all timing constraints. However when the design is divided prior to placement and routing with node A appearing in one partition and node B appearing in another partition, then the designer must also divide the constraint among the partitions, allocating portions of the maximum allowable signal path delay to portions of the signal path residing in and between the partitions. It can be difficult and time-consuming for the designer to determine how much of that maximum signal path delay to allocate to each partition.
Thus what is needed is a system for automatically partitioning a hierarchical netlist description of a circuit in a way that enables PandR tools to quickly and efficiently produce layouts for the design partitions satisfying circuit timing and other constraints. Moreover, it would be helpful to combine partitioning with clustering to obtain the speed benefits of both techniques. However, referring again to FIG. 4, when assigning cells to clusters at step 50, the prior art clustering algorithm of FIG. 4 determines which cells are highly interconnected simply by counting the number of connections between the cells as indicated by the netlist. The fact that the two cells may or may not be a part of the same module is irrelevant to the decision. The system will tend to group cells of the same module into the same cluster because cells forming the same module tend to be highly interconnected with one another. However, since cells belonging to different modules can also be highly interconnected, cells belonging to different modules can be assigned to the same cluster. This is not problematic in the context of the layout system of FIG. 4 where the notion of modular hierarchy is irrelevant to the layout. However, in a system that partitions designs along modular lines, grouping cells of different modules into the same cluster prior to partitioning the design causes a problem.
Thus what is needed is a system for converting a hierarchical netlist description of the IC into an IC layout that uses both partitioning and a form of clustering to speed up the layout process and which automatically allocates timing constraints among the partitions in an appropriate manner.
An integrated circuit (IC) layout system in accordance with the invention initially modifies a gate level netlist describing an IC as a hierarchy of circuit modules to combine clusters of cells included within selected modules so that they form a smaller number of larger cells. Only modules comprising a number of cells falling within a predetermined first cell count range are subjected to clustering, and the average number of cells included in each cluster is selected so that the total number of cells in the design after clustering falls within a predetermined second cell count range. Thus regardless of the number of cells included in the original netlist, the number of cells included in the modified "clusterized" netlist remains about the same. Since the time required to perform an IC layout depends to a large extent on the number of cells in the IC design, clustering the design in this manner renders the subsequent layout process "scalable": the complexity of the placement and routing process remains substantially the same regardless of the size (number of cells) of the IC design because the clustering process reduces all large IC designs to approximately the same number of cells.
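One way the average cluster size might be chosen is sketched below. The formula, the single target count standing in for the second cell count range, the module names and all numeric values are assumptions made for illustration; the text does not prescribe a particular calculation.

```python
def average_cluster_size(module_sizes, first_range, target_total):
    """Pick an average cluster size so the clustered design has about
    `target_total` cells (hypothetical formula, not from the text)."""
    low, high = first_range
    eligible = sum(n for n in module_sizes.values() if low <= n <= high)
    untouched = sum(module_sizes.values()) - eligible   # modules left unclustered
    if eligible == 0 or target_total <= untouched:
        raise ValueError("target cannot be reached by clustering eligible modules")
    return max(1.0, eligible / (target_total - untouched))

def clustered_total(module_sizes, first_range, size):
    """Cell count after clustering only the modules inside `first_range`."""
    low, high = first_range
    return sum(n / size if low <= n <= high else n for n in module_sizes.values())

# Two designs of different size (illustrative cell counts per module).
small = {"cpu": 400_000, "cache": 500_000, "io": 1_000, "glue": 50}
large = {"cpu": 800_000, "cache": 1_000_000, "io": 1_000, "glue": 50}
first_range, target = (10_000, 2_000_000), 100_000

for design in (small, large):
    size = average_cluster_size(design, first_range, target)
    print(round(size, 2), round(clustered_total(design, first_range, size)))
```

Doubling the size of the eligible modules roughly doubles the average cluster size while the clustered cell count stays at the target, which is the sense in which the subsequent layout effort is independent of the original design size.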
The clustering process respects the hierarchical nature of the design; it does not blur the lines between modules subjected to clustering by incorporating cells of more than one module into the same cluster. Thus after employing clustering to reduce the complexity of the netlist, the system is able to divide the netlist along modular lines to produce two or more netlists, each describing a separate partition of the IC design. The system then independently lays out each partition and thereafter combines them to form a full IC layout.
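The difference between connection-count clustering and clustering that respects module boundaries can be shown with a short sketch. The function name, the `module_of` lookup and the example netlist are hypothetical, and nets are assumed to be two-pin connections.

```python
from collections import Counter

def candidate_pairs(nets, module_of=None):
    """Rank cell pairs by connection count; when `module_of` is given,
    drop any pair whose cells belong to different modules."""
    weight = Counter(frozenset(n) for n in nets if n[0] != n[1])
    pairs = []
    for pair, count in weight.most_common():
        a, b = sorted(pair)
        if module_of is not None and module_of[a] != module_of[b]:
            continue                      # would blur the line between modules
        pairs.append((a, b, count))
    return pairs

# Cells a2 and b1 form the heavily connected interface between modules M1 and M2.
module_of = {"a1": "M1", "a2": "M1", "b1": "M2", "b2": "M2"}
nets = [("a2", "b1")] * 3 + [("a1", "a2")] * 2 + [("b1", "b2")] * 2

flat = candidate_pairs(nets)                  # connection counts only
modular = candidate_pairs(nets, module_of)    # module boundaries respected

print(flat[0])      # best pair when the hierarchy is ignored
print(modular)      # pairs available when the hierarchy is respected
```

When the hierarchy is ignored, the strongest candidate pairs cells from two different modules, so the resulting cluster could not be assigned to either module's partition. Restricting candidates to a single module keeps every cluster wholly inside one module, so the netlist can still be divided along modular lines after clustering.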
The designer specifies which modules are to be included in each partition, and the layout system automatically produces a partition plan including a floor plan allocating an area of semiconductor substrate to each partition and a pin assignment plan indicating points along the boundary of each partition area at which input/output signals cross.
To create the partition plan, the system first generates a trial layout of the IC that the modified netlist describes. Based on the shape and position of overlapping areas various modules occupy in the trial layout, the system estimates the shape and position of a substrate area each such module would require in a layout where module areas did not overlap. The system then creates a floor plan allocating substrate space to each partition based on the estimated space requirement of each module assigned to that partition. It also creates a pin assignment plan, selecting points at which signal paths cross partition boundaries based on the positions of the signal paths in the trial layout. The system also creates a timing budget allocating signal path timing constraints among the partitions based on a timing analysis of signal path delays in the trial layout.
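A timing budget of the kind described can be sketched as follows. Proportional allocation is an assumption adopted here for illustration, since the text says only that the budget is based on a timing analysis of the trial layout; the segment names and delay values are likewise hypothetical.

```python
def timing_budget(max_delay, trial_delays):
    """Split a path's maximum allowed delay among its segments in proportion
    to the delays observed in the trial layout (assumed allocation rule)."""
    total = sum(trial_delays.values())
    return {segment: max_delay * delay / total
            for segment, delay in trial_delays.items()}

# A path from node A in partition 1 to node B in partition 2 (delays in ns).
trial_delays = {"partition 1": 2.0, "top-level route": 1.0, "partition 2": 3.0}
budget = timing_budget(max_delay=9.0, trial_delays=trial_delays)

for segment, allowed in budget.items():
    print(f"{segment}: {allowed:.2f} ns")
```

The allocated delays sum to the original constraint, so a path whose segments each meet their own budget also meets the overall limit when the partition layouts are assembled.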
Thereafter the system divides the netlist to create a separate netlist for each partition and then independently lays out each IC partition so that it satisfies that partition's spatial and timing constraints as indicated by the partition plan and timing budget. The system then assembles the partition layouts into a complete top-level IC layout.
Since cells are clustered in a manner that respects module boundaries, the system can cluster the cells before partitioning the design. Thus by clustering cells, the system not only reduces the time it needs to generate the partition layouts, it also reduces the time it needs to develop a partition plan because it reduces the time it needs to generate the trial layout providing information needed to develop the partition plan.
It is accordingly an object of the invention to provide a system for clusterizing a netlist description of an IC design in a manner that respects modular boundaries.
It is another object of the invention to provide a system for generating an IC layout that makes use of both clustering and design partitioning to reduce processing time.
The claims appended to this specification particularly point out and distinctly claim the subject matter of the invention. However those skilled in the art will best understand both the organization and method of operation of what the applicant(s) consider to be the best mode(s) of practicing the invention, together with further advantages and objects of the invention, by reading the remaining portions of the specification in view of the accompanying drawing(s) wherein like reference characters refer to like elements.