1. Technical Field
The present invention relates generally to a system and method for designing and placing circuitry on semiconductor chips, and more particularly, to a system and method for incorporating a timing-closed placement solution into a physical design process of integrated circuitry.
2. Description of the Related Art
The development of electronic manufacturing technology has created the ability to build thousands of circuits on a single chip. To take advantage of this technology, thousands of circuits must be physically placed and connected on the chip. This can be a very time-consuming process, especially when the actual process of designing, placing and connecting the circuits on the chip can affect the performance and timing requirements of the chip. Therefore, it has become necessary to automate the design process by using a computer to quickly place and wire predesigned circuits into a functional chip.
The basic problem with this automation technique is that it sacrifices the performance of the resulting circuit for the ability to get a connected circuit in a reasonable amount of computing time. When the functional chip being designed is a central processing unit of a computer or other chip in which performance is critical and design complexity high, the performance sacrificed is not acceptable and the automation technique is not useful. This performance sacrifice usually manifests itself in the inability to obtain timing closure in complicated logic. Timing closure is the difference between the time allowed for processing information on the chip as logically designed, and the time required for processing information on the chip as physically designed.
Timing closure is not met when the chip as physically wired and placed is not as fast as required by the logical design.
With advances in VLSI technology, the size of modules in integrated circuits is becoming smaller and the density of modules on a chip is increasing. Consequently, intramodule delays are becoming smaller, and the total delay in the circuit is being dominated by delays in the interconnections between the modules. The communication-bounded nature of total circuit delay, along with more stringent timing requirements due to more aggressive design styles, has made timing-driven layout an important area of study. To meet the needs of a fast-expanding electronics industry, high performance chips must be designed in a short period. Accordingly, a design flow which incorporates timing analysis and verification into the physical design is desirable. This motivates the development of layout tools which optimize layout area and timing simultaneously.
The problem of timing-driven placement has been studied extensively over recent years. Existing timing-driven approaches can be broadly classified into net-based methods and path-based methods. In a net-based algorithm, timing constraints are first translated into physical constraints, such as upper and lower bounds on the lengths of nets. More specifically, net-based algorithms try to satisfy timing constraints by (1) assigning higher weights to nets which are part of critical paths, or (2) by transforming timing requirements into a set of upper bounds on the net delays. In scenario (1), minimizing the delay in a critical net may increase the delay in other nets. This may result in additional critical paths and the delays of the nets in these paths also then have to be minimized. This again may result in an excessive delay in the previous critical net. It is desirable to prevent this oscillating effect. In scenario (2) above, delay constraints on the paths are translated into either length or timing lower and upper bounds (slacks) for each net. The bounds are then used to guide the placement and routing. Timing driven placement optimization will not shorten nets that are below the threshold, but nets near or above the threshold are very strongly weighted for improvement. A major problem of these approaches is the selection of the weights or bounds. Also, the use of individual net bounds may overconstrain the problem.
Path-based methods consider timing requirements explicitly, and try to satisfy timing requirements and physical requirements simultaneously during the placement phase. A major difficulty encountered in path-based methods is the enormous complexity of computation. Path-based approaches overcome these difficulties via an optimization process which models the problem using paths instead of individual nets. The problem may be modeled as a linear programming problem, or the quadratic programming problem may be transformed into a Lagrangian problem to reduce the number of constraints. However, this optimization process becomes very complex and time consuming in deep sub-micron designs.
A legal (or feasible) solution to the timing-driven placement problem should satisfy the following placement constraints: (1) macros should be placed at legitimate locations without overlapping, (2) there should be sufficient space to implement interconnections, (3) timing constraints should be satisfied for all logically possible paths in the circuit, (4) region constraints should be satisfied, i.e., some modules may be placed only in certain regions, for example, (a) for movable I/O pins (input/output terminals): some I/O pins' positions may be fixed, others may be assigned to any of the available I/O pads, (b) locations of some modules may already be fixed.
An input to a timing-driven placement problem is a set of modules and a net list, where each module has a fixed shape and fixed terminal locations. The goal is to find the best position for each module on the chip according to appropriate cost functions. Timing driven placement incorporates timing objective functions into the placement problem. Nets that must satisfy timing requirements are called critical nets. In timing-driven placement, it is desirable to make critical nets timing-efficient and other nets length- and area-efficient.
In a net-based timing-driven layout, timing requirements are usually first translated into physical requirements. Delay slacks correspond to budget wiring delays. Slack is the difference between the designed (logical) delay and the actual delay (after added wiring delay) from the wiring program. A positive slack implies that the current cycle-time is fulfilled by the physical layout (i.e., the net meets the design criteria), while a negative value indicates that the layout violates the timing conditions. In addition, a large positive value indicates that the cycle-time can be further improved. Hence, the goal in timing-driven layout is to maximize the min-slack.
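The slack definition above can be illustrated with a minimal Python sketch. The function names and the delay values are hypothetical and chosen only for illustration; the sketch assumes that each net has a single designed (logical) delay and a single actual delay after wiring.

```python
def net_slacks(designed_delay, actual_delay):
    """Slack per net: designed (logical) delay minus actual delay after wiring."""
    return {net: designed_delay[net] - actual_delay[net] for net in designed_delay}


def min_slack(slacks):
    """The quantity a timing-driven layout tries to maximize."""
    return min(slacks.values())


# Hypothetical delays in arbitrary time units.
designed = {"A": 5.0, "B": 3.0, "C": 4.0}
actual = {"A": 4.0, "B": 3.5, "C": 2.0}

slacks = net_slacks(designed, actual)

# Nets with negative slack violate the timing conditions.
violations = sorted(net for net, s in slacks.items() if s < 0)
worst = min_slack(slacks)
```

In this example net B has a negative slack and therefore violates its timing condition, while net C has the largest positive slack and leaves room for further improvement.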
The delay budgeting problem seeks to allocate delay slacks before the placement and routing steps. Thus, as a result of delay budgeting, the performance-driven placement and routing steps are given net delay bounds. Since the delay slacks equate with wiring delay, it is natural to expect all nets to have positive slacks. Furthermore, the distribution of these slacks determines the difficulty of finding a feasible placement (and/or routing) solution.
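As a rough illustration of delay budgeting, the following Python sketch distributes the slack of a single path over its nets in proportion to per-net weights. This is an assumed, simplified scheme and not the budgeting method of any work cited here; the function name, the weights and the delay values are hypothetical.

```python
def budget_path(required_time, gate_delays, net_weights):
    """Split the slack of one path over its nets in proportion to net_weights.

    required_time: time allowed for the whole path
    gate_delays:   intrinsic delays of the modules on the path
    net_weights:   one positive weight per net on the path
    """
    slack = required_time - sum(gate_delays)
    if slack < 0:
        raise ValueError("module delays alone exceed the required time")
    total_weight = sum(net_weights)
    return [slack * w / total_weight for w in net_weights]


# Hypothetical path: three modules, three nets, required time of 10 units.
budgets = budget_path(10.0, [2.0, 3.0, 1.0], [1.0, 1.0, 2.0])
```

Each resulting budget is a non-negative wiring delay bound for one net, and the budgets sum to the slack of the path, so a placement that respects every net bound also respects the path constraint.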
Excessive local congestion gives rise to future routing difficulty and also increases potential crosstalk noise in high-speed signal lines. Furthermore, it increases power dissipation due to coupling capacitance. In a timing analysis of a prerouting design, the routing of a net is usually assumed to be a minimal rectilinear Steiner tree. Due to congestion, the capacitance (i.e., wirelength) of the actual routing tree is larger than that of a minimal Steiner tree. Thus, timing-critical nets need to be kept away from the congested areas.
Existing timing-driven flows lead to unpredictable and suspicious timing results. Their main flaw is a lack of timing coverage, which requires designers to spend days or even weeks iterating between synthesis and layout to achieve timing closure. Extremely complex deep submicron designs with faster clocks require a new placement algorithm.
There have been many works in timing-driven placement in recent years. Recent results are mainly categorized as: A) Top-down hierarchical partitioning (slack-based), B) quadratic programming (path-based), and C) constructive approaches.
A. Top-Down Hierarchical Partitioning
In top-down hierarchical partitioning, the lengths of all interconnections are estimated under the assumption that all cells assigned to a partitioned region are located at the center of the region. Therefore, after each cut of a min-cut algorithm, a global routing is computed. See J. Garbers, B. Korte, H. J. Promel, E. Schwietzke, and A. Seger, VLSI-Placement Based on Routing and Timing Information, IEEE, 1990. This provides an expected net length for every net. These net lengths are subsequently used to perform a timing analysis. In particular, increasing the weight of some nets should lead to a shorter realization of these nets and thus should increase the minimum slack. In this algorithm, modules are not placed at upper levels of the mincut partitioning; the exact module placement is realized at the bottom of the hierarchy. Thus, there is little guarantee that the expected net length computed at each level of the hierarchy is consistent with the net length obtained by final placement.
FIG. 1 illustrates an exemplary conventional net-weight and mincut based placement approach. One example of this approach, hierarchical mincut-based partitioning, involves dividing a circuit into smaller parts, recursively. The object is to partition the circuit into parts such that the sizes of the partitions are within a prescribed range and the number of connections between components is minimized at each level of hierarchy. This results in minimizing the number of global wires and accordingly, maximizing the number of local wires, thus minimizing the total wirelength. During the partitioning, if module m1 in partition 100 is moved to partition 102, the result is an undesirable solution since the critical net C with its timing budget of 1 unit is on the cutline 104 and thus may span the entire chip region in a worst case scenario. On the other hand, if module m2 is moved, then the timing budget of net D becomes over-weighted in a smaller wiring region. Therefore, there is a need for a more insightful timing budget management strategy.
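The tension between cut size and critical nets can be sketched in a few lines of Python. The netlist below is hypothetical and is not the circuit of FIG. 1; the function names are likewise assumptions made for illustration.

```python
def is_cut(pins, side):
    """A net is cut when its pins lie on both sides of the cutline."""
    return len({side[p] for p in pins}) > 1


def cut_size(nets, side):
    """Number of nets crossing the cutline for a given bipartition."""
    return sum(1 for pins in nets.values() if is_cut(pins, side))


def cut_critical_nets(nets, side, critical):
    """Critical nets that end up on the cutline."""
    return sorted(n for n in critical if is_cut(nets[n], side))


# Hypothetical netlist: net name -> modules it connects.
nets = {"n1": ["a", "b"], "n2": ["b", "c"], "n3": ["c", "d"]}
critical = {"n1"}

before = {"a": 0, "b": 0, "c": 1, "d": 1}
after = dict(before, b=1)  # move module b across the cutline

sizes = (cut_size(nets, before), cut_size(nets, after))
crit_before = cut_critical_nets(nets, before, critical)
crit_after = cut_critical_nets(nets, after, critical)
```

Both partitions have the same cut size, so a pure mincut objective cannot tell them apart, yet only the second one places the critical net n1 on the cutline.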
In M. Marek-Sadowska and S. P. Lin, "Timing-Driven Placement", IEEE Conference, pp. 94-97, 1989, the timing-driven placement problem was formulated as a facility location problem, for example, for m old facilities located on a plane, locations of additional n-m new facilities are sought. The objective is to minimize the sum of weighted (net-weight based) rectilinear distances between them. Solutions to the problem produce placements of cells only at coordinates of the old facilities (for example, cells with fixed locations such as input/output (I/O) pads). In order to decompose cells into two partitions in the plane, fictitious terminals are added at the cutline that partitions the netlist into two equal-sized netlists.
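For the simplest case of a single new facility, the weighted rectilinear objective separates into independent x and y problems, each minimized by a weighted median of the old facilities' coordinates. The Python sketch below shows only this special case under that assumption; it is not the algorithm of the cited paper, and the function names and data are hypothetical.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value
    raise ValueError("empty input")


def place_one(fixed, weights):
    """Location of one new facility minimizing weighted rectilinear distance."""
    xs = [p[0] for p in fixed]
    ys = [p[1] for p in fixed]
    return weighted_median(xs, weights), weighted_median(ys, weights)


def cost(pos, fixed, weights):
    """Sum of weighted rectilinear distances from pos to the fixed facilities."""
    return sum(w * (abs(pos[0] - x) + abs(pos[1] - y))
               for (x, y), w in zip(fixed, weights))


# Hypothetical old facilities (e.g., fixed I/O pads) and net weights.
fixed = [(0, 0), (10, 0), (4, 6)]
weights = [1, 1, 3]

best = place_one(fixed, weights)
best_cost = cost(best, fixed, weights)
```

The chosen x and y each coincide with a coordinate of an old facility, which is consistent with the observation above that solutions fall only at coordinates of the old facilities.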
Usually, bi-partitioning and clustering-based partitioning approaches attempt to cluster critical nets in a local region so that most of the critical nets can reside in close proximity, but some critical paths can easily be divided into different partitions that span a timing-specifically unbounded routing region (i.e., a region where timing is not satisfied). In T. Koide, et al. "A New Performance Driven Placement Method with the Elmore Delay Model for Row Based VLSIs", Hiroshima Univ. koide@ecs.hiroshima-u.ac.jp, during 4-way partitioning, while moving the cell, slack gain is computed, and the cells connecting nets with large slack gains on the cutline may span the timing specifically unbounded routing region. To decrease the delay time of the paths, the cells are moved into clusters within a partition so that nets connecting the cells will span a smaller routing region. However, this method does not guarantee that the final layout of a net does not exceed the timing slack.
B. Quadratic Programming
Lagrangian relaxation offers an alternative to simulated annealing for controlling the tradeoff between the system cycle time and wirelength. A. Srinivasan, K. Chaudhary, and E. S. Kuh, "RITUAL: A Performance-Driven Placement Algorithm", IEEE Trans. on Circuits and Systems II, Vol. 39, No. 11, pp. 825-840, November 1992, presented such a mathematical programming approach, in which the runtime is smaller than that of simulated annealing and the quality of the results is reasonable. However, issues like congestion analysis and routability factors are not considered. Routability constraints are among the most difficult because they are not analytical and are checked only by means of routing. This is a major reason why the routability constraints are not included in the mathematical programming formulations.
Another technique involves an algorithm which uses an iterative approach. See A. Mathus and C. L. Liu, Compression-Realization: A New Approach to Timing-Driven Placement for Regular Architectures, IEEE TCAD, Vol. 16, No. 6, June 1997. In each iteration, there is a compression phase and a relaxation phase. The compression phase attempts to make the placement delay feasible by compressing the long paths that cause some of the primary output signals to arrive too late. However, the compression phase may produce an infeasible placement with some of the slots occupied by two modules. This allows the compression phase more flexibility, and often allows it to achieve the required decrease in delay. If an infeasible placement is produced in the compression phase (path-based), the relaxation phase (net-based), which carries out a timing-driven reconfiguration of the infeasible placement to produce a feasible solution, will be executed. By forming a slack neighborhood graph, the delays in the critical paths are guaranteed not to increase beyond a certain bound. The graph captures the freedom of movement of the modules without "violating the timing constraints." If the compression phase produces an infeasible placement, the original modules occupying the overcrowded slots need to be relocated. In the relaxation phase, relocation is carried out simultaneously for all of the modules in such a way that the delays do not increase by too much. The slack of an edge measures the amount by which the delay of the edge can be increased without violating any timing constraints. The slacks of the edges incident to a module determine the neighborhood within which the module can be moved without violating the timing requirements. In any iterative algorithm for placement, it is essential that the mobility of the modules initially be sufficiently high. This ensures that a bad initial placement does not cause the algorithm to get stuck in a high-cost local minimum.
In order to prevent the mobility from being completely governed by the slacks, a relaxation parameter was introduced that allows the algorithm to increase the values of edge slacks, which will be referred to as relaxed slacks. In order to incorporate a routability measure into the placement process, each edge of the slack neighborhood graph (SNG) is associated with a cost that measures the penalty, in terms of an increase in congestion, that results from the move associated with that edge. A reasonable measure of this penalty is a congestion gradient that measures the difference in congestion in different areas of the current placement. This approach tries to satisfy the timing constraints for most critical paths, but after spreading out the overlapped modules, it is not guaranteed that the final placement satisfies the timing constraints for all critical nets.
In most of these timing criticality-based approaches, some of the non-critical nets can turn into critical nets due to the unbounded treatment of the wirelength of non-critical nets. In recent aggressive designs, most of the nets are critical and thus a priority-based approach may not be effective.
C. Constructive Approaches
A successive augmentation approach has also been proposed which adds one macro at a time to a partial placement until all macros are exhausted. The approach relies on two techniques. The first technique involves adaptive changing of parameters according to evaluations of partial solutions. The second technique is carried out by an adaptive look-ahead procedure for improving global characteristics of the placement. The adaptive algorithm uses adaptation of parameters to handle a wider range of operating controls. A set of adjustable parameters such as a timing budget is used to control placement. This approach is effective when the dynamic adjustment process can be realized in a reasonable amount of computation. However, this approach lacks global optimization.
Another technique involves a constructive approach based on a path-delay timing window. See I. Lin and D. H. C. Du, "Performance-Driven Constructive Placement", Design Automation Conference, pp. 103-106, 1990. This approach considers a path with a sequence of modules along the path. All modules in the path are bounded in a rectangle called a window to satisfy the timing requirement. Even if all modules are inside the region, a zig-zag routing may result. The basic idea is to define an area to guide the placement of the first module in the window such that the total interconnect delay can be minimized. The net constraints are used to reduce the placement constraints instead of directly using complete path constraints. Once a cell location is determined in a window, all associated paths are broken into two sub-paths.
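The zig-zag problem noted above can be seen with a small Python sketch that measures the rectilinear length of a path whose modules all lie inside the same window. The window, the coordinates and the function names are hypothetical; the sketch assumes module-to-module wiring along rectilinear (Manhattan) routes.

```python
def in_window(pos, window):
    """True when pos = (x, y) lies inside window = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = window
    return x0 <= pos[0] <= x1 and y0 <= pos[1] <= y1


def path_length(positions):
    """Total rectilinear length of a path visiting the modules in order."""
    return sum(abs(a[0] - b[0]) + abs(a[1] - b[1])
               for a, b in zip(positions, positions[1:]))


window = (0, 0, 4, 4)

# Two hypothetical placements of the same four-module path.
monotone = [(0, 0), (1, 1), (3, 3), (4, 4)]
zigzag = [(0, 0), (4, 0), (0, 4), (4, 4)]

all_inside = all(in_window(p, window) for p in monotone + zigzag)
lengths = (path_length(monotone), path_length(zigzag))
```

Both placements keep every module inside the window, yet the zig-zag placement is twice as long, which is why a window alone does not bound the path delay.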
Previous works lacked the ability to deal with the timing constraints in terms of paths. For example, a timing driven placement method has been presented based on a path delay relaxation force (PDRF) method. The delay of a timing-critical path having a small timing margin is minimized by placing the cells on the path (called path core cells) at the center of gravity, and this process is performed for other path core cells. However, these approaches are only concerned with the timing-critical paths. The cells on the non-critical paths must be treated carefully since their placement may cause further timing problems in recent high performance designs. To deal with this problem, the net constraint driven placement can be utilized. However, the main problem with net constraints is that timing constraints are path based, hence net bounds are usually over-constraining, resulting in infeasible placements. As a result, methods of handling over-constrained net bounds have been proposed but usually rely on re-budgeting only after a physical design step (placement) is completed.
Accordingly, there is a need for a system and method for a very large scale integration (VLSI) placement that efficiently increases production capacity of integrated circuits and accurately optimizes the integrated circuit design.
The present invention is directed to a system and method for timing-closed placement which also takes wirelength and congestion into consideration. The system and method of timing driven placement according to the present invention incorporates a timing budget management technique which satisfies triangle parity and inequality, a timing-driven quadrisection placement strategy based on flexible timing window configurations to minimize the wirelength and congestion during each mincut quad-partition of top-down hierarchy, and a linear programming formulation incorporating bin capacity, channel capacity and congestion criticality. Advantageously, these features allow good timing-closed placement results to be achieved without excessive computation time, thus accelerating the sign-off-to-silicon cycle for customers and increasing production capacity.
In an aspect of the present invention, a method for placing circuit elements on semiconductor chips is provided comprising the steps of: creating a circuit graph including cutlines, said circuit graph comprising said circuit elements connected by nets for placement on a placement grid; clustering critical nets in the circuit graph; assigning a timing budget for each net using at least one of a plurality of slack distribution algorithms satisfying at least one geometric constraint; partitioning the circuit graph using a mincut algorithm; generating a timing window region on the placement grid for each net which is less than or equal to each net's respective timing budget; and assigning the circuit elements attached to each net to each of their respective timing window regions.
In another aspect of the present invention, a method for determining placement of circuit elements is provided comprising the steps of: describing a circuit image as a graph comprising circuit elements connected by edges; assigning a timing budget for each edge using a geometry-aware slack distribution algorithm which satisfies at least one geometric constraint; generating a timing window region on a placement grid for each edge, said timing window region being equal to or less than the timing budget for the respective edge; and assigning the circuit elements attached to each edge to each of their respective timing window regions.
These and other aspects, features and advantages of the present invention will be described or become apparent from the following detailed description of the preferred embodiments, which is to be read in connection with the accompanying drawings.