The present invention relates generally to integrated circuit design. More specifically, the invention relates to mechanisms for timing optimization.
FIG. 1 illustrates a conventional integrated circuit design flow 100. Initially, the circuit""s behavior is described in a high level language in a design entry procedure 102. Logic synthesis tools then transform the high level description into a listing of logic cells (logic netlist) and interconnection information in a synthesis procedure 104. The logic cells correspond to cells within a standard cell library. In general terms, logic circuit synthesis generates an initial circuit topology that satisfies the basic logic requirements as defmed by the high level design description. The initial design can be presented graphically as a schematic and also in a data file listing the included logic elements and their interconnections. This data file is generally referred to as a netlist.
The timing optimization procedures (106, 110, and 114) are described further below, and a description of such optimization procedures is skipped for now so as to more clearly describe the other operations of the design flow. After synthesis, the cells listed in the layout netlist are obtained from the standard library and arranged within a design layout in a placement procedure 108. The placed cells are then routed together in operation 112. The cells are placed and routed together into a layout design that is equivalent to the original design description, as well as the layout design netlist. Said in another way, the layout netlist (and original high level description) is in effect transformed into a design layout having interconnected cells.
Logic synthesis tools map functional groups within the high level description to cells having the same logic function. The standard cell library typically provides a set of discrete implementations of each logic function. The different implementations of a particular logic function are designed to drive different capacitive loads while maintaining similar rise/fall times for multiples of a standard load, usually one, two, and four. Unfortunately, the different implementations typically have different associated delays.
As shown, timing optimization procedures are typically performed after the synthesis procedure 104, after the placement procedure 108, and after the routing procedure 112. If timing requirements are not met in any of the timing optimization procedures (e.g., 106, 110, or 114), the entire design flow from timing synthesis 104 through timing optimization 114 are repeated. This reiteration of all or part of the design flow may occur numerous times until the timing goals are met. Unfortunately, these reiterations are usually associated with significant design time and costs.
The various optimization techniques operate on the netlist to attain a satisfactory balance between different requirements. Timing assurance especially depends on the process technology and placement of the circuit design. Timing assurance in synthesis has traditionally operated on a discrete set of cell types with a discrete set of drive capabilities. A static timing analysis-based tool is typically used to select among cells of different drive capabilities. Two important parameters that control this selection are the intrinsic load of the cells on the driver cell, which is well known, and also the load of the interconnect wires, which is not known until the final layout of the design and strongly depends on the placement and routing stages.
Wire load is often estimated in the absence of any placement and routing information. These estimates are typically done without any knowledge of the eventual placement of the logic design and, accordingly, deviate significantly from the actual loads. This creates what is called the xe2x80x9ctiming closurexe2x80x9d problem where several iterations are done between the synthesis-based timing optimization and placement until timing constraints are satisfied. To compensate for the changes in the wire loads, the most common techniques include replacing an existing cell with one of higher drive but the same functionality, and duplicating a cell and splitting the original cell""s load over the resulting pair.
The netlist changes as the load conditions change. These changes may require additional cells and the removal of other cells (e.g., if they are found to be redundant). One such change usually triggers more changes. There is only a discrete set of sizes available, and after replacement, signal delays through the cells are affected differently when they are replaced with other cells. Changes in sizing of replacement cells vs. replaced cells also impacts the delays of the cells""preceding driver cell(s) because of changes in the input capacitive loading. A replacement cell may have a shorter delay, but its driver may have a longer delay, offsetting any gains in the accumulated delay for the path. The process of finding the right cell is also computationally expensive, as all drive strength choices are typically tried out for every stage of the signal path. This makes the use of more drive strengths in a cell library impractical.
Various solutions have been proposed for the timing closure problem, which include better wire load prediction, integrated timing optimization and placement, xe2x80x9clogical effortxe2x80x9d and xe2x80x9cgainxe2x80x9d based timing optimization. The integrated placement and timing optimization compromises the quality of both timing and placement. xe2x80x9cLogical effortxe2x80x9d and xe2x80x9cgainxe2x80x9d based optimization over-constrain the placement stage to produce the right wire loads. These last two approaches attempt to avoid the computational cost of using discrete drive strengths by building continuous models of cells, mapping the delay capabilities of the cells versus their size, and using the resulting simplified models to solve the timing optimization problem. After solving the problem using these simplified models, cells based on continuous parameters are mapped to discrete components from the cell library. The wire load values are still required to be known. The quality of the modeling, mapping delay capabilities versus their size, and the number of the discrete choices in the library are critical issues with these approaches. In sum, these two approaches are often computationally complex and inaccurate.
Accordingly, there is a need for an improved design methodology where timing optimization and placement can be preformed independently in the most efficient manner and an initial timing optimization is performed independently of the wire loads.
Accordingly, the present invention provides improved apparatus and methods for generating design netlists which meet timing and performance specifications of a circuit design. Preferably, power usage is kept low, and the design, placement and routing procedures of chip design flow are kept as independent as possible. These improved apparatus and methods rely on a special cell library having the property of constant replacement delays, as defined below, and preferably a relatively large, but still discrete choices of drive strengths. A set of logic cells is called a xe2x80x9cfamilyxe2x80x9d if each member implements the same logic function, but the members of a particular family may have different electrical properties, like signal delays depending on the drive and load specifications. In one embodiment of a constant replacement delay cell library, members of each logic family share the same logic function, and the same set of replacement delays, which may be different for different signal paths. Each member of the family is designed to drive a specific load from a sorted set of uniformly increasing loads. Replacement delays of each member under this specific load is the same when the member cell is driven by a typical driver. A cell family thus has a load range, a set of fixed replacement delays for the particular load for each member, and a common set of timing constraints, such as setup and hold times, corresponding to the worst cases among the family members. In one example, the number of cells in each family could be 15-20, in contrast to current constant output rise/fall time libraries which are about 4-5. Each member has a xe2x80x9cload slackxe2x80x9d, which is the difference of the maximum load range of its family and its xe2x80x9cnatural loadxe2x80x9d, for which it is has been designed and optimized.
In one embodiment, a design netlist is provided. For example, the design netlist was generated by using a high level description language to describe the design circuit""s behavior and the high level description functions were then grouped and mapped to standard cells to form the design netlist. The standard cell library could be a subset of the special library. Each cell in the netlist could be replaced with the appropriate member of the logic family depending on the load it needs to drive. The loads at this point include only the self loading of the cells themselves due to their input or gate capacitances. At the end of this process, where timing requirements and wire loads are ignored, every cell is matched to its capacitive loading. If the cells are power optimized, the circuit is the most power efficient implementation. Each cell is driving its xe2x80x9cnatural loadxe2x80x9d and has the maximum xe2x80x9cload slack.xe2x80x9d One objective in this embodiment is to find an implementation which will maximize the xe2x80x9cload slackxe2x80x9d for every cell output. If some cells are overloaded, with little xe2x80x9cload slack,xe2x80x9d their load can be split either among more instances of themselves or multiple instances of buffers. In this implementation, one goal is to maximize the load slack for each cell. This procedure is referred to herein as the xe2x80x9cdriver replacement.xe2x80x9d
Timing optimization can be performed on the resulting netlist after xe2x80x9cdriver replacement.xe2x80x9d Unlike traditional approaches to timing optimization where the goal is to find a cell for a particular expected load, one goal of timing optimization in this embodiment is to find an implementation which allows for maximum xe2x80x9cload slack.xe2x80x9d Wire load constraints are then determined for each of the outputs of the logic cells within the design netlist. Placement and routing is then performed based on the predetermined wire load constraints to generate the design layout. A second timing optimization is then performed based on the actual wire loads resulting from the routing procedure. This procedure involves replacing a particular cell with another cell from the same family to drive the total load of gate capacitances and interconnect wires, and matches the replacement delay of the original cell with its own replacement delay for the new total load.
Since the replacement cell has the same replacement delay as the replaced cell, it is likely that the timing requirements produced by the first timing optimization are still being met by the resulting netlist after the second timing optimization. However, since timing may be affected by the interconnect effects, variations in the input slopes and variations in the output drive of the cells, which are all second order effects, the timing is preferably re-verified. If timing is still not met, the design layout may have to be adjusted by further cell replacements.
In another embodiment, a method of generating an integrated circuit (IC) layout design is disclosed. An initial layout netlist having a plurality of original cells is provided. A first original cell within the initial layout netlist is replaced with a first replacement cell having a different drive than the first original cell""s drive but a same replacement delay as the first original cell when the first original cell is not optimal. The first replacement delay of a particular cell is the particular cell""s total delay contribution to a particular delay path that includes the particular cell.
In another embodiment, the invention pertains to a computer readable medium containing program instructions for generating an integrated circuit (IC) layout design. The computer readable medium includes computer readable code for providing an initial layout netlist having a plurality of original cells and for replacing a first original cell within the initial layout netlist with a first replacement cell having a different drive than the first original cell""s drive but a same replacement delay as the first original cell when the first original cell is not optimal. The first replacement delay of a particular cell is the particular cell""s total delay contribution to a particular delay path that includes the particular cell. The computer readable medium further includes a computer readable medium for storing the computer readable codes.
In yet another embodiment, a computer system operable to generate an integrated circuit (IC) layout design is disclosed. The computer system includes one or more processors and one or more memory. At least one of the processors and memory are adapted to provide an initial layout netlist having a plurality of original cells and replace a first original cell within the initial layout netlist with a first replacement cell having a different drive than the first original cell""s drive but a same replacement delay as the first original cell when the first original cell is not optimal. The first replacement delay of a particular cell is the particular cell""s total delay contribution to a particular delay path that includes the particular cell.
In yet another aspect, the invention pertains to a method for performing timing optimization. A design netlist having a plurality of cells and associated timing requirements is provided. Timing optimization is performed on the design netlist by ignoring wire load estimates for each cell within the design netlist and maximizing load slack associated with each cell.
In another embodiment, the invention pertains to a method of optimizing a design circuit having a plurality of cells. The design circuit is associated with timing requirements. A constant replacement delay cell library having a plurality of cell families that each include a plurality of cells having different load capabilities and a same replacement delay is provided. A replacement delay of a particular cell within the constant replacement delay cell library is the particular cell""s total delay contribution to a particular delay path that includes the particular cell. The design circuit is optimized to meet the timing requirements of such design circuit by replacing one or more cells of the design circuit with cells from the constant replacement delay cell library. The design circuit is optimized for power consumption by replacing one or more cells of the design circuit with cells from the constant replacement delay cell library.
In yet another aspect, the invention pertains to a method for generating a logic family of constant replacement cells. A maximum load, a minimum load, an incremental load, and a single replacement delay are selected for the family. The single replacement delay for the family may vary for different signal paths through the cell, but remains constant across the family. A plurality of standard library cells to be included within the family are generated. Each cell has the selected replacement delays, a same logic function and is capable of driving different loads. The loads associated with the family cells range from the selected minimum load through the selected maximum load in increments of the selected incremental load.
These and other features and advantages of the present invention will be presented in more detail in the following specification of the invention and the accompanying figures which illustrate by way of example the principles of the invention.