This invention relates generally to the field of design automation of integrated circuits, and more specifically to the field of performance optimization of digital integrated circuits.
In the field of integrated circuit design, computers are used to automate the design process. Current integrated circuit designs have become so complex that the design process cannot be completed without the aid of computers executing design automation software. Typically, during most of the design process, the integrated circuit design exists only in the form of electronic data, stored in the memory of a computer or some other storage medium.
Integrated circuit design includes several steps. The designer creates the design by specifying the function of the design, typically by composing existing or new electronic components (or cells) having various functions. While some components (or cells) are custom designed specifically for a particular chip, most components are standard and are designed in advance and kept in one or more libraries. The designer creates the desired function by interconnecting, with nets, selected cells. Logic synthesis software aids the designer by performing some of the laborious and repetitive tasks of selection, interconnection and optimization of selected cells from the cell library. The resulting design is represented as a net list that defines the collection of cells and interconnections between the cells.
In a subsequent part of the design process, called the physical design, a set of photographic masks is created for use in manufacturing the chip. To this end, the cells are placed on the chip area and the interconnections between the cells are routed. Physical design automation software automatically places the cells and routes the connections. Larger cells (or blocks) are usually placed at the periphery of the chip. Most of the cells are small standard cells which have a rectangular shape and a uniform height. These standard cells are typically placed in rows of the same uniform height.
Placer software is used to physically place the cells in the chip layout. For the interconnections between the cells, a number of metal layers, usually between 2 and 7, are available. Routing software is used to construct the interconnection with a variety of rectilinear metal shapes. The physical design ends with the generation of photographic masks describing the layers of the integrated circuit design.
One measure of the performance of a chip is the time required to propagate signals from register to register. Clock signals control the storing of data into these registers. The number of levels of cells that the signals propagate through, together with the delay of each of these cells and of their interconnections, determines the speed of propagation of signals. The number of levels of cells can be reduced during logic synthesis.
Capacitive, resistive and inductive effects contribute to the delay of the interconnections. Because capacitance and resistance increase with the length of an interconnection, the placement of the cells influences the performance of the chip. The capacitance that a cell needs to drive is the sum of the capacitance of the net and the capacitance of the inputs of the other cells connected to that net. If a cell drives a larger capacitance, the delay increases. A cell built with larger transistors can drive this larger capacitance with the same delay. However, such larger transistors cause the input capacitance of the cell to be larger, thereby slowing down the previous stage in the net list.
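The load relationship described above can be illustrated with a minimal sketch. The load computation follows the text directly; the delay formula is an assumption introduced only for illustration (a simple linear model in which delay is an intrinsic term plus drive resistance times load), and the function names and numeric values are hypothetical.

```python
from typing import Sequence


def load_capacitance(net_cap: float, sink_input_caps: Sequence[float]) -> float:
    """Capacitance a cell must drive: the net plus every connected input pin (fF)."""
    return net_cap + sum(sink_input_caps)


def stage_delay(intrinsic: float, drive_resistance: float, load_cap: float) -> float:
    """Assumed linear delay model (not from the text): ps = ps + kOhm * fF."""
    return intrinsic + drive_resistance * load_cap


# Hypothetical numbers: a 20 fF net fanning out to three input pins.
load = load_capacitance(20.0, [5.0, 5.0, 10.0])

# Halving the drive resistance models a cell with larger transistors:
# the same load is driven faster, at the cost of a larger input capacitance
# that the previous stage must drive.
delay_small = stage_delay(30.0, 2.0, load)
delay_large = stage_delay(30.0, 1.0, load)
print(load, delay_small, delay_large)
```

Under this assumed model, the larger cell is faster for a fixed load, which is consistent with the trade-off described in the paragraph above.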
A variety of techniques are used for performance optimization in the design of digital integrated circuits. Because the length of the nets is important for the performance, there are placement methods that attempt to optimize the performance, generally referred to as timing driven placement schemes. One such scheme is disclosed in U.S. Pat. No. 5,218,551, entitled "Timing Driven Placement," issued Jun. 8, 1993, in which an attempt is made to place the cells on the chip in such a manner that the nets that limit the performance of the chip to the greatest extent remain as short as possible.
Other methods for optimizing the performance of a chip design are based on the sizing of the transistors. One such method is disclosed in U.S. Pat. No. 5,880,967, entitled "Minimization of Circuit Delay and Power Through Transistor Sizing," issued Mar. 9, 1999. Such methods assume that each transistor can be given an accurate individual size. These methods generally use numerical, continuous, multiple-variable optimization algorithms. While this can lead to very accurate optimization, the wide variety of possible transistor sizes requires many custom cell designs, which can make the design very costly. Most libraries contain only a few versions (typically 3 or 4) of the same cell, each with different transistor sizes, potentially in parallel within an individual cell. Hence sizing algorithms have been developed that select the best drive strength from the limited available standard drive strengths. An example of one such sizing algorithm is described in U.S. Pat. No. 5,633,805, entitled "Logic Synthesis Having Two-Dimensional Sizing Progression for Selecting Gates from Cell Libraries," issued May 27, 1997.
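Selecting a drive strength from a small discrete set can be sketched as follows. This is not the algorithm of any of the cited patents; it is a minimal illustration that assumes a linear delay model and a hypothetical library of three versions of one cell. The class name, field names, cell names and numeric values are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CellVariant:
    name: str                # hypothetical library cell name
    drive_resistance: float  # kOhm; smaller means stronger drive
    input_cap: float         # fF; load presented to the previous stage
    intrinsic_delay: float   # ps


def pick_variant(variants: Sequence[CellVariant],
                 load_cap: float,
                 max_delay: float) -> Optional[CellVariant]:
    """Return the variant meeting max_delay with the smallest input capacitance.

    Delay uses an assumed linear model: intrinsic + resistance * load.
    Returns None when no available drive strength is sufficient.
    """
    feasible = [v for v in variants
                if v.intrinsic_delay + v.drive_resistance * load_cap <= max_delay]
    return min(feasible, key=lambda v: v.input_cap, default=None)


# Hypothetical library with three drive strengths of the same cell.
LIBRARY = [
    CellVariant("NAND2_X1", 4.0, 2.0, 30.0),
    CellVariant("NAND2_X2", 2.0, 4.0, 30.0),
    CellVariant("NAND2_X4", 1.0, 8.0, 30.0),
]

chosen = pick_variant(LIBRARY, load_cap=40.0, max_delay=120.0)
too_heavy = pick_variant(LIBRARY, load_cap=40.0, max_delay=50.0)
print(chosen, too_heavy)
```

The second call returns `None`: with only a few discrete sizes, some loads cannot be driven within the target delay by any single library cell.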
Other methods for performance optimization are based on the insertion of amplifying standard cells, or buffers. These buffers can drive a large load while presenting a small load at their input. Buffers do not affect the logic function. An example of a buffering algorithm is described in U.S. Pat. No. 5,799,170, entitled "Simplified Buffer Manipulation Using Standard Repowering Function," issued Aug. 29, 1998. Buffering is area efficient in the sense that little cell area is required to drive a large load. Compared to transistor sizing, though, an extra stage of logic is added, and sizing cannot entirely eliminate the intrinsic delay of this stage.
Another set of methods for performance optimization in digital IC design is based on duplicating standard cells with multiple fanouts. The capacitance of all the pins of the output net is then distributed among the original cell and the duplicate cells, each of the cells driving a disjoint subset of the fanouts. This technique is known as "cloning." U.S. Pat. No. 5,396,435, entitled "Automated Circuit Design System and Method for Reducing Critical Path Delay Times," issued Mar. 7, 1999, describes, among other things, one such cloning method.
However, cloning has several limitations: it can be difficult to find an acceptable balance when partitioning fanouts; cloning cannot be used to drive single fanouts with large capacitances (such as primary outputs); cloning cannot be used to drive large capacitances due to long wires; cloning cannot be used for standard cells with bidirectional pins; and cloning cannot easily be used for multi-source nets.
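The fanout partitioning step of cloning, and the single-fanout limitation noted above, can be illustrated with a short sketch. The greedy heuristic used here (assign the heaviest remaining pin to the lightest group) is an assumption chosen for brevity and is not taken from the cited patent; the function name and pin data are hypothetical.

```python
from typing import Dict, List


def partition_fanouts(pin_caps: Dict[str, float], copies: int) -> List[List[str]]:
    """Split fanout pins into `copies` disjoint groups of roughly equal capacitance.

    Greedy heuristic (an assumption, not the cited method): visit pins from
    largest to smallest capacitance and give each to the lightest group so far.
    """
    groups: List[List[str]] = [[] for _ in range(copies)]
    totals = [0.0] * copies
    for pin, cap in sorted(pin_caps.items(), key=lambda kv: kv[1], reverse=True):
        lightest = totals.index(min(totals))
        groups[lightest].append(pin)
        totals[lightest] += cap
    return groups


# Four fanout pins (capacitance in fF) shared between the original and one clone.
balanced = partition_fanouts({"a": 8.0, "b": 5.0, "c": 4.0, "d": 3.0}, copies=2)

# A single heavy fanout, such as a primary output, cannot be split:
# one copy still drives the whole load and the other drives nothing.
single = partition_fanouts({"po": 50.0}, copies=2)
print(balanced, single)
```

In the second case the clone is useless, which is the situation the single-fanout limitation describes.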
Therefore, what is needed is a scheme for automatically generating circuit designs capable of driving large loads in such a manner as to avoid one or more of the problems or limitations in the art described above.
Methods and apparatuses for automated design of parallel drive standard cells are disclosed. A standard cell having insufficient drive strength to drive a load is identified. The standard cell is duplicated one or more times to provide a set of multiple standard cells having sufficient combined drive strength to drive the load. The set of standard cells is coupled in parallel to drive the load. In one embodiment, the set of standard cells is aligned vertically or horizontally on the integrated circuit to provide a stacked set of standard cells.
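The duplication step summarized above can be sketched under a simplifying assumption that is not stated in the text: each copy can drive a fixed maximum load, and the drive strengths of parallel copies add linearly. The function names, the instance naming scheme and the numbers are hypothetical.

```python
import math
from typing import List


def parallel_copies_needed(load_cap: float, max_load_per_cell: float) -> int:
    """Number of identical cells to couple in parallel to drive load_cap.

    Assumes (for illustration only) that combined drive strength scales
    linearly with the number of copies.
    """
    if max_load_per_cell <= 0:
        raise ValueError("max_load_per_cell must be positive")
    return max(1, math.ceil(load_cap / max_load_per_cell))


def build_parallel_set(cell_name: str, load_cap: float,
                       max_load_per_cell: float) -> List[str]:
    """Name the instances of the parallel set; all share the same inputs and output net."""
    count = parallel_copies_needed(load_cap, max_load_per_cell)
    return [f"{cell_name}_p{i}" for i in range(count)]


# A 100 fF load with a cell rated for 30 fF: one cell is insufficient.
instances = build_parallel_set("NAND2_X4", load_cap=100.0, max_load_per_cell=30.0)
print(instances)
```

Unlike cloning, every copy here drives the same net, so the approach does not depend on how many fanouts the net has. The sketch ignores the added input capacitance that the parallel copies present to the previous stage.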