The present invention relates generally to design tools for integrated circuits. More specifically, but without limitation thereto, the present invention relates to a method of distributing a clock signal among clock buffers in a balanced clock tree that minimizes clock skew for an integrated circuit design.
Integrated circuits typically include blocks or multiple circuit elements such as flip-flops. The circuit elements are generally synchronized by a common clock signal from clock buffer cells. The clock buffer cells are typically arranged in a balanced clock tree. A balanced clock tree is constructed in a hierarchy of buffer levels, and each buffer level contains one or more partitions of clock buffers. The top buffer level is the clock driver level, which contains a driver (a high power buffer cell) driven by a system clock. The next buffer level is the clock repeater level, which contains clock repeaters (medium power buffer cells) driven by the clock drivers. The remaining lower buffer levels contain clock buffers (standard power buffer cells) down to buffer level L1, which contains the clocked circuit elements.
In previous approaches to balanced clock placement, the number of clock buffers driven by repeaters is minimized, while insertion delays of buffers at each lower buffer level "downstream" are ignored. The inability to estimate maximum and minimum delays accurately in buffer groups results in unbalanced partitioning with large clock skew and insertion delays. The unbalanced partitioning typically requires delay balancing by extra wire insertion, resulting in large errors in Elmore delay calculations relative to SPICE delay calculations.
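For reference, the Elmore delay mentioned above is a first-order estimate of delay through an RC network. The text does not give the formula; the sketch below uses the standard Elmore expression for a simple unbranched RC chain, where each series resistance is multiplied by all capacitance downstream of it. The function name, the lumped-segment wire model, and the units are assumptions made for illustration only.

```python
def elmore_chain_delay(resistances, capacitances):
    """Elmore delay to the far end of an unbranched RC chain.

    resistances[i] is the series resistance of segment i and
    capacitances[i] is the capacitance at the node after segment i.
    Assumption: a lumped RC chain with no branches; real clock nets
    are trees and need a per-branch version of this sum.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    downstream_c = sum(capacitances)
    delay = 0
    for r, c in zip(resistances, capacitances):
        # Segment i charges every capacitance at or beyond node i.
        delay += r * downstream_c
        downstream_c -= c
    return delay


# Two segments: R = [1, 2], C = [3, 4] gives 1*(3+4) + 2*4 = 15
print(elmore_chain_delay([1, 2], [3, 4]))
```

Appending a segment raises the resistance term for every capacitance beyond it, so wire inserted for delay balancing changes the estimate for the whole net rather than adding a fixed increment.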
A circuit may be partitioned in a single iteration, called one-pass partitioning, or the circuit may be partitioned by an algorithm that examines all cells in several iterations. A partition of a circuit into two parts is called two-way cutting. Two-way cutting may be repeated to further partition a circuit so that each partition contains a set of cells or buffers having a minimum skew. One-pass partitioning based on two-way cutting does not generally produce good solutions to balanced clock placement in production designs.
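Repeated two-way cutting as described above can be sketched as a recursive bisection. This is a minimal illustration of the generic prior approach, not the partitioning method of the invention. The cut rule (split at the median along the wider coordinate axis) and the stopping rule (a hypothetical `max_group_size`) are assumptions chosen only to make the example concrete.

```python
def two_way_cut(cells):
    """Split a list of (x, y) cells into two halves along the wider axis."""
    xs = [c[0] for c in cells]
    ys = [c[1] for c in cells]
    # Assumed cut rule: cut across the axis with the larger spread.
    axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1
    ordered = sorted(cells, key=lambda c: c[axis])
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]


def partition(cells, max_group_size):
    """Apply two-way cutting repeatedly until every group is small enough."""
    if max_group_size < 1:
        raise ValueError("max_group_size must be at least 1")
    if len(cells) <= max_group_size:
        return [list(cells)]
    left, right = two_way_cut(cells)
    return partition(left, max_group_size) + partition(right, max_group_size)


cells = [(0, 0), (1, 0), (10, 0), (11, 0), (0, 5)]
for group in partition(cells, 2):
    print(group)
```

Each cut here is made once and never revisited, which is the one-pass behavior the text criticizes: nothing in the recursion evaluates the skew of the resulting groups.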
Further, heuristic objective functions used to place clock buffers in groups of circuit elements result in a large clock skew. Heuristic objective functions are quality functions that describe an objective or goal indirectly. An example of a heuristic objective function used to place clock buffers in groups of circuit elements is the minimization of the distance between a buffer location and the center of mass of a group of cells driven by the buffer. The real objective of balanced clock buffer placement is the minimization of clock skew between the clock buffer and each cell in the group.
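The gap between the heuristic objective and the real objective can be seen in a small example. The sketch below places a buffer at the center of mass of its cells and compares the result against another candidate location. Manhattan distance stands in for wire delay here, which is a simplifying assumption made for illustration; the function names are hypothetical and do not come from the text.

```python
def centroid(cells):
    """Center of mass of a group of (x, y) cells with equal weights."""
    n = len(cells)
    return (sum(x for x, _ in cells) / n, sum(y for _, y in cells) / n)


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def skew_proxy(buffer_loc, cells):
    """Spread between the longest and shortest buffer-to-cell distance.

    Assumption: delay grows with Manhattan wire length, so this spread
    is a rough stand-in for clock skew inside the group.
    """
    distances = [manhattan(buffer_loc, c) for c in cells]
    return max(distances) - min(distances)


cells = [(0, 0), (1, 0), (10, 0)]
heuristic_loc = centroid(cells)   # pulled toward the two clustered cells
alternative_loc = (5.5, 0)        # roughly midway between cluster and outlier

print(skew_proxy(heuristic_loc, cells))
print(skew_proxy(alternative_loc, cells))
```

In this example the center of mass sits close to the two clustered cells and far from the third, so its distance spread is larger than that of the alternative location, even though it is the optimum of the heuristic objective.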
The present invention advantageously addresses the problems above as well as other problems by providing a method of clock buffer partitioning that minimizes clock skew for a balanced clock tree.
In one embodiment, the present invention may be characterized as a method of clock buffer partitioning to minimize clock skew in an integrated circuit design that includes the steps of receiving as input a description of a number of clock buffers for buffering a system clock to a plurality of clocked circuit elements; constructing a balanced clock tree from the description wherein the balanced clock tree includes a plurality of buffers in a hierarchy of buffer levels; partitioning each of the hierarchy of buffer levels into a plurality of buffer groups wherein clock skew in each of the plurality of buffer groups at each buffer level is substantially minimized; routing a clock input to a plurality of buffers within at least one of the plurality of buffer groups in at least one of the hierarchy of buffer levels to construct a zero clock skew among the plurality of buffers; calculating an estimated group insertion delay for the at least one of the plurality of buffer groups as a sum of an internal insertion delay and a downstream insertion delay of one of the plurality of clocked circuit elements; and generating as output the estimated group insertion delay.
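The delay calculation in the final steps above can be illustrated with a short sketch. It assumes that zero-skew routing inside a group gives every buffer in the group the same internal insertion delay, so the group estimate is that shared value plus the downstream insertion delay of one selected element. The text does not say which element is selected; taking the largest downstream delay by default is an assumption, as are the function name and the picosecond units.

```python
def estimated_group_insertion_delay(internal_ps, downstream_ps, pick=max):
    """Estimated group insertion delay = internal delay + downstream delay.

    internal_ps:   delay from the group's clock input to each buffer,
                   assumed identical for all buffers (zero-skew routing).
    downstream_ps: downstream insertion delays below each buffer.
    pick:          rule for choosing which downstream delay represents
                   the group (assumption: worst case by default).
    """
    if not downstream_ps:
        raise ValueError("group must contain at least one element")
    return internal_ps + pick(downstream_ps)


internal = 120                 # ps, hypothetical value
downstream = [300, 340, 310]   # ps, hypothetical values

worst = estimated_group_insertion_delay(internal, downstream)
best = estimated_group_insertion_delay(internal, downstream, pick=min)
print(worst, best, worst - best)
```

Evaluating the same group with both `max` and `min` gives maximum and minimum delay estimates, and their difference indicates how much skew the downstream levels contribute to the group.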
In another embodiment, the present invention may be characterized as a computer program product for clock buffer partitioning to minimize clock skew in an integrated circuit design that may be implemented by a computer to perform the following functions: receiving as input a description of a number of clock buffers for buffering a system clock to a plurality of clocked circuit elements; constructing a balanced clock tree from the description wherein the balanced clock tree includes a plurality of buffers in a hierarchy of buffer levels; partitioning each of the hierarchy of buffer levels into a plurality of buffer groups wherein clock skew in each of the plurality of buffer groups at each buffer level is substantially minimized; routing a clock input to a plurality of buffers within at least one of the plurality of buffer groups in at least one of the hierarchy of buffer levels to construct a zero clock skew among the plurality of buffers; calculating an estimated group insertion delay for the at least one of the plurality of buffer groups as a sum of an internal insertion delay and a downstream insertion delay of one of the plurality of clocked circuit elements; and generating as output the estimated group insertion delay.