1. Field of the Invention
The present invention relates in general to tools for designing clock trees that distribute clock signals within integrated circuits (ICs) and in particular to a method for estimating path delays in a clock tree having an independently designed subtree.
2. Description of Related Art
A netlist describes an integrated circuit (IC) by listing instances of standard circuit components (“cells”) such as gates and transistors that are to be included in the IC, referencing the nets (signal paths) that convey signals between the cell instances, and indicating which cell instance terminals are to be connected to each net. An automated placement and routing (P&R) tool processes a netlist to produce a placement plan indicating where each cell instance is to be positioned within an IC die and a routing plan indicating how the nets interconnecting the cell instance terminals are to be routed through the various metal layers of the die.
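The netlist structure described above can be pictured as a small data model. The following is a minimal sketch only; the class names, field names, and the instance and cell names used with it (`Netlist`, `CellInstance`, `Net`, "U1", "NAND2", and so on) are hypothetical illustrations and do not correspond to any particular netlist format or P&R tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CellInstance:
    name: str       # instance name (hypothetical, e.g. "U1")
    cell_type: str  # library cell it instantiates (hypothetical, e.g. "NAND2")


@dataclass
class Net:
    name: str
    # (instance name, terminal name) pairs that this net connects
    terminals: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class Netlist:
    instances: Dict[str, CellInstance] = field(default_factory=dict)
    nets: Dict[str, Net] = field(default_factory=dict)

    def add_instance(self, name: str, cell_type: str) -> None:
        self.instances[name] = CellInstance(name, cell_type)

    def connect(self, net_name: str, instance_name: str, terminal: str) -> None:
        # A net may only reference cell instances that the netlist lists.
        if instance_name not in self.instances:
            raise KeyError(f"unknown cell instance: {instance_name}")
        net = self.nets.setdefault(net_name, Net(net_name))
        net.terminals.append((instance_name, terminal))

    def nets_of(self, instance_name: str) -> List[str]:
        # Names of all nets attached to any terminal of the given instance.
        return [
            net.name
            for net in self.nets.values()
            if any(inst == instance_name for inst, _ in net.terminals)
        ]
```

A P&R tool would consume this kind of information (which instances exist and which terminals share a net) when producing its placement and routing plans.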
Since the time a P&R tool needs to generate an acceptable placement and routing plan increases rapidly with the number of cell instances to be placed, one way to reduce processing time is to reduce the number of cells that must be placed, and one way to do that is to incorporate large “macro-cells” into the IC design. A macro-cell describes the layout of a relatively large block of IC logic formed by many smaller cell instances. For example, a design for an IC including an embedded random access memory (RAM) usually employs an instance of a macro-cell to implement the RAM. Since the layout for cells forming a macro-cell is predetermined, when a P&R tool lays out an IC incorporating an instance of a macro-cell, it need only incorporate the predetermined macro-cell layout into an area of the die reserved for the macro-cell and then route nets between the terminals of the macro-cell and other cells of the IC. It need not determine how to place and route the individual cells forming the macro-cell.
FIG. 1 illustrates an example of an IC layout 10 incorporating an instance of a macro-cell 11 and a set of cell instances 12 forming portions of the IC not implemented by the macro-cell. Although for simplicity, FIG. 1 shows the IC as having only one macro-cell instance and a relatively small number of cell instances 12, a typical IC design may employ more than one macro-cell and a much larger number of cell instances.
While the use of larger macro-cells to implement large blocks of logic in an IC can reduce the time needed to generate a layout, complications arise when a macro-cell implements logic that must be synchronized to logic implemented by cells outside the macro-cell. Various blocks of logic in a synchronous logic circuit transmit logic signals to one another via clocked devices (“sinks”) such as registers, latches and flip-flops so that the signals each logic block transmits and receives change state only in response to edges of the clock signals that clock the sinks. This ensures that state changes in the signals various logic blocks use to communicate with each other occur at predictable times so that the logic operations of those logic blocks are synchronized.
The sinks are clocked by edges of the clock signals, and to ensure that all signals passing through sinks change state at substantially the same time, it is necessary to ensure that clock signal edges arrive at the sinks with a timing variation (skew) that is within some small, predetermined limit. An external clock signal generator typically supplies a clock signal as input to a terminal of the IC that is connected to a root of a “clock tree”, a branching network for distributing the clock signal from its root to all sinks within the IC that are clocked by edges of that clock signal. FIG. 2 depicts a simple clock tree 14 for delivering a clock signal arriving at a node A to several sinks 16. Clock tree 14 includes a set of buffers and/or inverters 18 for providing the power needed to fan out the clock signal at the clock tree's branch nodes. Additional buffers 20 are inserted into various branches of clock tree 14 to adjust the path delays so that clock signal edges arrive at all sinks 16 at the same time. Although for simplicity clock tree 14 is depicted as having only two branching levels and supplying a clock signal to only nine sinks 16, clock trees frequently have many more branching levels and can supply clock signals to thousands of sinks.
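The notion of skew used above can be expressed as the spread between the earliest and latest clock edge arrival times at the sinks. The sketch below is illustrative only; the function names, the sink labels, and the nanosecond values are assumptions and are not taken from the figures.

```python
def clock_skew(arrival_times):
    """Return the spread between the latest and earliest clock edge arrivals.

    arrival_times maps a sink label to the time at which a clock edge
    reaches that sink (any consistent time unit).
    """
    if not arrival_times:
        raise ValueError("at least one sink arrival time is required")
    times = list(arrival_times.values())
    return max(times) - min(times)


def within_skew_limit(arrival_times, limit):
    """True when the skew does not exceed the predetermined limit."""
    return clock_skew(arrival_times) <= limit


# Hypothetical arrival times, in nanoseconds, for three sinks.
arrivals_ns = {"sink_a": 1.20, "sink_b": 1.25, "sink_c": 1.18}
skew_ns = clock_skew(arrivals_ns)  # spread between sink_b and sink_c
```

A tree is considered balanced in this sense when `within_skew_limit` holds for the limit chosen by the designer.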
While a typical netlist initially lists the cell instances forming the logic of an IC, it does not list instances of buffer and inverter cells forming a clock tree because the clock tree can be designed (synthesized) only after a P&R tool has generated an IC layout indicating positions within the die of all of the sinks 16 that are to receive the clock signal. At that point the P&R tool can employ a clock tree synthesis (CTS) tool to synthesize a separate clock tree for each of the IC's clock signals. A CTS tool typically tries to position the buffers 18 at a clock tree's branching nodes so that the clock signal travels approximately the same distance from the clock tree's root 19 to each sink 16, but that alone will usually not keep clock signal skew within acceptable limits. Thus a CTS tool will also insert one or more buffers or inverters 20 into various branches of the clock tree as necessary to balance the path delays between clock tree root 19 and sinks 16. The path delay through any branch of the clock tree is a function of the amount of time the clock signal needs to charge the inherent capacitance of the conductors forming the branch when changing state. A buffer or inverter 20 inserted into a clock tree branch can reduce the path delay through the branch by providing additional current for charging path capacitance more quickly. A CTS tool can finely adjust path delays by appropriately choosing the size, number and positions of buffers or inverters in each branch of the clock tree.
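One way to see why an inserted buffer can shorten the delay of a long branch is the textbook first-order Elmore delay of an RC ladder. This is a substituted illustration, not a method the present description attributes to any CTS tool. The unit resistance and capacitance values and the buffer parameters (`buf_r`, `buf_cin`, `buf_delay`) are hypothetical and unitless.

```python
def elmore_delay(segments):
    """First-order (Elmore) delay of an RC ladder.

    segments is a list of (resistance, capacitance) pairs ordered from the
    driver toward the load. Each resistance charges all capacitance at or
    downstream of its own segment.
    """
    delay = 0.0
    downstream_cap = sum(c for _, c in segments)
    for r, c in segments:
        delay += r * downstream_cap
        downstream_cap -= c
    return delay


def buffered_delay(segments, split, buf_r, buf_cin, buf_delay):
    """Delay of the same ladder with one buffer inserted after `split` segments.

    Simplified buffer model (an assumption): input capacitance buf_cin loads
    the first half, output resistance buf_r drives the second half, and
    buf_delay is a fixed intrinsic delay.
    """
    if not 0 < split < len(segments):
        raise ValueError("split must fall strictly inside the ladder")
    first = list(segments[:split])
    r_last, c_last = first[-1]
    first[-1] = (r_last, c_last + buf_cin)
    second = [(buf_r, 0.0)] + list(segments[split:])
    return elmore_delay(first) + buf_delay + elmore_delay(second)


# Hypothetical branch: four identical wire segments with unit R and C.
wire = [(1.0, 1.0)] * 4
unbuffered = elmore_delay(wire)
buffered = buffered_delay(wire, 2, buf_r=0.5, buf_cin=0.2, buf_delay=0.5)
```

With these assumed values the buffered branch is faster than the unbuffered one, because each half of the wire charges only its own capacitance. Changing the buffer size (here, `buf_r` and `buf_cin`) or its position (`split`) changes the result, which is the kind of adjustment the paragraph above describes.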
A macro-cell implementing synchronous logic must include an internal clock tree for delivering a clock signal to its sinks. Since the internal layout of the macro-cell is fixed, it is not necessary for a clock tree synthesis tool to generate a clock tree for the macro-cell. However, when sinks both inside and outside of the macro-cell are to be clocked by the same clock signal, the CTS tool that synthesizes a clock tree for the portion of the IC external to the macro-cell must link the root of the macro-cell's internal clock tree to the synthesized clock tree so that sinks both inside and outside the macro-cell receive that clock signal. The clock tree within the macro-cell therefore becomes a “subtree” of a larger IC clock tree, and it is necessary for the CTS tool to take path delays through the subtree into account when balancing the larger clock tree.
For example FIG. 3 depicts a clock tree 22 for an IC in which a macro-cell instance provides its own internal subtree 24. Since the layout of subtree 24 within the macro-cell instance is fixed, the CTS tool must design the remaining portions of clock tree 22 to account for the clock signal path delays through subtree 24 to ensure that sinks inside and outside the macro-cell receive clock signal edges concurrently.
To determine how to place and size buffers and/or inverters 25 so as to properly balance clock tree 22, the CTS tool must be able to estimate path delays through all branches of clock tree 22 outside the macro-cell. These path delays depend on impedances of the conductors forming each branch of the clock tree and on impedances and switching speeds of buffers or inverters 25. A resistance/capacitance (RC) extraction tool can analyze a clock tree layout to determine path impedances. A CTS tool uses that information, together with information it obtains from a cell library regarding the impedances and switching speeds of the buffers and inverters, to estimate the clock signal rising and falling edge path delays from the root 23 of the clock tree to each node of the clock tree outside of subtree 24. The CTS tool must also be able to estimate the path delays between the root 26 of subtree 24 and the sinks within the macro-cell, but since the layout of subtree 24 is fixed, the CTS tool really need only be able to determine the maximum and minimum possible rising and falling edge delays between subtree root 26 and the sinks within the macro-cell served by that subtree. The CTS tool could estimate path delays within subtree 24 in the same way it estimates path delays outside the subtree, based on path and buffer impedances and switching delays. But since the CTS tool cannot alter subtree 24, it is not necessary for the CTS tool to know the path delay through each branch of subtree 24. To balance clock tree 22, it is only necessary for the CTS tool to know the maximum and minimum rising and falling edge delays between subtree root 26 and any sink served by the subtree.
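The reason the maximum and minimum subtree delays suffice can be shown with a short calculation: every sink inside the macro-cell receives its edge somewhere between the root arrival time plus the minimum delay and the root arrival time plus the maximum delay. The sketch below handles a single edge type (rising or falling); the function name, parameter names, and numeric values are hypothetical.

```python
def overall_skew(external_arrivals, subtree_root_arrival, sub_min, sub_max):
    """Worst-case skew of a clock tree that contains one fixed subtree.

    external_arrivals: edge arrival times at sinks outside the macro-cell.
    subtree_root_arrival: edge arrival time at the subtree root.
    sub_min, sub_max: minimum and maximum delay from the subtree root to
    any sink served by the subtree.
    """
    if sub_min > sub_max:
        raise ValueError("sub_min must not exceed sub_max")
    earliest_inside = subtree_root_arrival + sub_min
    latest_inside = subtree_root_arrival + sub_max
    earliest = min(list(external_arrivals) + [earliest_inside])
    latest = max(list(external_arrivals) + [latest_inside])
    return latest - earliest


# Hypothetical values in nanoseconds.
skew_ns = overall_skew(
    external_arrivals=[2.0, 2.1],
    subtree_root_arrival=1.0,
    sub_min=0.9,
    sub_max=1.2,
)
```

The per-branch delays inside the subtree never appear in this calculation; only their extremes do. The same calculation would be repeated for the other edge type.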
Thus a macro-cell designer may provide an IC designer not only with the macro-cell design, but also with a model of each clock tree within the macro-cell indicating the maximum and minimum rising and falling clock signal edge delays through the subtree. When the IC designer thereafter incorporates an instance of the macro-cell into an IC layout, a CTS tool synthesizing a clock tree for the entire IC need only consult the model for the macro-cell's subtree to obtain the information it needs regarding maximum and minimum clock signal path delays through the subtree when determining how to balance the clock tree.
As illustrated in FIG. 4, a clock signal edge requires some amount of time (INFT) to fall from high to low, and some amount of time (INRT) to rise from low to high. Macro-cell designers know that path delays through a macro-cell's clock tree are not fixed but depend to some extent on the transition times of the clock signal's rising and falling edges as they arrive at the subtree's root. The rising and falling edge transition times INRT and INFT for a clock signal at any node of clock tree 22 are in turn functions of the impedances and switching characteristics of the conductors and buffers that deliver the clock signal to that node. When the maximum and minimum rising and falling edge delays through subtree 24 as indicated by the subtree model are estimated based on particular values of INRT and INFT, then when clock tree 22 delivers a clock signal to subtree root 26 that happens to exhibit other INRT and INFT values, the subtree model will provide the CTS tool with an inaccurate estimate of maximum and minimum rising and falling delays through subtree 24.
Referring to FIG. 5, to address this problem, a prior art macro model generator 28 processes the subtree layout to determine the maximum and minimum rising edge delays (MMAXRD and MMINRD) and the maximum and minimum falling edge delays (MMAXFD and MMINFD) through the subtree for each of several different combinations of INRT and INFT. The results are then incorporated into a macro model 30, suitably a simple lookup table for reading out values of MMAXRD, MMINRD, MMAXFD, and MMINFD as functions of INRT and INFT. When a CTS tool thereafter estimates rising and falling edge path delays INRD and INFD to each node of clock tree 22 of FIG. 3, based on characteristics of the clock signal path leading to that node, it also estimates the values of INFT and INRT at each node of the clock tree. Then when the CTS tool needs to know the maximum and minimum rising and falling edge path delays from subtree root 26 to any sink served by subtree 24, it can use the computed values of INFT and INRT at the subtree root as inputs to the macro model 30 for subtree 24. The macro model then returns values of MMAXRD, MMINRD, MMAXFD, and MMINFD that are appropriate for the INFT and INRT values for the clock signal at the root of subtree 24.
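A lookup-table macro model of this kind might be sketched as follows. The description above states only that the model is a lookup table indexed by INRT and INFT; the bilinear interpolation between table entries, the clamping at the table edges, the class name `MacroModel`, and all numeric values used with it are assumptions made for illustration.

```python
from bisect import bisect_right

KEYS = ("MMAXRD", "MMINRD", "MMAXFD", "MMINFD")


def _bracket(axis, x):
    """Clamp x to the axis range; return the lower grid index and the
    fractional position of x between that grid point and the next."""
    x = min(max(x, axis[0]), axis[-1])
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    frac = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, frac


class MacroModel:
    """Lookup table of subtree delay bounds versus INRT and INFT.

    table[i][j] is a dict holding the four delay values characterized at
    inrt_axis[i] and inft_axis[j]. Each axis must be strictly increasing
    and contain at least two points.
    """

    def __init__(self, inrt_axis, inft_axis, table):
        if len(inrt_axis) < 2 or len(inft_axis) < 2:
            raise ValueError("each axis needs at least two points")
        self.inrt_axis = list(inrt_axis)
        self.inft_axis = list(inft_axis)
        self.table = table

    def lookup(self, inrt, inft):
        """Bilinearly interpolate the four delay bounds (an assumption;
        a real model could instead return the nearest table entry)."""
        i, u = _bracket(self.inrt_axis, inrt)
        j, v = _bracket(self.inft_axis, inft)
        t = self.table
        result = {}
        for key in KEYS:
            result[key] = (
                (1 - u) * (1 - v) * t[i][j][key]
                + u * (1 - v) * t[i + 1][j][key]
                + (1 - u) * v * t[i][j + 1][key]
                + u * v * t[i + 1][j + 1][key]
            )
        return result
```

In use, a CTS tool would pass the INRT and INFT values it has estimated at the subtree root to `lookup` and read back the four delay bounds for that operating point.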
This type of prior art macro model has been useful, but with increasing IC size and clock signal frequency, large discrepancies have begun to arise between the values of MMAXRD, MMINRD, MMAXFD, and MMINFD for a subtree that a macro model predicts for given values of INFT and INRT and the actual values of MMAXRD, MMINRD, MMAXFD, and MMINFD the subtree exhibits when placed in an IC. The errors arise because the rising and falling edge transition times INFT and INRT for the clock signal at the root of the subtree are not the only aspects of the clock signal that can substantially affect path delays through the subtree.