1. Technical Field
The present invention relates in general to integrated circuit design methods and in particular to a design method for buffer insertion within integrated circuits. Still more particularly, the present invention relates to a method for optimizing buffer selection using downstream π-models.
2. Description of Related Art
Scaling process technology into the deep submicron regime has made interconnect performance more dominant than transistor and logic performance. With the continued scaling of process technology, resistance per unit length of the interconnect continues to increase, capacitance per unit length remains roughly constant, and transistor or logic delay continues to decrease. This trend has led to the increasing dominance of interconnect delay over logic delay. Process technology options, such as the use of copper wires, can provide only temporary relief. The trend of increasing interconnect dominance is expected to continue. Timing optimization techniques, such as wiresizing, buffer insertion, and sizing, have gained widespread acceptance in deep submicron design (see J. Cong, L. He, C.-K. Koh, and P. H. Madden, "Performance Optimization of VLSI Interconnect Layout", Integration: the VLSI Journal, 21, 1996, pp. 1-94). In particular, buffer insertion techniques can significantly reduce interconnect delay. To the first order, interconnect delay is proportional to the square of the length of the wire. Inserting buffers effectively divides the wire into smaller segments, which makes the interconnect delay almost linear in terms of its length, though buffer delays must now be considered. Buffers can also be used to fix slew, capacitance, and noise violations while reducing power. As a result, automated buffer insertion is becoming increasingly pervasive as the ratio of device to interconnect delay continues to decrease.
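As a rough illustration of the quadratic-versus-linear behavior described above, the following Python sketch compares the Elmore delay of an unbuffered uniform wire with that of the same wire cut into equal segments. The per-unit resistance and capacitance, the fixed per-buffer delay, and the function names are illustrative assumptions, not values from this disclosure; buffer input capacitance and output resistance are ignored for simplicity.

```python
def unbuffered_delay(length_um, r_per_um, c_per_um):
    """Elmore delay (seconds) of a uniform distributed RC wire: 0.5 * r * c * L^2."""
    return 0.5 * r_per_um * c_per_um * length_um ** 2


def buffered_delay(length_um, r_per_um, c_per_um, n_segments, buffer_delay_s):
    """Delay when the wire is cut into n equal segments with a buffer between each pair.

    Simplifying assumption: each buffer adds a fixed delay and fully isolates
    the segments from one another, so the segment delays simply add.
    """
    segment = length_um / n_segments
    wire = n_segments * unbuffered_delay(segment, r_per_um, c_per_um)
    return wire + (n_segments - 1) * buffer_delay_s


if __name__ == "__main__":
    R_PER_UM = 0.1       # ohm per micron (hypothetical)
    C_PER_UM = 0.2e-15   # farad per micron (hypothetical)
    BUF_DELAY = 50e-12   # seconds per buffer (hypothetical)
    for length in (2500, 5000, 10000):
        n = length // 2500  # one segment per 2500 microns
        print(length,
              unbuffered_delay(length, R_PER_UM, C_PER_UM),
              buffered_delay(length, R_PER_UM, C_PER_UM, n, BUF_DELAY))
```

Under these assumptions, doubling the length quadruples the unbuffered delay, whereas the buffered delay grows in proportion to the number of fixed-length segments plus the accumulated buffer delays.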
Buffer insertion has been an active area of study in recent years. Closed-form solutions have been proposed by Adler and Friedman, "Repeater Design to Reduce Delay and Power in Resistive Interconnect", IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Vol. CAS II-45, No. 5, pp. 607-616, May 1998; Alpert and Devgan, "Wire Segmenting For Improved Buffer Insertion", 34th IEEE/ACM Design Automation Conference, 1997, pp. 588-593; and Dhar and Franklin, "Optimum Buffer Circuits for Driving Long Uniform Lines", IEEE Journal of Solid-State Circuits, 26(1), 1991, pp. 32-40, all of which consider inserting buffers on a 2-pin net. Chu and Wong, "Closed Form Solution to Simultaneous Buffer Insertion/Sizing and Wire Sizing", International Symposium on Physical Design, 1997, pp. 192-197, proposed a closed-form solution to simultaneous wiresizing and buffer insertion. The works of Culetu et al., "A Practical Repeater Insertion Method in High Speed VLSI Circuits", 35th IEEE/ACM Design Automation Conference, 1998, pp. 392-395; Kannan et al., "A Methodology and Algorithms for Post-Placement Delay Optimization", 31st IEEE/ACM Design Automation Conference, 1994, pp. 327-332; and Lin and Marek-Sadowska, "A Fast and Efficient Algorithm for Determining Fanout Trees in Large Networks", Proc. of the European Conference on Design Automation, 1991, pp. 539-544, teach inserting buffers on a tree by iteratively finding the best location for a single buffer. Approaches which simultaneously construct a routing tree and insert buffers have been proposed by Kang et al., "Delay Bounded Buffered Tree Construction for Timing Driven Floorplanning", IEEE/ACM Intl. Conf. Computer-Aided Design, 1997, pp. 707-712; Lillis et al., "Simultaneous Routing and Buffer Insertion for High Performance Interconnect", Proc. 6th Great Lakes Symposium on Physical Design, 1996, pp. 7-12; and Okamoto and Cong, "Interconnect Layout Optimization by Simultaneous Steiner Tree Construction and Buffer Insertion", Fifth ACM/SIGDA Physical Design Workshop, 1996, pp. 1-6. Chu and Wong, "A New Approach to Simultaneous Buffer Insertion and Wire Sizing", IEEE/ACM International Conference on Computer-Aided Design, 1997, pp. 614-621, present an iterative optimization which simultaneously performs wiresizing and buffer insertion on a 2-pin net.
In 1990, Van Ginneken, "Buffer Placement in Distributed RC-tree Networks for Minimal Elmore Delay", Proc. International Symposium on Circuits and Systems, 1990, pp. 865-868, proposed a dynamic programming algorithm which finds the optimal solution using the Elmore wire delay model and a linear gate delay model. The algorithm only permits a single, non-inverting buffer type to be considered. Several extensions and variants of this fundamental approach have been proposed: Alpert and Devgan, "Wire Segmenting For Improved Buffer Insertion", 34th IEEE/ACM Design Automation Conference, 1997, pp. 588-593; Alpert, Devgan and Quay, "Buffer Insertion for Noise and Delay Optimization", 35th IEEE/ACM Design Automation Conference, 1998, pp. 362-367; Lillis, "Timing Optimization for Multi-Source Nets: Characterization and Optimal Repeater Insertion", 34th IEEE/ACM Design Automation Conference, 1997, pp. 214-219; Lillis et al., "Optimal Wire Sizing and Buffer Insertion for Low Power and a Generalized Delay Model", IEEE Journal of Solid-State Circuits, 31(3), 1996, pp. 437-447; Lillis et al., "Simultaneous Routing and Buffer Insertion for High Performance Interconnect", Proc. 6th Great Lakes Symposium on Physical Design, 1996, pp. 7-12; and Okamoto and Cong, "Interconnect Layout Optimization by Simultaneous Steiner Tree Construction and Buffer Insertion", Fifth ACM/SIGDA Physical Design Workshop, 1996, pp. 1-6. Lillis et al., "Optimal Wire Sizing and Buffer Insertion for Low Power and a Generalized Delay Model", IEEE Journal of Solid-State Circuits, 31(3), 1996, pp. 437-447, extended Van Ginneken's algorithm to simultaneously perform wiresizing and buffer insertion with a buffer library that contains both inverting and non-inverting buffers. In addition, Lillis et al. show, in "Optimal Wire Sizing and Buffer Insertion for Low Power and a Generalized Delay Model", how to control the total number of buffers inserted and how to integrate input slew into the gate delay function. Later, Lillis showed, in "Timing Optimization for Multi-Source Nets: Characterization and Optimal Repeater Insertion", how to modify Van Ginneken's algorithm to handle nets with multiple sources. Alpert and Devgan proposed, in "Wire Segmenting For Improved Buffer Insertion", a wire segmenting pre-processing algorithm to handle the one-buffer-per-wire limitation of Van Ginneken's algorithm, which results in a smooth trade-off between solution quality and run time. Alpert et al. showed, in "Buffer Insertion for Noise and Delay Optimization", how to modify the algorithm to avoid coupling noise while suffering only a slight delay penalty.
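The dynamic programming idea attributed to Van Ginneken above can be sketched in Python for the simplest case, a single source-to-sink path. This is a simplified illustration, not the published algorithm in full: it handles one path rather than a tree, uses the Elmore wire delay and a linear gate delay as described, and allows one non-inverting buffer type. The function names, parameter layout, and numeric values are hypothetical. Units are kΩ, fF, and ps (kΩ × fF = ps).

```python
def prune(cands):
    """Drop dominated candidates.

    A candidate (cap, req_time, buffers) is dominated when another candidate
    has no more capacitance and no smaller required arrival time.
    """
    kept = []
    best_q = float("-inf")
    for cand in sorted(cands, key=lambda k: (k[0], -k[1])):
        if cand[1] > best_q:
            kept.append(cand)
            best_q = cand[1]
    return kept


def van_ginneken_path(segments, sink_cap, sink_rat, buf, drv_res):
    """Bottom-up buffer insertion on one path.

    segments: list of (wire_res, wire_cap), ordered from sink to source.
    buf:      (output_res, input_cap, intrinsic_delay) of the single buffer type.
    A buffer may be placed at the upstream end of every segment except the
    last one, which connects directly to the driver.
    Returns (required arrival time at the driver input, buffer positions).
    """
    r_b, c_b, d_b = buf
    cands = [(sink_cap, sink_rat, ())]
    for i, (r_w, c_w) in enumerate(segments):
        # Move every candidate across the wire (Elmore delay, wire cap split in half).
        cands = [(c + c_w, q - r_w * (c_w / 2 + c), pos) for c, q, pos in cands]
        if i < len(segments) - 1:
            # All buffered candidates share the same input cap, so keep only the best.
            c, q, pos = max(cands, key=lambda k: k[1] - r_b * k[0])
            cands.append((c_b, q - d_b - r_b * c, pos + (i,)))
        cands = prune(cands)
    # Linear gate delay for the driver: drv_res * load.
    c, q, pos = max(cands, key=lambda k: k[1] - drv_res * k[0])
    return q - drv_res * c, pos


if __name__ == "__main__":
    wire = [(0.5, 200.0)] * 4            # four identical segments (hypothetical)
    buffer_type = (0.2, 10.0, 30.0)      # hypothetical buffer
    print(van_ginneken_path(wire, 20.0, 0.0, buffer_type, 0.2))
```

Keeping only non-dominated (capacitance, required time) pairs is what lets the search stay polynomial while still covering every buffer assignment; on a tree, the candidate lists of two branches would additionally be merged at each branch point.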
All of the variants of Van Ginneken's algorithm and most other works in buffer insertion (with the exceptions of V. Adler and E. G. Friedman, "Repeater Design to Reduce Delay and Power in Resistive Interconnect", and S. Dhar and M. A. Franklin, "Optimum Buffer Circuits for Driving Long Uniform Lines") use both simplified gate and wire delay models. The Elmore delay model can significantly overestimate interconnect delay, as it incorporates only the first moment of the impulse response. Similarly, using lumped capacitance instead of effective capacitance can overestimate delay by ignoring resistive shielding, as described in Qian, Pullela, and Pillage, "Modeling the 'Effective Capacitance' for the RC Interconnect of CMOS Gates", IEEE Trans. Computer-Aided Design, 13(12), 1994, pp. 1526-1535. As the driver resistance becomes comparable to the resistance of the interconnect it drives, some of the downstream capacitance becomes shielded from the gate. In effect, the driver is not driving the entire downstream lumped capacitance but rather an effective capacitance that is less than the total lumped capacitance. It has been empirically shown that using an effective capacitance with k-factor equations yields delays within 10% of SPICE simulation.
FIG. 1 illustrates the magnitude of the errors that simple delay models can produce in a simple RC network. The RC network consists of resistor R1, having a value of 0.1 kΩ, between nodes N1 and N2 and resistor R2, having a value of 1.0 kΩ, between nodes N2 and N3. Capacitor C1, having a value of 100 fF, is connected between node N2 and ground, while capacitor C2, having a value of 1000 fF, is connected between node N3 and ground. Given an input slew of 300 ps at node N1, RICE (a reduced order interconnect analyzer) from Ratzlaff and Pillage, "RICE: Rapid Interconnect Circuit Evaluator using Asymptotic Waveform Evaluation", predicts a 10 ps delay from N1 to N2 and a 697 ps delay from N1 to N3. The corresponding Elmore delays are 110 ps and 1110 ps, respectively. Hence, the Elmore delay is wrong by more than a factor of ten for the delay from N1 to N2. Liu et al., in "Design and Implementation of a Global Router Based on a New Layout-Driven Timing Model with Three Poles", concur that Elmore delay causes over 100% overestimation error when compared to SPICE.
The total lumped capacitance seen at node N1 is 1100 fF, whereas for a step input, RICE predicts an effective capacitance of 158 fF. Since gate delays are roughly linear with respect to capacitance, using lumped instead of effective capacitance could lead to an error of a factor of seven. Nevertheless, previous works on buffer insertion use a linear delay model.
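The Elmore and lumped-capacitance figures quoted for FIG. 1 can be checked with a few lines of Python. The sketch below assumes C2 = 1000 fF, which is the value consistent with both the quoted Elmore delays and the 1100 fF total lumped capacitance; the function name is hypothetical, and the 158 fF effective capacitance is taken from the text rather than computed.

```python
def elmore_delays(resistances_kohm, capacitances_ff):
    """Elmore delay (ps) from the driven end to each node of an RC ladder.

    resistances_kohm[i] joins node i to node i+1; capacitances_ff[i] is the
    grounded capacitor at node i+1. kOhm * fF = ps.
    """
    delays = []
    total = 0.0
    for i, r in enumerate(resistances_kohm):
        # Each resistor charges all of the capacitance downstream of it.
        total += r * sum(capacitances_ff[i:])
        delays.append(total)
    return delays


R = [0.1, 1.0]        # R1, R2 in kOhm
C = [100.0, 1000.0]   # C1, C2 in fF (C2 = 1000 fF is an assumption, see above)

d_n2, d_n3 = elmore_delays(R, C)
lumped = sum(C)
ratio = lumped / 158.0  # lumped versus the effective capacitance quoted in the text

print(round(d_n2), round(d_n3), lumped, round(ratio, 2))
```

The computed delays reproduce the 110 ps and 1110 ps Elmore values, and the lumped-to-effective capacitance ratio comes out just under seven.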
Using inaccurate delay models can hurt buffer insertion algorithms in two ways. First, since they only roughly correspond to the true delay, even optimal solutions for these inaccurate models may be inferior when considering the true delay. Second, inaccurate delay modeling can cause a poor evaluation of the trade-off between the total number of buffers and the improvement in delay. For example, one might conclude from inaccurate delay modeling that inserting one buffer reduces the delay by 2 ns, when it actually reduces the delay by only 1.5 ns. If the net's slack is −1.7 ns, then one would conclude from the inaccurate delay models that inserting a single buffer would be sufficient to meet timing constraints. However, the new slack would not be +0.3 ns, but −0.2 ns, i.e., timing constraints are still not satisfied.
The present invention discloses a new buffer insertion algorithm which improves Van Ginneken's algorithm by using both accurate interconnect and gate delay models. In one embodiment of the present invention, the improvements are general enough to apply to all of the extensions to Van Ginneken's algorithm that have been proposed previously, e.g., noise avoidance, simultaneous tree construction, handling inverting buffers, and wiresizing. For interconnect delay, the present invention computes moments via a bottom-up incremental technique, performs moment matching to compute two poles and residues, and then computes delay using Newton-Raphson iterations. For gate delays, the present invention stores the downstream driving point admittances, i.e., π-models, at each node in the tree and then propagates these π-models up the tree. Experimental results on several nets in an industry design demonstrate that the runtime penalties for using the improved wire and gate delay models are not prohibitive. Furthermore, using the present invention produces buffered nets with significantly better slack along the critical paths than those produced by Van Ginneken's algorithm.
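To make the idea of propagating downstream π-models concrete, the following Python sketch carries the first three moments of a driving point admittance, Y(s) = y1·s + y2·s² + y3·s³, from a sink toward the source and reduces them to a π-model (near capacitance, resistance, far capacitance). The passage above does not give formulas, so the sketch uses a commonly used three-moment reduction as an assumption about what is meant; the function names are hypothetical, and the example values are those of FIG. 1 with C2 assumed to be 1000 fF. Units are kΩ, fF, and ps.

```python
def add_shunt_cap(y, c):
    """Add a grounded capacitor at the current node: only the first moment changes."""
    y1, y2, y3 = y
    return (y1 + c, y2, y3)


def through_series_res(y, r):
    """Move the admittance upstream across a series resistor.

    Uses Y_up = Y / (1 + r*Y), expanded to third order in s.
    """
    y1, y2, y3 = y
    return (y1, y2 - r * y1 ** 2, y3 - 2.0 * r * y1 * y2 + r ** 2 * y1 ** 3)


def pi_model(y):
    """Reduce three admittance moments to (c_near, r, c_far).

    Requires a resistive downstream path, i.e. y2 and y3 must be nonzero.
    """
    y1, y2, y3 = y
    c_far = y2 ** 2 / y3
    r = -(y3 ** 2) / y2 ** 3
    c_near = y1 - c_far
    return c_near, r, c_far


if __name__ == "__main__":
    y = (0.0, 0.0, 0.0)
    y = add_shunt_cap(y, 1000.0)      # C2 at N3 (assumed value)
    y = through_series_res(y, 1.0)    # R2
    y = add_shunt_cap(y, 100.0)       # C1 at N2
    print("pi-model seen at N2:", pi_model(y))
    y = through_series_res(y, 0.1)    # R1
    print("pi-model seen at N1:", pi_model(y))
```

At a branch point of a tree, the admittances of the child branches are in parallel, so their moments add term by term before the result is carried further upstream. The near and far capacitances of the reduced model always sum to the total downstream capacitance, since the first moment is preserved.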