Reducing power consumption in VLSI circuits has become important for several reasons. Mobile or portable electronic devices, which already account for a significant portion of all consumer electronics sold, are battery driven. Reducing power consumption in the various components of such systems prolongs the life of the batteries, which is highly desirable. Excessive power consumption also leads to an increase in chip packaging and cooling costs, which in turn increases the total system cost. Another benefit of reduced power consumption is increased reliability of VLSI circuits. Reducing average power consumption and reducing peak power consumption each have their own merits. For example, reducing average power consumption increases battery life, while reducing peak power consumption reduces packaging and cooling costs. This invention seeks to minimize average power consumption.
Most savings in power consumption can be obtained through a combination of several techniques applied at different levels of the design hierarchy. Several design and synthesis techniques have been proposed for power optimization at the technology level. See A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, "Low-power CMOS digital design," IEEE J. Solid-State Circuits, pp. 473-484 (April 1992).
Such techniques have been applied at the transistor level. See S. Devadas and S. Malik, "A survey of optimization techniques targeting low power VLSI circuits," in Proc. Design Automation Conf., pp. 242-247 (June 1995). The physical level of a sequential circuit design is another area where such techniques have been applied. See H. Vaishnav and M. Pedram, "PCUBE: A performance driven placement algorithm for low power designs," in Proc. European Conf. Design Automation, pp. 72-77 (September 1993). Similarly, such techniques have also been applied at the logic design level. See S. Devadas and S. Malik, "A survey of optimization techniques targeting low power VLSI circuits," in Proc. Design Automation Conf., pp. 242-247 (June 1995).
On the architectural power estimation front, a method based on a uniform white noise model of signal statistics was presented in S. R. Powell and P. M. Chau, "Estimating power dissipation of VLSI signal processing chips: the PFA technique," in Proc. VLSI Signal Processing IV, pp. 250-259 (1990). A more accurate estimation method based on a dual-bit-type model was presented in P. E. Landman and J. M. Rabaey, "Power estimation for high level synthesis," in Proc. European Conf. Design Automation, pp. 361-366 (February 1993) and P. E. Landman and J. M. Rabaey, "Black-box capacitance models for architectural power analysis," in Proc. Int. Wkshp. Low Power Design, pp. 165-170 (April 1994). The use of entropy as a measure of average switching activity, and its use in high-level power estimation, was suggested in D. Marculescu, R. Marculescu, and M. Pedram, "Information theoretic measures for energy consumption at the register-transfer level," in Proc. Int. Symp. Low Power Design, pp. 81-86 (April 1995) and F. N. Najm, "Towards a high-level power estimation capability," in Proc. Int. Symp. Low Power Design, pp. 87-92 (April 1995).
Early work in architectural power optimization was presented in A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, "Low-power CMOS digital design," IEEE J. Solid-State Circuits, pp. 473-484 (April 1992) and A. P. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, and R. Brodersen, "Optimizing power using transformations," IEEE Trans. Computer-Aided Design, vol. 14, pp. 12-31 (January 1995). In A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, "Low-power CMOS digital design," IEEE J. Solid-State Circuits, pp. 473-484 (April 1992), the use of architectural parallelism based on data path replication and pipelining was proposed to enable supply voltage scaling for power reduction. A methodology that used a variety of architectural transformations to reduce power consumption was presented in A. P. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, and R. Brodersen, "Optimizing power using transformations," IEEE Trans. Computer-Aided Design, vol. 14, pp. 12-31 (January 1995). In A. Chatterjee and R. K. Roy, "Synthesis of low power DSP circuits using activity metrics," in Proc. 7th Int. Conf. VLSI Design, pp. 265-270 (January 1994), switching activity metrics were used to reduce power consumption in bit-serial digital filters. Optimizing memory-dominated computations for power consumption was addressed in S. Wuytack, F. Catthoor, F. Franssen, L. Nachtergaele, and H. De Man, "Global communication and memory optimizing transformations for low power systems," in Proc. Int. Wkshp. Low Power Design, pp. 203-208 (April 1994), and D. Lidsky and J. Rabaey, "Low-power design of memory intensive functions," in Proc. Symp. Low Power Electronics, pp. 16-17 (October 1994). Tools for power estimation and design space exploration at the behavior level were presented in R. Mehra and J. Rabaey, "Behavioral level power estimation and exploration," in Proc. Int. Wkshp. Low Power Design, pp. 197-202 (April 1994). In L. Goodby, A. Orailoglu, and P. M. Chau, "Microarchitectural synthesis of performance-constrained, low-power VLSI designs," in Proc. Int. Conf. Computer Design, pp. 323-326 (October 1994), module selection and pipelining were used to combat the performance degradation that results from reducing the supply voltage. Methods for performing allocation and assignment in order to minimize switching activity and switched capacitance in the data path were presented in A. Raghunathan and N. K. Jha, "Behavioral synthesis for low power," in Proc. Int. Conf. Computer Design, pp. 318-322 (October 1994), A. Raghunathan and N. K. Jha, "An ILP formulation for low power based on minimizing switched capacitance during datapath allocation," in Proc. Int. Symp. Circuits & Systems, pp. 1069-1073 (May 1995), J. M. Chang and M. Pedram, "Register allocation and binding for low power," in Proc. Design Automation Conf., pp. 29-35 (June 1995), and A. Dasgupta and R. Karri, "Simultaneous scheduling and binding for power minimization during microarchitecture synthesis," in Proc. Int. Symp. Low Power Design, pp. 69-74 (April 1995). Techniques to reduce power consumption during high level synthesis based on reducing activity in functional units were presented in E. Musoll and J. Cortadella, "High-level synthesis techniques for reducing the activity of functional units," in Proc. Int. Symp. Low Power Design, pp. 99-104 (April 1995). The use of limited-weight codes to minimize power consumption in buses and I/O circuitry was described in M. Stan and W. P. Burleson, "Limited-weight codes for low-power I/O," in Proc. Int. Wkshp. Low Power Design, pp. 209-214 (April 1994). A multi-phase clocking scheme for RTL circuits that reduces activity by naturally imposing shut-off for inactive parts of the circuit was proposed in C. Papachristou, M. Spining, and M. Nourani, "A multiple clocking scheme for low power RTL design," in Proc. Int. Symp. Low Power Design, pp. 27-32 (April 1995). An optimization tool for average and peak power consumption during behavioral synthesis, based on genetic search, was described in R. S. Martin and J. P. Knight, "Power Profiler: Optimizing ASICs power consumption at the behavioral level," in Proc. Design Automation Conf., pp. 42-47 (June 1995). Techniques for software power estimation and optimization were presented in V. Tiwari, S. Malik, and A. Wolfe, "Power analysis of embedded software: a first step towards software power minimization," in Proc. Int. Conf. Computer-Aided Design (November 1994).
The importance of eliminating glitches in the design of digital VLSI circuits has been recognized for a long time. Avoiding glitches or hazards is known to be of great importance in asynchronous circuit design and in the design of D/A and A/D converters. Several studies have reported the importance of considering glitching power during power estimation and optimization. See M. Favalli and L. Benini, "Analysis of glitch power dissipation in CMOS IC's," in Proc. Int. Symp. Low Power Design, pp. 123-128 (April 1995), and S. Rajagopal and G. Mehta, "Experiences with simulation-based schematic-level power estimation," in Proc. Int. Wkshp. Low Power Design, pp. 9-14 (April 1994). The extreme sensitivity of glitching power to process variations has been pointed out in M. Favalli and L. Benini, "Analysis of glitch power dissipation in CMOS IC's," in Proc. Int. Symp. Low Power Design, pp. 123-128 (April 1995) and F. N. Najm and M. Y. Zhang, "Extreme delay sensitivity and the worst-case switching activity in VLSI circuits," in Proc. Design Automation Conf., pp. 623-627 (June 1995), where it was shown that the switching activity and power consumption due to glitches vary much more with process variations than the other components of power dissipation. The design of a multiplier with significantly reduced glitching power consumption was described in C. Lemonds and S. S. M. Shetti, "A low power 16 by 16 multiplier using transition reduction circuitry," in Proc. Int. Wkshp. Low Power Design, pp. 139-142 (April 1994). However, very few automated design and synthesis techniques exist for reducing glitching power consumption in general circuits. At the architecture and behavior levels, most previous work on power estimation and optimization ignores the effects of glitching. In particular, the effect of glitch propagation across the boundaries of blocks in the architecture has not been considered. While accurate library modeling approaches such as those described in P. E. Landman and J. M. Rabaey, "Black-box capacitance models for architectural power analysis," in Proc. Int. Wkshp. Low Power Design, pp. 165-170 (April 1994), can be used to account for the effect of glitches within architectural blocks, they typically assume that inputs to these blocks are glitch-free.
Most previous work at the architecture and behavior levels has also focused on data-flow intensive designs, where arithmetic units like adders and multipliers account for most of the total power consumption. However, in control-flow intensive designs, the power consumed by the functional units constitutes a small fraction of the total power consumption, while multiplexer networks and registers can consume a major part of the total power.
A large part of the register power consumption arises from transitions on the register's clock input. The technique of gating clocks has been used by designers to selectively turn off parts of a system. Methods to automatically detect conditions under which the clock inputs to all the registers in a design can be shut off, based on identifying self-loops and unreachable states in the state transition graph (STG), were presented in L. Benini, P. Siegel, and G. De Micheli, "Saving power by synthesizing gated clocks for sequential circuits," IEEE Design & Test of Computers, pp. 32-41 (Winter 1994). However, the techniques described in that reference can be applied only to the control and random logic parts of a design, for which it is feasible to extract the STG.
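The self-loop idea mentioned above can be sketched on a toy STG. This is only an illustration of the concept, not the method of the cited reference: the dictionary representation, the function name `self_loop_conditions`, and the state and input names are all hypothetical, and a real flow would also have to account for outputs and unreachable states.

```python
def self_loop_conditions(stg):
    """Return, for each state, the inputs under which the state does not change.

    `stg` is an assumed representation: a dict mapping (state, input) pairs
    to the next state. When next state == present state, the state registers
    would reload the value they already hold, so their clock is a candidate
    for gating under that (state, input) condition.
    """
    idle = {}
    for (state, inp), next_state in stg.items():
        if next_state == state:
            idle.setdefault(state, []).append(inp)
    return idle


# Hypothetical two-state controller used only for illustration.
example_stg = {
    ("IDLE", "0"): "IDLE",  # self-loop: clock could be gated
    ("IDLE", "1"): "RUN",
    ("RUN", "0"): "RUN",    # self-loop: clock could be gated
    ("RUN", "1"): "IDLE",
}

print(self_loop_conditions(example_stg))
```

Because the sketch enumerates every (state, input) pair explicitly, it also shows why the approach is limited to parts of a design whose STG is small enough to extract.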
An example of an RTL circuit which computes the greatest common divisor (GCD) of two numbers is shown in FIG. 1. The inputs are applied at XIN and YIN, and the GCD is written into register OUTPUT. Since the number of cycles required for computing the GCD depends on the input values provided, an additional output signal RDY indicates when the result is available in OUTPUT. This circuit was derived from a behavioral description of the GCD algorithm. A high-level synthesis system called SECONDS was used to perform resource allocation, scheduling, and assignment to produce the RTL circuit shown in FIG. 1. See S. Bhattacharya, S. Dey, and F. Brglez, "Performance analysis and optimization of schedules for conditional and loop-intensive specifications," in Proc. Design Automation Conf., pp. 491-496 (June 1994); S. Bhattacharya, S. Dey, and F. Brglez, "Clock period optimization during resource sharing and assignment," in Proc. Design Automation Conf., pp. 195-200 (June 1994); and S. Bhattacharya, S. Dey, and F. Brglez, "Provably correct high-level timing analysis without path sensitization," in Proc. Int. Conf. Computer-Aided Design, pp. 736-742 (November 1994).
The circuit shown in FIG. 1 consists of one functional unit (a subtractor), two equal-to (=) comparators, one less-than (<) comparator, registers, multiplexer trees, the controller finite state machine (FSM), and the decode logic. The decode logic generates the control signals that configure the multiplexers in the circuit. The controller FSM and the decode logic are referred to collectively as the control logic of the circuit. The logic expressions implemented by the control logic are also shown in the figure. The literals x0 through x4 represent the decoded present state lines from the controller. Literals c9, c10, and c15 represent results of the three comparators in the circuit.
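For orientation, the kind of behavioral description from which such a circuit could be derived can be sketched as follows. The actual behavioral specification is not reproduced here; this sketch assumes the classic subtraction-based GCD algorithm, which is consistent with a data path built from a subtractor and equal-to and less-than comparators. The function name and the iteration counter are illustrative only, and the iteration count is not the cycle count of the scheduled circuit.

```python
def gcd_behavioral(xin, yin):
    """Subtraction-based GCD (assumed form of the behavioral description).

    Returns (output, iterations). The data-dependent number of loop
    iterations illustrates why a ready signal such as RDY is needed.
    """
    if xin <= 0 or yin <= 0:
        raise ValueError("inputs must be positive integers")
    x, y = xin, yin
    iterations = 0
    while x != y:          # equality comparison
        if x < y:          # less-than comparison
            y = y - x      # subtraction
        else:
            x = x - y      # subtraction
        iterations += 1
    return x, iterations   # the result would be written to OUTPUT


print(gcd_behavioral(12, 18))  # (6, 2)
print(gcd_behavioral(5, 1))    # (1, 4)
```

The two calls take different numbers of iterations for inputs of similar size, which is the data dependence referred to above.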
The RTL circuit shown in FIG. 1 is mapped to a standard library such as the NEC CMOS6 library. See CMOS6 Library Manual, NEC Electronics, Inc. (December 1992). A simulation-based power calculation tool is used to measure power consumption in various parts of the design. See CSIM Version 5 Users Manual, Systems LSI Division, NEC Corp., 1993.
Table 1, shown in FIG. 20, provides the breakdown of the total power consumption into separate figures for functional units (subtractor and three comparators), random logic (controller FSM and decode logic blocks), registers (including power consumed due to clock transitions), and multiplexers. It indicates that most of the power consumption is in the multiplexers and registers. Several circuits that implement other control-flow intensive and mixed specifications also confirm these results.
Data has also been collected on the transition activity with and without glitches in various parts of the design. The transition activity without (excluding) glitches can be obtained by simulating the circuit under a zero-delay model. The simulations are performed using input vectors that are derived from the test bench for the behavioral specification. Table 2, shown in FIG. 21, gives the total bit transitions with and without glitches for all the control signals and selected data path signals. (CSIM counts each 0→1 or 1→0 transition as half a transition. Hence, the transition numbers that are reported throughout the present disclosure may be fractional.) Control signal contr[i] feeds the select input of the multiplexer marked [i] in FIG. 1, where i is an integer between 0 and 9. Similarly, data path signal dp[i] corresponds to the output of the multiplexer marked [i] in FIG. 1. Clearly, a significant portion of the total transition activity at several signals in the circuit is due to glitches. Several control signals in the GCD circuit, like contr[2] and contr[4], are highly glitchy. The generation of glitches on control signals will be analyzed below, and it will be illustrated that control signal glitches can have a profound effect on the glitching power consumption in the rest of the circuit.
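The counting convention described above can be sketched for a single-bit signal as follows. This is a minimal illustration, not the CSIM implementation: the function names and the sample waveforms are hypothetical, and a waveform is assumed to be a plain list of successive 0/1 values of one signal. The glitch contribution is taken as the difference between the count from a simulation with delays and the count from a zero-delay simulation of the same vectors.

```python
def count_transitions(waveform):
    """Count transitions on a single-bit signal.

    Each 0->1 or 1->0 change is counted as half a transition, following
    the convention stated in the text, so the result may be fractional.
    """
    changes = sum(1 for prev, cur in zip(waveform, waveform[1:]) if prev != cur)
    return 0.5 * changes


def glitch_transitions(delay_waveform, zero_delay_waveform):
    """Transitions attributable to glitches: with-delay minus zero-delay count."""
    return count_transitions(delay_waveform) - count_transitions(zero_delay_waveform)


# Hypothetical waveforms for one signal over the same input vectors.
zero_delay = [0, 1, 1, 0]               # settled value in each clock cycle
with_delays = [0, 1, 0, 1, 1, 0, 1, 0]  # same signal observed with gate delays

print(count_transitions(zero_delay))                # 1.0
print(count_transitions(with_delays))               # 3.0
print(glitch_transitions(with_delays, zero_delay))  # 2.0
```

In this made-up example two thirds of the signal's activity is glitching, which is the kind of comparison that the with-glitch and without-glitch columns of Table 2 support.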
The following example illustrates how ignoring glitches can result in designs that have sub-optimal power consumption.