1. Field of the Invention
The present invention pertains to the field of integrated circuit design.
More particularly, the present invention pertains to a process for optimizing power consumption in the design of complex integrated circuits.
2. Background of the Invention
Clock gating is a design technique used in digital integrated circuits to reduce dynamic power. As shown in FIG. 1 depicting circuit (100), the main idea is to replace the last stage of the clock network (not shown) with a gating element (101) that is controlled by an enable signal. The input clock Cin to the gating element (101) is from the clock network and the output clock Cout is used to drive the clock ports (i.e., clock inputs or clock input ports) of one or more storage element(s) (102) in the circuit (100). When the enable signal is asserted, the output clock Cout of the gating element (101) follows the input clock Cin, and the storage element(s) (102) are clocked. This allows the storage element(s) (102) to be updated with the state values at their inputs. Conversely, when the enable signal is deasserted, the output clock Cout is forced to either a logic 1 or a logic 0. This results in the storage element(s) (102) retaining their previous state values, and not getting updated with the state values at their inputs. This suppression of clock activity on Cout when the enable signal is deasserted provides the main savings in dynamic power with the use of clock gating. In the example described above, the storage element(s) (102) are said to be clock-gated with the gating element (101) using the enable signal to enable the Cin clock signal for the clock port of the storage element(s) (102). Throughout this document in the same manner as the above description, a name (e.g., Cin) may be used to refer, interchangeably, to an electrical signal (e.g., Cin clock signal), a physical pin carrying the electrical signal (e.g., Cin pin), or a logical port implemented by the physical pin (e.g., clock input port of the gating element (101)).
Those skilled in the art will appreciate that the storage element(s) (102) may refer to a single storage element with a single-bit input/output or multiple storage elements with multi-bit inputs/outputs without deviating from the spirit of the invention. Throughout this document and in all Figures, the terms storage element(s), input(s), output(s), and clock(s) are intended to include both the singular and the plural forms unless specifically excluded.
Clock gating a digital integrated circuit includes (a) identifying the storage elements whose last stage clock input buffer can be replaced by a gating element, and (b) generating the logic for the enable signal of each gating element. It is a requirement to ensure that any suppression of clock activity (or equivalently de-assertion of the enable signal) does not change the functionality of the digital circuit. One clock-gating technique that satisfies this requirement is to identify the condition under which the inputs to the storage elements are identical to the state values already present in the storage elements. This condition is referred to as the quiescent input state condition (or quiescent condition). Under this condition, clocking the storage elements does not change the state values and hence the corresponding clocks to the storage elements can be suppressed for as long as the quiescent input state condition holds. The quiescent condition of a storage element is induced by signals, involved in generating all the data inputs to the storage element, staying unchanged from the previous clock cycle to the current clock cycle with respect to the clock of the storage element. Therefore, a signal is said to be in a quiescence inducing condition in the current clock cycle of the storage element, when it is unchanged from the previous clock cycle to the current clock cycle.
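The quiescent input state condition can be illustrated with a small behavioral model. The following is a minimal sketch only, not the enable-generation logic of any particular design: the function names `simulate_register` and `quiescent_enables` and the example values are hypothetical, and the register is assumed to capture its input once per clock cycle.

```python
def simulate_register(d_inputs, initial_state, enables=None):
    """Return the stored state after each clock cycle.

    enables[i] == False models a suppressed clock in cycle i;
    enables is None models an ungated (always clocked) register.
    """
    state = initial_state
    trace = []
    for i, d in enumerate(d_inputs):
        if enables is None or enables[i]:
            state = d  # clock edge: capture the value at the input
        trace.append(state)
    return trace


def quiescent_enables(d_inputs, initial_state):
    """Deassert the enable exactly when the input equals the stored state."""
    state = initial_state
    enables = []
    for d in d_inputs:
        enables.append(d != state)  # quiescent condition holds when d == state
        state = d  # after the cycle the state equals d in either case
    return enables


# Hypothetical input sequence for illustration.
inputs = [5, 5, 7, 7, 7, 2]
enables = quiescent_enables(inputs, initial_state=0)
ungated = simulate_register(inputs, 0)
gated = simulate_register(inputs, 0, enables)
```

In this sketch the gated and ungated traces are identical, while the clock is suppressed in every cycle where the input already matches the stored state.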
To illustrate this, consider the circuit (200) in FIG. 2. The Cout clock to the storage elements (202) is a buffered version of the Cin clock at the input to the last stage buffer (201) in the clock network (not shown). In this circuit (200), assume the Select signal to the multiplexer (M) is at a logic value of 1 for clock cycle 1, and is at a logic value of 0 for cycles 2 and 3. Since the Select signal is at a logic value of 0, the state value stored in the storage element (202) is fed back as its input (203) via input 0 of the multiplexer (M) in cycles 2 and 3. It is observed that, for clock cycles 2 and 3, the inputs (203) to the storage elements are identical to their stored state values. This satisfies the quiescent condition described above, and hence provides the opportunity to clock-gate these storage elements (202). The signal (203) is said to be in quiescence inducing condition in clock cycles 2 and 3 for the storage element (202).
As shown in FIG. 3, a corresponding clock-gated design (300) is essentially the same as the circuit (200) of FIG. 2 except that a gating element (301) is used to replace the last stage buffer (201) in the circuit (200). The Select signal used in multiplexer (M) can be used directly as the enable signal (denoted as Select (Enable) in FIG. 3) for the clock gating element (301). Note that when the enable signal is asserted high in cycle 1, the Cout clock follows the Cin clock and the storage element (202) is clocked. However, in cycles 2 and 3, the clock transitions to the storage element (202) are suppressed. In this example, the quiescent condition of the storage element (202) and the quiescence inducing condition of the signal (203) to enable clock gating may be identified by analyzing the feedback structure of logic around the storage elements (202).
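The equivalence of the circuits of FIGS. 2 and 3 can be checked with a cycle-level model. The sketch below is illustrative only: the function names, the data values, and the convention that input 1 of the multiplexer carries new data are assumptions made for the example, and the model counts clock edges at the storage element as a rough proxy for clock activity.

```python
def ungated_mux_register(select, data, initial_state):
    """FIG. 2 style: register clocked every cycle, mux feeds back on Select == 0."""
    state = initial_state
    trace = []
    edges = 0
    for s, d in zip(select, data):
        state = d if s == 1 else state  # mux: 1 selects new data, 0 selects feedback
        edges += 1  # clock reaches the register in every cycle
        trace.append(state)
    return trace, edges


def gated_mux_register(select, data, initial_state):
    """FIG. 3 style: Select also serves as the enable of the gating element."""
    state = initial_state
    trace = []
    edges = 0
    for s, d in zip(select, data):
        if s == 1:  # enable asserted: Cout follows Cin
            state = d  # mux selects new data whenever the register is clocked
            edges += 1
        trace.append(state)  # enable deasserted: state is retained
    return trace, edges


# Select is 1 in cycle 1 and 0 in cycles 2 and 3, as in the description above.
select = [1, 0, 0]
data = [0xA, 0xB, 0xC]  # hypothetical data values
ref_trace, ref_edges = ungated_mux_register(select, data, 0)
cg_trace, cg_edges = gated_mux_register(select, data, 0)
```

Both models produce the same state in every cycle, while the gated model clocks the storage element only in cycle 1.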
Another quiescent input state condition is illustrated in FIGS. 4 and 5. As shown in FIG. 4, the storage element (A) (register A) is clock-gated with clock gating element (401) which uses the enable signal ENa and clock Cin to derive CoutA as the clock input to the storage element (A). However, the clock CoutB to storage element (B) (register B) is driven by a last stage clock buffer (402) with input Cin.
Example signal waveforms for ENa and Cin and the corresponding behavior of the CoutA and CoutB clocks are shown in FIG. 5. Although Cin and CoutB toggle in all six cycles 1-6, CoutA only toggles for those clock cycles in which ENa is at a logic value of 1, and stays at 0 when ENa is at a logic value of 0. In cycle 2, register A does not get clocked since CoutA is at 0 while register B captures the data value at its input (403). In cycle 3, the input (403) to register B has not changed, since in the previous clock cycle (cycle 2) register A did not get clocked. The signal (403) is said to be in a quiescence inducing condition for register B in cycle 3. Here, cycles 2 and 3 are referred to as the previous and current clock cycles respectively, of the quiescence inducing condition. At the same time in cycle 3, register B satisfies the quiescent input condition since its input is identical to the state value in its storage elements. Hence, the clock CoutB can be suppressed in cycle 3 without changing the functionality of the design.
As shown in FIG. 6A, a corresponding clock-gated design (600) is essentially the same as the circuit (400) of FIG. 4 except that a gating element (601) is used to replace the last stage buffer (402) in the circuit (400). The observation described with respect to FIG. 4 can be generalized to state that the clock CoutB can be suppressed for every clock cycle immediately after a clock cycle in which clock CoutA is suppressed. The enable signal ENb for the gating element (601) used to generate CoutB is generated by delaying ENa by a single clock cycle of Cin. This is accomplished by using a single storage element (C) that is clocked by a buffered version of Cin as shown in FIG. 6A. Throughout this document, a first signal is said to be coupled to a second signal if the first signal is a buffered version or an inverted version of the second signal.
Example signal waveforms for both enable signals ENa and ENb and the corresponding clock signals Cin, CoutA, and CoutB are shown in FIG. 6B. Note that CoutB toggles one cycle later for every clock cycle in which CoutA toggles.
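The delayed-enable scheme of FIGS. 6A and 6B can be sketched as a cycle-level simulation. This is an illustrative model under stated assumptions, not the circuit itself: the function name `simulate_pipeline`, the waveform and data values, and in particular the reset value of storage element (C), taken here as 1 so that register B is clocked in the first cycle, are not given in the description above.

```python
def simulate_pipeline(ena, a_inputs, a_init, b_init, gate_b):
    """Simulate register A (gated by ENa) feeding register B.

    When gate_b is True, B is gated by ENb, which is ENa delayed by one
    cycle through storage element C. Returns B's state trace and the
    number of clock edges seen by B.
    """
    a, b = a_init, b_init
    c = 1  # assumed reset value of C (not specified in the source text)
    b_trace = []
    b_edges = 0
    for en, d in zip(ena, a_inputs):
        enb = c if gate_b else 1
        # All registers update on the same edge, so B samples A's old output.
        next_b = a if enb else b
        next_a = d if en else a
        b_edges += 1 if enb else 0
        a, b, c = next_a, next_b, en  # C captures ENa for use in the next cycle
        b_trace.append(b)
    return b_trace, b_edges


# Hypothetical six-cycle ENa waveform and data values.
ena = [1, 0, 1, 1, 0, 0]
a_inputs = [3, 4, 5, 6, 7, 8]
ref_trace, ref_edges = simulate_pipeline(ena, a_inputs, 0, 0, gate_b=False)
cg_trace, cg_edges = simulate_pipeline(ena, a_inputs, 0, 0, gate_b=True)
```

In this example register B holds the same values with and without gating, and its clock is suppressed in each cycle that immediately follows a cycle in which register A was not clocked.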
Feedback analysis (as illustrated in FIGS. 2 and 3) and simple pipelined structures (as illustrated in FIGS. 4-6B) represent two ways to take advantage of the quiescent input state condition to clock-gate storage elements. Besides the requirement to ensure functionality is unchanged, any method to clock-gate a digital circuit must ensure the overall area penalty is minimized. There are two contributors to the area penalty in clock-gating a digital circuit: (a) the area increase when replacing the last stage buffer of the clock network with a clock gating element, and (b) the additional area of the logic created for the enable signals to the gating elements.
To address the area penalty, storage elements are typically not individually clock-gated. As shown in FIG. 6A, register B may represent a set of storage elements where signal (403) may represent signals of an input bus. One skilled in the art will recognize that the set of storage elements in register B may be combined into a single clock-gating group. This enables sharing of both the clock-gating element (601) and the additional logic used to create the enable signal ENb, thus reducing the area penalty compared to clock gating each individual storage element separately. Although the typical size of clock-gating groups can vary for different storage elements in a circuit, a rule-of-thumb used in prior art is to achieve an average clock-gating group size of 16 for the entire circuit.
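The effect of grouping on the number of gating elements can be estimated with simple arithmetic. The sketch below is a rough illustration only: it assumes every storage element can be placed in a group whose members all share one enable signal, which need not hold in a real circuit, and the function name and the element counts are hypothetical.

```python
import math


def gating_elements_needed(num_storage_elements, group_size):
    """Estimate gating elements when each group shares one gating element."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return math.ceil(num_storage_elements / group_size)


# Hypothetical design with 1024 storage elements.
individually_gated = gating_elements_needed(1024, 1)
grouped_by_16 = gating_elements_needed(1024, 16)
```

Under these assumptions, a group size of 16 needs one sixteenth of the gating elements, and of the associated enable logic, required when each storage element is gated individually.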
The methods described above for using the quiescent input condition to clock-gate a circuit are limited. For example, feedback analysis requires the circuit to have storage elements with feedback logic structures. Such feedback structures are absent or used sparsely in certain classes of digital circuits, such as those used in networking and data-flow intensive applications. Similarly, many circuits do not conform to the simple pipelined topology shown in FIG. 4. The presence (not shown) of multiple registers that source data (403) (instead of a single register (A) as in FIG. 4) or of more complex combinational logic between pipeline stages (instead of a single buffer (403) as in FIG. 4) limits the applicability of simple pipelined analysis to clock-gate the destination registers as illustrated in FIGS. 4-6B.
It would therefore be desirable to have a method to clock-gate a digital circuit without relying solely on feedback analysis or simple pipelined structures. Such a method would enable a plurality of storage elements in the circuit to be clock-gated, leading to a greater reduction in the total dynamic power of the design. The method should also guarantee that the functionality of the design is unchanged; reduce the area penalty by constructing the enables using a minimal number of signals; and allow storage elements to be grouped into reasonably sized clock-gating groups prior to replacing the last stage buffer in the clock network with a clock-gating element.