Most digital integrated circuits today are fabricated using a CMOS process. One reason CMOS has become prevalent is that it dissipates less power than competing technologies, but as chip speeds rise, even CMOS chips consume too much power.
Clock gating is a well-known technique in conventional synchronous CMOS circuits.
A clock is a global signal, distributed to all storage elements, which are conventionally implemented as D-type flip-flops. In conventional circuits without clock gating, storage elements that do not need to capture new data are either disabled explicitly or a multiplexer is used to feed the current output of the storage element back to the input. These two options are shown in FIG. 1. FIG. 1 also illustrates terminology that is used in the following: a denotes a data input to the D-type flip-flop D-FF, whereas b denotes the data output. Input data a is supplied to an input data terminal and the output data b is taken from an output data terminal. An enable signal is supplied to an enable input of the D-type flip-flop, and a clock signal φ is supplied to the clock terminal. The diagram on the right hand side of FIG. 1 shows a multiplexer M receiving the input data a and the output data b and being controlled by the enable signal.
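The multiplexer feedback arrangement on the right hand side of FIG. 1 can be illustrated with a small behavioural model. This is only a sketch: the names `mux`, `DFlipFlop` and `run_mux_feedback` are hypothetical, and each loop iteration is assumed to stand for one active clock edge.

```python
def mux(select, when_one, when_zero):
    """Two-input multiplexer M: pass when_one if select is 1, else when_zero."""
    return when_one if select else when_zero


class DFlipFlop:
    """Behavioural D-type flip-flop; b is the data output."""

    def __init__(self, initial=0):
        self.b = initial
        self.clock_events = 0  # number of clock edges applied to this flip-flop

    def clock_edge(self, d):
        # On every clock edge the flip-flop captures whatever is on its input.
        self.clock_events += 1
        self.b = d
        return self.b


def run_mux_feedback(a_values, enables, initial=0):
    """Apply one clock edge per (a, enable) pair and return (outputs, edges)."""
    ff = DFlipFlop(initial)
    outputs = []
    for a, enable in zip(a_values, enables):
        # When enable is 0 the current output b is fed back to the input.
        d = mux(enable, a, ff.b)
        outputs.append(ff.clock_edge(d))
    return outputs, ff.clock_events


outputs, edges = run_mux_feedback([1, 0, 1, 0], [1, 0, 0, 1])
print(outputs, edges)
```

In this model the flip-flop receives a clock edge on every cycle, including the cycles in which the enable signal is 0 and the stored value cannot change.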
Applying a clock input to a flip-flop that does not change its output is a waste of power, so schemes have been developed that avoid useless clocking. One such scheme is shown in FIG. 2. An AND gate 100 only applies the clock φ when the flip-flop D-FF is enabled, and a transparent latch TL is used to make sure that glitches on the enable wire carrying the enable signal do not cause unintentional clock pulses on the clock input to the flip-flop.
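The role of the transparent latch in this scheme can be sketched with sampled waveforms. The model below is an illustration under stated assumptions, not a description of FIG. 2 itself: the latch is assumed to be transparent while φ is low and to hold while φ is high, each list element is one time sample, and the function names are hypothetical.

```python
def gated_clock(phi, enable, use_latch=True):
    """Return the waveform at the AND gate output for sampled phi and enable."""
    latched = 0
    out = []
    for clk, en in zip(phi, enable):
        if use_latch:
            if clk == 0:
                latched = en  # assumed: latch is transparent while phi is low
            gate_enable = latched  # held value is used while phi is high
        else:
            gate_enable = en  # no latch: raw enable drives the AND gate
        out.append(clk & gate_enable)
    return out


def rising_edges(wave):
    """Count 0 -> 1 transitions, assuming the signal starts at 0."""
    count, previous = 0, 0
    for value in wave:
        if value and not previous:
            count += 1
        previous = value
    return count


phi = [0, 0, 1, 1] * 3
# Enabled for the first cycle only; a glitch occurs at sample 7 while phi is high.
enable = [1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]

print(rising_edges(gated_clock(phi, enable, use_latch=True)))
print(rising_edges(gated_clock(phi, enable, use_latch=False)))
```

With the latch, only the intended clock pulse reaches the flip-flop. Without it, the glitch on the enable wire during the high phase produces a second, unintended pulse.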
This scheme has been proposed in various academic papers, such as:
    Automatic Insertion of Gated Clocks at Register Transfer Level, N. Raghavan, V. Akella and S. Bakshi, Proceedings of the Twelfth International Conference on VLSI Design, 1999, pp 48–54
    Symbolic Synthesis of Clock-Gating Logic for Power Optimization of Synchronous Controllers, L. Benini, G. De Micheli, E. Macii, M. Poncino, R. Scarsi, ACM Transactions on Design Automation of Electronic Systems, vol 4 no 4, October 1999
    Synthesis of Low-Power Selectively-Clocked Systems from High-Level Specification, L. Benini and G. De Micheli, ACM Transactions on Design Automation of Electronic Systems, vol 5 no 3, July 2000
Another power saving technique is referred to herein as “guarding”. Logic blocks in CMOS circuits only consume appreciable power when their inputs change. It is possible to reduce the power taken by the whole circuit if the inputs to small portions of the circuit are only permitted to change when the outputs of that small portion are needed. FIG. 3 shows the basic idea.
The output of the logic block L may not always be clocked into the flip-flop. If the inputs to the logic change when the output is not required, energy will be needlessly consumed inside the logic block.
Guarding solves this problem by placing additional guarding logic GL, in the form of additional gates, between the inputs of the logic block L and the registers (D-type flip-flops) that supply their output data b to those inputs, as shown on the right of FIG. 3. These additional gates block any inputs to the logic L that do not produce a useful output, but the inputs are delayed by passing through the additional logic, and this can affect the speed of the circuit. Techniques for low-power design need to avoid slowing down the logic, because speed is almost always important. If speed is not important, it is easier to trade speed for power by simply lowering the supply voltage Vdd.
The additional gates GL inserted for guarding can be either simple gates (AND or OR) or transparent latches, but there are advantages and disadvantages associated with each:
    AND and OR gates have a small additional delay, and can often be absorbed into the logic block at the technology mapping stage. Unfortunately, they do not block input transitions, but simply force the output to one rail or the other. This can often lead to more input changes than without the guarding logic, which then outweighs the power saving.
    Transparent latches block input changes effectively, but they have significant propagation delay, and this is likely to slow down the circuit.
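The difference between the two kinds of guarding logic can be illustrated by counting transitions on a single guarded input of the logic block. This is a minimal sketch with hypothetical names: `needed` is assumed to be 1 on samples where the output of the logic block is required, and the transition count is used only as a rough proxy for switching power.

```python
def transitions(wave):
    """Count the number of value changes between consecutive samples."""
    return sum(1 for x, y in zip(wave, wave[1:]) if x != y)


def and_guard(data, needed):
    """AND-gate guard: forces the logic input to 0 whenever it is not needed."""
    return [d & n for d, n in zip(data, needed)]


def latch_guard(data, needed, initial=0):
    """Transparent-latch guard: holds the last passed value when not needed."""
    held = initial
    out = []
    for d, n in zip(data, needed):
        if n:
            held = d  # latch is transparent only while the output is needed
        out.append(held)
    return out


# Case 1: a steady input whose result is needed only on alternate samples.
steady = [1, 1, 1, 1, 1, 1]
needed_alternate = [1, 0, 1, 0, 1, 0]
print(transitions(steady),
      transitions(and_guard(steady, needed_alternate)),
      transitions(latch_guard(steady, needed_alternate)))

# Case 2: a busy input whose result is needed only at the start and the end.
busy = [0, 1, 0, 1, 0, 1]
needed_rarely = [1, 0, 0, 0, 0, 1]
print(transitions(busy),
      transitions(and_guard(busy, needed_rarely)),
      transitions(latch_guard(busy, needed_rarely)))
```

In the first case the AND gate introduces transitions that the unguarded input never had, because it forces the input to a rail each time the guard closes, while the latch adds none. In the second case both guards suppress most of the activity. The model ignores propagation delay, which is the cost attributed to latches above.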
Guarding using AND gates and OR gates is an established technique. Guarding using transparent latches is mentioned in the following two papers:
    Guarded Evaluation: Pushing Power Management to Logic Synthesis/Design, V. Tiwari, S. Malik, P. Ashar, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol 17 iss 10, October 1998
    Automating RT-Level Operand Isolation to Minimize Power Consumption in Datapaths, M. Munch, B. Wurth, R. Mehra, J. Sproch, N. Wehn, Proceedings of the Design Automation and Test in Europe conference (DATE 2000), March 2000
In the early days of synchronous circuits, two-phase clocking schemes were used [Introduction to VLSI Systems, C. Mead and L. Conway, Addison Wesley 1980]. The storage elements used were transparent latches TL, with alternate latch banks clocked off opposite phases of the clock. A complete cycle of the circuit consists of a rising edge on φ1, a falling edge on φ1, a rising edge on φ2 and then a falling edge on φ2. The two-phase clocking scheme is shown in FIG. 4.
As clock speeds rose, it became more difficult to distribute a pair of high-speed clocks with the correct timing relationships, so single-phase clocking started to dominate. Today, single-phase clocking is universally employed. Single-phase design still uses two latches per stage, as shown in FIG. 5, but the latches are combined into a single storage element, known as a D-type flip-flop. It is to be noted that in the present description φ is used to denote a clock input which provides both phases and which is conventionally referred to in the art as clk.
The D-type is considered to be a single state-holding element comprising two transparent latches, with the state held at the output of the right-hand transparent latch. In a clock-gated chip using D-types, the left-hand latch is only used to stop shoot-through, and there is no useful state held on the internal node of the D-type.
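The view of a D-type flip-flop as two transparent latches can be sketched behaviourally. The model below makes assumptions that are not stated in the text: the left-hand latch is taken to be transparent while φ is low and the right-hand latch while φ is high, each call to `step` represents one time sample, and the class names are hypothetical.

```python
class TransparentLatch:
    """Level-sensitive latch: follows d while transparent, otherwise holds."""

    def __init__(self, initial=0):
        self.q = initial

    def update(self, d, transparent):
        if transparent:
            self.q = d
        return self.q


class TwoLatchDFlipFlop:
    """D-type flip-flop modelled as a left-hand and a right-hand latch."""

    def __init__(self):
        self.left = TransparentLatch()
        self.right = TransparentLatch()

    def step(self, d, phi):
        # Assumed polarity: left latch open while phi is low,
        # right latch open while phi is high.
        internal = self.left.update(d, transparent=(phi == 0))
        return self.right.update(internal, transparent=(phi == 1))


ff = TwoLatchDFlipFlop()
phi_wave = [0, 1, 1, 0, 0, 1]
d_wave = [1, 0, 0, 0, 0, 0]
outputs = [ff.step(d, phi) for d, phi in zip(d_wave, phi_wave)]
print(outputs)
```

In this trace the data input falls to 0 while φ is still high, but the output keeps the value captured at the rising edge because the left-hand latch is closed. That is the shoot-through protection described above, and the output only takes the new value at the next rising edge of φ.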
It is an aim of the present invention to allow a circuit using D-type flip-flops to be converted into a circuit using guard-flops without requiring any specialist knowledge on the part of the designer.