1. Field of the Invention
The present invention relates to a pipeline-based circuit. In particular, the present invention discloses a pipeline-based circuit utilizing a postponed clock-gating mechanism for reducing power consumption.
2. Description of the Prior Art
An accurate clock signal is a key factor for a logic circuit to perform correct logic operations. The clock signal drives kernel circuit units, such as counters and registers, within the logic circuit, and a stable clock signal, such as one generated from a crystal oscillator, functions as a reference clock to coordinate the operations of the circuit units within the logic circuit. However, not all of the circuit units within the logic circuit are always active. When some of the circuit units enter an idle mode, these idle circuit units do not need to be driven continuously by the clock signal. If the clock signal is still inputted into the idle circuit units, power consumption of the logic circuit is increased unnecessarily. It is well known that the power consumption of the logic circuit mainly comes from delivering the clock signal to the circuit units and enabling them to run related logic operations. In order to reduce power consumption of a logic circuit such as a microprocessor, the clock signal is gated so that it does not trigger the idle circuit units, and the unwanted power consumption is eliminated accordingly. In other words, the clock signal transferred to an idle circuit unit is first converted into a signal with a fixed logic value (“1” or “0”). Taking a square-wave clock signal as an example, the logic value “1” corresponding to a high voltage and the logic value “0” corresponding to a low voltage alternate. The clock signal is gated once it is converted to hold either the logic value “1” or the logic value “0”. Because the logic circuit drives the internal circuit unit with a converted clock signal holding a fixed logic value, the operation associated with the circuit unit is blocked, and the total power consumption of the logic circuit is reduced.
The above-mentioned process is a well-known clock-gating mechanism.
Please refer to FIG. 1 and FIG. 2. FIG. 1 is a schematic diagram of a prior art clock-gating circuit 10, and FIG. 2 is a timing diagram of signals of the clock-gating circuit 10 shown in FIG. 1. The clock-gating circuit 10 has a controller 12 and a plurality of logic gates 14a, 14b, 14c. Each of the logic gates 14a, 14b, 14c performs an AND logic operation. The controller 12 includes a plurality of clock control units 13a, 13b, 13c respectively used for outputting control signals 15a, 15b, 15c to the corresponding logic gates 14a, 14b, 14c. In addition, a system clock generator 16 supplies a clock signal 17 to the logic gates 14a, 14b, 14c. The logic gates 14a, 14b, 14c then respectively output clock-gating output signals 18a, 18b, 18c, which are used to drive the corresponding logic units 20a, 20b, 20c.
When the logic unit 20a enters the idle mode, the clock control unit 13a is activated to gate the clock signal 17 through the control signal 15a. Please refer to FIG. 2. During a period t0˜t2, the control signal 15a holds the logic value “1” corresponding to a high voltage. Therefore, the clock signal 17 successfully passes through the logic gate 14a. That is, the waveform of the clock-gating output signal 18a is identical to the waveform of the clock signal 17, and the clock-gating output signal 18a drives the running logic unit 20a successfully. However, when the logic unit 20a does not need to be activated during a period t2˜t4, the clock control unit 13a outputs the control signal 15a with the logic value “0” corresponding to a low voltage. The clock signal 17 is gated through the logic gate 14a. That is, the clock-gating output signal 18a holds the constant logic value “0” during the period t2˜t4, and the operation of the logic unit 20a is interrupted to reduce power consumption. During a period t4˜t5 and a period t7˜t8, the control signal 15a corresponds to the logic value “1” so that the clock signal 17 is inputted into the logic unit 20a again. During a period t5˜t7, the logic unit 20a does not need to be activated. Therefore, the control signal 15a then corresponds to the logic value “0” to gate the clock signal 17 from driving the logic unit 20a for reducing power consumption.
Similarly, with regard to other logic units 20b, 20c, the clock control units 13b, 13c output the control signals 15b, 15c corresponding to the logic value “0” to gate the clock signal 17 through the logic gates 14b, 14c when the logic units 20b, 20c do not need to be activated. For the logic unit 20b, the clock control unit 13b gates the clock signal 17 to reduce power consumption during periods t0˜t1, t3˜t4, t5˜t7. For the logic unit 20c, the clock control unit 13c gates the clock signal 17 to reduce power consumption during a period t5˜t6. Please note that the operations associated with the logic units 20b, 20c are not repeated for simplicity.
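The AND-based gating described above for the logic unit 20a can be illustrated with a minimal Python sketch. The sketch is only illustrative and rests on an assumption that is not stated in the description of FIG. 2: each interval between adjacent time marks (t0˜t1, t1˜t2, and so on) is taken to span one full period of the clock signal 17, sampled as a high half followed by a low half. The function names are hypothetical.

```python
def and_gate(clock, control):
    """Model a logic gate 14a: each output sample is clock AND control."""
    return [c & e for c, e in zip(clock, control)]


def count_rising_edges(signal, initial=0):
    """Count 0 -> 1 transitions, starting from the given initial level."""
    edges = 0
    previous = initial
    for sample in signal:
        if previous == 0 and sample == 1:
            edges += 1
        previous = sample
    return edges


# Assumption: eight intervals t0~t8, one clock period per interval,
# each period sampled as [high, low].
clock_17 = [1, 0] * 8

# Control signal 15a per interval, following the description of FIG. 2:
# "1" in t0~t2, "0" in t2~t4, "1" in t4~t5, "0" in t5~t7, "1" in t7~t8.
control_15a_by_interval = [1, 1, 0, 0, 1, 0, 0, 1]
control_15a = [v for v in control_15a_by_interval for _ in range(2)]

output_18a = and_gate(clock_17, control_15a)

print("rising edges of clock signal 17:", count_rising_edges(clock_17))
print("rising edges of output signal 18a:", count_rising_edges(output_18a))
```

Under these assumptions, the logic unit 20a receives four rising edges instead of eight, and the clock-gating output signal 18a holds the logic value “0” throughout the gated intervals.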
For the logic circuit, a pipeline structure, generally speaking, is adopted to improve processing efficiency. Please refer to FIG. 3, which is a block diagram of a prior art pipeline-based circuit 30. The pipeline-based logic circuit 30 includes a plurality of processing units 32a, 32b, 32c, a pipeline control unit 34, and a clock-gating unit 36. Each of the processing units 32a, 32b, 32c includes a logic unit 38a, 38b, 38c and a buffer unit 40a, 40b, 40c. The logic units 38a, 38b, 38c are used to perform predetermined logic operations respectively. For example, the logic unit 38a, 38b, or 38c can be an adder for doing binary addition or a multiplier for doing binary multiplication.
The buffer units 40a, 40b, 40c corresponding to the logic units 38a, 38b, 38c are used to store calculation results outputted from the logic units 38a, 38b, 38c. Then, a calculation result currently stored in one buffer unit is passed to the logic unit of the next processing unit. The buffer units 40a, 40b, 40c can be prior art flip-flops. If the logic unit 38a outputs a calculation result having a bit length of 64, the buffer unit 40a needs 64 flip-flops to store the calculation result. In addition, a clock signal is necessary for controlling the buffer units 40a, 40b, 40c to store the calculation results generated from the logic units 38a, 38b, 38c and to output the stored calculation results.
The pipeline control unit 34 is used to control the operation of the pipeline established by the processing units 32a, 32b, 32c. As shown in FIG. 3, the pipeline control unit 34 is capable of outputting control signals PA, PB, PC to control the processing units 32a, 32b, 32c. For example, input data DATA_IN is inputted into the processing unit 32a, and the logic unit 38a starts processing the input data DATA_IN according to a predetermined logic operation. After the predetermined logic operation is done, the pipeline control unit 34 generates the control signal PA according to current operating modes of the processing units 32a, 32b, 32c, and outputs the control signal PA to the processing unit 32a for activating the buffer unit 40a to store a calculation result generated from the logic unit 38a. At the same time, the stored calculation result is passed to the next processing unit 32b. As mentioned above, one of the processing units 32a, 32b, 32c in the pipeline-based logic circuit 30 may not be used to process the input data DATA_IN. For example, after the input data DATA_IN has been processed by the processing units 32a, 32b, a branch may occur to terminate the process for the input data DATA_IN. Therefore, the processing unit 32b does not need to pass its calculation result to the next processing unit 32c for following operations. For the input data DATA_IN, any processing units following the processing unit 32b do not need to be activated. In other words, the related buffer units do not need to transfer calculation results stage by stage. Therefore, a prior art clock-gating mechanism can be adopted to reduce power consumption of the inactive buffer units.
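The stage-by-stage behavior described above can be sketched in Python as follows. This is a simplified behavioral model, not the circuit of FIG. 3: the class name, the function names, the three arithmetic operations, and the per-cycle enable pattern are all hypothetical choices made for illustration, and each enable flag merely stands in for the effect of the control signals PA, PB, PC on the buffer units.

```python
class ProcessingUnit:
    """One pipeline stage: a logic unit followed by a buffer unit."""

    def __init__(self, logic):
        self.logic = logic       # stands in for a logic unit such as an adder
        self.buffer = None       # stands in for the flip-flops of the buffer unit
        self.clock_events = 0    # number of times the buffer unit was clocked

    def clock(self, data_in, enable):
        """Store the logic result on a clock edge, but only when enabled."""
        if enable:
            self.buffer = self.logic(data_in)
            self.clock_events += 1


def run_pipeline(stages, data_in, enables_per_cycle):
    """Advance the pipeline once per entry in enables_per_cycle."""
    for enables in enables_per_cycle:
        # Sample all stage inputs before any buffer updates, as
        # edge-triggered flip-flops would.
        inputs = [data_in] + [stage.buffer for stage in stages[:-1]]
        for stage, value, enable in zip(stages, inputs, enables):
            stage.clock(value, enable)
    return [stage.buffer for stage in stages]


# Hypothetical logic operations for the three stages.
stages = [
    ProcessingUnit(lambda x: x + 1),
    ProcessingUnit(lambda x: x * 2),
    ProcessingUnit(lambda x: x - 3),
]

# Cycle 1: the first stage stores its result. Cycle 2: the second stage
# stores its result. Cycle 3: a branch terminates the process, so the
# third stage is never clocked.
enables_per_cycle = [
    (True, False, False),
    (False, True, False),
    (False, False, False),
]

buffers = run_pipeline(stages, 5, enables_per_cycle)
print(buffers)
print([stage.clock_events for stage in stages])
```

In this sketch the buffer of the third stage is never clocked once the branch occurs, which is the situation in which gating the clock of an inactive buffer unit saves power.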
The clock-gating unit 36 is used to control the clock signals inputted into the buffer units 40a, 40b, 40c positioned in the corresponding processing units 32a, 32b, 32c to achieve the goal of saving power. Generally speaking, the clock-gating unit 36 generates the clock signals CLK_GA, CLK_GB, CLK_GC inputted to the buffer units 40a, 40b, 40c according to a system clock CLK_S and the control signals PA, PB, PC generated from the pipeline control unit 34. The control signals PA, PB, PC are determined according to predetermined conditions. For instance, data transmission statuses associated with a bus and operating statuses of the logic units function as the predetermined conditions used by the pipeline control unit 34 to output the control signal PB. Please note that the predetermined conditions for the processing units 32a, 32b, 32c may differ from each other. Each of the control signals PA, PB, PC comprises a piping enable signal for driving a corresponding processing unit to pipe its calculation result to the next processing unit, and a piping flush signal for nullifying the calculation result generated by the corresponding processing unit. Concerning the processing unit 32b, suppose that the control signal PB itself is a piping enable signal, and corresponds to three conditions A, B, C. That is, the three conditions A, B, C are used to determine whether the piping enable signal is outputted to make the processing unit 32b pipe its calculation result to the next processing unit 32c. The conditions A, B, C are related to the operating statuses of the processing units 32a, 32b, 32c, and the pipeline control unit 34 uses only the conditions A, B, C to set the control signal PB corresponding to the processing unit 32b. The pipeline control unit 34 can determine whether the piping enable signal is outputted to the processing unit 32b only after all of the conditions A, B, C have been determined.
Therefore, the pipeline control unit 34 needs to wait until all of the conditions A, B, C are determined. That is, the pipeline control unit 34 has to wait a longer period of time before generating the piping enable signal for the processing unit 32b. When the control signal PB is used to drive the prior art clock-gating mechanism, the above-mentioned delay time actually affects the operation of the clock-gating unit 36. The reason is described as follows.
Please refer to FIG. 4 and FIG. 5. FIG. 4 is an example schematic diagram of the clock-gating unit 36 shown in FIG. 3, and FIG. 5 is a timing diagram of signals running in the clock-gating unit 36 shown in FIG. 3. The clock-gating unit 36 has a logic gate 42 and an inverter 44. The logic gate 42 performs a NOR logic operation upon the clock control signal CLK_ENB and the clock signal CLK_S to generate the clock signals CLK_GA, CLK_GB, CLK_GC for the processing units 32a, 32b, 32c. The clock control signal CLK_ENB is generated from a predetermined logic operation upon the piping enable signal and the piping flush signal of each processing unit 32a, 32b, 32c. For example, suppose that the clock control signal CLK_ENB for the processing unit 32b is determined by the piping enable signal only. As mentioned above, when all of the conditions A, B, C correspond to the logic value “1”, the clock control signal CLK_ENB is set to the logic value “1” at t4. As shown in FIG. 5, the clock control signal CLK_ENB does not make a transition from the logic value “0” to the logic value “1” until t4. After the processing performed by the clock-gating unit 36, the clock signal CLK_GB has a falling edge at t3, and corresponds to the logic value “0” during a period t3˜t4. Then, the clock signal CLK_GB has a rising edge at t4, and holds the logic value “1” after t4. If the processing unit 32b connected to the clock-gating unit 36 is triggered by rising edges of the clock signal CLK_GB, the clock-gating unit 36 should make the clock signal CLK_GB hold the logic value “1” after t0 to gate the clock signal CLK_S. However, because the clock control signal CLK_ENB arrives late as shown in FIG. 5, the processing unit 32b is triggered twice, at t0 and at t4. In other words, the clock signal CLK_GB leaves the logic value “0” at t4.
Therefore, the clock signal CLK_GB having the rising edge at t4 is capable of triggering the processing unit 32b, and as such, the clock signal CLK_GB cannot achieve the goal of reducing power consumption.
Furthermore, a glitch is induced to affect the operation of the processing unit 32b. According to the prior art, the period t0˜t1 is defined to be a clock-gating hold time, and the period t2˜t3 is defined to be a clock-gating setup time. In other words, the clock control signal CLK_ENB needs to be inputted before a falling edge of the clock signal CLK_S. Otherwise, the clock signal CLK_GB generates the unwanted glitch during the period t3˜t4. The unwanted glitch induced for each of the clock signals CLK_GA, CLK_GB, CLK_GC likely results in the pipeline-based logic circuit 30 functioning incorrectly.
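The effect of a late clock control signal CLK_ENB can be illustrated with a small Python sketch. It assumes, as the description of the waveforms suggests but does not state explicitly, that the inverter 44 inverts the output of the NOR logic gate 42, so that the gated clock equals CLK_ENB OR CLK_S. It further assumes that the clock signal CLK_S rises at t0 and falls at t3, and it samples each signal once per interval. The function names and the "timely" waveform are hypothetical.

```python
def nor(a, b):
    """Model of the NOR logic gate 42."""
    return 1 - (a | b)


def inverter(x):
    """Model of the inverter 44."""
    return 1 - x


def gate_clock(clk_s, clk_enb):
    """Assumed structure: NOR followed by an inverter (an OR overall)."""
    return [inverter(nor(enb, s)) for s, enb in zip(clk_s, clk_enb)]


def rising_edge_slots(signal):
    """Return the indices at which the signal goes from 0 to 1."""
    return [i for i in range(1, len(signal))
            if signal[i - 1] == 0 and signal[i] == 1]


# One sample per interval:
# before t0, t0-t1, t1-t2, t2-t3, t3-t4, after t4
clk_s      = [0, 1, 1, 1, 0, 0]

# Late case from FIG. 5: CLK_ENB does not reach "1" until t4.
late_enb   = [0, 0, 0, 0, 0, 1]

# Hypothetical timely case: CLK_ENB reaches "1" after the hold time
# (t0-t1) and before the setup time (t2-t3).
timely_enb = [0, 0, 1, 1, 1, 1]

late_clk_gb = gate_clock(clk_s, late_enb)
timely_clk_gb = gate_clock(clk_s, timely_enb)

print("late CLK_GB:  ", late_clk_gb, rising_edge_slots(late_clk_gb))
print("timely CLK_GB:", timely_clk_gb, rising_edge_slots(timely_clk_gb))
```

With the late CLK_ENB, the modeled CLK_GB drops to “0” during t3˜t4 and rises again at t4, giving two rising edges (at t0 and at t4). With the timely CLK_ENB, CLK_GB stays at “1” after t0 and only the rising edge at t0 remains.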