This non-provisional application claims priority under 35 U.S.C. § 119(a) on Patent Application No. 94115696 filed in Taiwan on May 13, 2005, the entire contents of which are hereby incorporated by reference.
1. Field of Invention
The invention relates to a pipelined datapath and, in particular, to a pipelined datapath with dynamically reconfigurable pipeline stages.
2. Related Art
Portable electronic devices have become popular in recent years. Because such devices usually rely on batteries, lowering power consumption is an important issue in circuit design. Using the battery more efficiently also makes a portable device more competitive.
Traditionally, most low-power circuits are designed by optimizing for static conditions; that is, the design is based on the worst-case operating environment. Conventional low-power designs of this kind focus on reducing the operating voltage, minimizing logic switching, and simplifying the circuit. However, this approach cannot fully satisfy consumers' needs, as portable electronic devices are expected to deliver higher efficiency and lower power dissipation. Architecture designers have therefore proposed power-aware systems that dynamically reduce power consumption according to the operating environment. Even lower power consumption can be achieved with a power-aware datapath, e.g. by dynamically reconfiguring the datapath and the operating voltage applied to it according to the task currently being processed.
Taking the pipeline structure as an example, a conventional low-power design can increase the throughput of the datapath and thereby reduce the required operating voltage. The added pipeline registers also effectively suppress extra or unnecessary logic switching, i.e. short-duration glitches. The price, however, is that the registers themselves consume power. It is therefore necessary to provide a good method of reducing the power consumed by the registers.
Various methods have been proposed to reduce the power consumption of the registers. One method utilizes clock gating in the pipeline structure, as shown in U.S. Pat. No. 6,247,134 B1 and the article by Xanthopoulos et al. (see Thucydides Xanthopoulos and Anantha P. Chandrakasan, “A Low-Power IDCT Macrocell for MPEG-2 MP@ML Exploiting Data Distribution Properties for Minimal Activity,” IEEE Journal of Solid-State Circuits, Vol. 34, No. 5, pp. 693-703, May 1999). With reference to FIGS. 1 and 2, in a three-stage pipeline structure each stage is a combinational logic circuit 111, 112, 113 linked to a register 121, 122, 123. The input terminal of the pipeline circuit is provided with a register 120. Each of the registers 120 to 123 is controlled by a clock signal CK0, CK1, CK2, CK3 generated by a clock generator 130 according to the input data DATA. That is, when the input datum is valid (a non-zero datum), the registers 120 to 123 move the datum from left to right in a lock-step way. In each clock cycle, the registers 121 to 123 latch the outputs of the combinational logic circuits 111 to 113 according to the received clock signals CK1 to CK3 for processing the datum. When an invalid datum is received (e.g. a datum in a multiplier datapath with a vanishing coefficient), the registers hold the outputs of the combinational logic circuits to avoid power consumption. In other words, the pulse of each clock signal is transmitted stage by stage along with the valid data, and when invalid data enter, the registers are not activated, which achieves the power-saving effect. In this clock gating method, each valid datum has to pass through the register of every pipeline stage in order to travel from the input terminal to the output terminal. Consequently, in applications that do not require high throughput, power is still wasted on extra clock pulses, because the design is based on the worst-case operating conditions even though high throughput is not needed.
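The clock gating behavior described above can be illustrated with a minimal behavioral sketch. It is not taken from the cited references; the function name `count_clock_pulses`, its parameters, and the convention that a zero datum is invalid are assumptions made for illustration only. The sketch counts the register clock pulses issued for a data stream, with and without gating.

```python
def count_clock_pulses(data, num_registers=4, gated=True):
    """Count register clock pulses for a lock-step pipeline.

    Assumptions (illustrative only): a datum equal to 0 is invalid,
    and num_registers=4 stands for the input register plus one
    register per stage of a three-stage pipeline.
    """
    # valid[i] is True when register i receives a valid datum this cycle.
    valid = [False] * num_registers
    pulses = 0
    # Append invalid data so the last valid datum can drain out.
    stream = list(data) + [0] * (num_registers - 1)
    for datum in stream:
        # The valid flag travels one register to the right per cycle.
        valid = [datum != 0] + valid[:-1]
        if gated:
            # Only registers that receive a valid datum are clocked.
            pulses += sum(valid)
        else:
            # Without gating, every register is clocked every cycle.
            pulses += num_registers
    return pulses


print(count_clock_pulses([5, 0, 0, 3], gated=True))   # 8
print(count_clock_pulses([5, 0, 0, 3], gated=False))  # 28
```

In this model the gated pulse count equals the number of valid data multiplied by the number of registers. Gating removes the pulses for invalid data, but each valid datum still costs one pulse per register regardless of how much throughput the application actually needs.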
Another method utilizes a reconfigurable pipeline structure, in which the number of pipeline stages is reconfigured according to the throughput requirement. That is, when the dynamic throughput requirement is low, the number of pipeline stages can be reduced; otherwise, the number of pipeline stages is increased in advance. Such an example is given in Kim et al. (Suhwan Kim and Marios C. Papaefthymiou, “Reconfigurable Low Energy Multiplier for Multimedia System Design,” Proceedings of the IEEE Computer Society Annual Workshop on VLSI (WVLSI'00), p. 129, Apr. 27-28, 2000). With reference to FIG. 3, when the throughput requirement is low, the clock gates 1300, 1301 to 130N−1, 130N turn off some of the registers, which are then skipped using multiplexers 141 to 14N−1. The number of active registers is thereby reduced, saving the power needed for data access. The registers 120 to 12N are still controlled synchronously by the system clock CK and move data from left to right in the lock-step way. In each clock cycle, the registers 121 to 12N latch the outputs of the combinational logic circuits according to the valid data signals DATA_VALID for performing data processing. When a datum reaches a register that has been turned off by its clock gate, a select signal SEL controls the multiplexer so that the datum bypasses the register. Here the clock gates 1300 to 130N control whether the system clock CK reaches the registers 120 to 12N, and thus turn the registers on and off. Although this structure can be adjusted for different applications, the adjustment requires knowledge of the valid data ratio at the input terminal in order to determine the number of pipeline stages. The design is therefore very complicated. Moreover, to ensure data accuracy, all data already in the pipeline registers must finish processing before the pipeline is switched to another stage configuration.
Since each of the conventional methods of reducing the power consumption of the registers has its problems, how to reduce the power consumption of the registers remains an important subject of study.
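The register bypass described above can be illustrated with a minimal behavioral sketch. It is not the circuit of the cited reference; the function `run_pipeline`, the list `reg_enabled`, and the use of `None` for an invalid datum are hypothetical names and conventions chosen for illustration, and the input register is omitted for brevity. Each stage is a combinational function followed by a register that is either clocked or bypassed by a multiplexer.

```python
def run_pipeline(inputs, stages, reg_enabled):
    """Simulate a pipeline whose stage registers can be bypassed.

    stages[i] is the combinational logic of stage i (a function).
    reg_enabled[i] is True if the register after stage i is clocked,
    False if a multiplexer bypasses it. None marks an invalid datum.
    Returns (outputs, number of valid register writes).
    """
    n = len(stages)
    regs = [None] * n          # register contents from the previous cycle
    outputs, writes = [], 0
    # Append invalid data so the data left in the registers drain out.
    for item in list(inputs) + [None] * n:
        new_regs = list(regs)
        value = item
        for i in range(n):
            if value is not None:
                value = stages[i](value)   # combinational logic
            if reg_enabled[i]:
                new_regs[i] = value        # latch at the clock edge
                if value is not None:
                    writes += 1            # only valid data cost a write
                value = regs[i]            # downstream sees the old content
            # Otherwise the multiplexer passes the value straight through.
        if value is not None:
            outputs.append(value)
        regs = new_regs
    return outputs, writes


stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
print(run_pipeline([1, 2], stages, [True, True, True]))    # ([1, 3], 6)
print(run_pipeline([1, 2], stages, [False, False, True]))  # ([1, 3], 2)
```

Both configurations produce the same results, but the configuration with two bypassed registers performs fewer register writes. The sketch keeps `reg_enabled` fixed for the whole run, which sidesteps the draining requirement noted above; changing the configuration while data remain in the registers is the case that complicates the design.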