1. Field of the Invention
The present invention generally relates to a dynamic clock control apparatus and method in a pipeline system. More particularly, the present invention relates to an apparatus and method for increasing system performance by controlling a clock signal of a pipeline structure.
2. Description of the Related Art
With reference to FIG. 1, a typical synchronous digital system will be described below. Referring to FIG. 1, a conventional synchronous digital system has an input port, an output port, a combinational logic circuit 110 for processing data for a predetermined purpose, an input register 102, and an output register 104. The combinational logic circuit 110, the input register 102, and the output register 104 are located between the input port and the output port. Furthermore, the input register 102 and the output register 104 are configured with flip-flops or latches that are synchronized by a clock signal. The conventional synchronous digital system also includes a clock generator 100 for generation of the clock signal that is provided to the input register 102 and the output register 104.
To improve performance, most processors or digital blocks use a ‘pipeline structure’ as illustrated in FIG. 2. The pipeline structure divides the combinational logic circuit 110 of the digital system illustrated in FIG. 1. FIG. 2 illustrates the configuration of a conventional synchronous digital system with a pipeline structure, i.e. a conventional synchronous pipeline system. Referring to FIG. 2, the combinational logic circuit of the synchronous digital system is divided into four combinational logic circuits 222, 224, 226 and 228, each having input and output registers that are formed with flip-flops. More specifically, combinational logic circuit 222 has register 202 as an input register and register 204 as an output register, combinational logic circuit 224 has register 204 as an input register and register 206 as an output register, combinational logic circuit 226 has register 206 as an input register and register 208 as an output register, and combinational logic circuit 228 has register 208 as an input register and register 210 as an output register. As illustrated, each of registers 204, 206 and 208, which are located between combinational logic circuits, functions as an output register for the preceding combinational logic circuit and as an input register for the following combinational logic circuit. As further illustrated, the pipeline system of FIG. 2 also includes a clock generator 200 for generation of a clock signal to be provided to the registers.
The synchronous digital system with the pipeline structure illustrated in FIG. 2 allows for an increase in operation speed (clock speed) as well as the simultaneous processing of a plurality of data, because the four-stage pipeline structure of FIG. 2 can perform up to four successive processes at the same time.
As illustrated in FIG. 2, the conventional synchronous pipeline structure is designed such that the clock signal input to each pipeline register is identical in frequency and phase. Therefore, the frequency of the clock signal is determined by the stage with the longest processing time among the pipeline stages, each stage being defined by a combinational logic circuit and its input and output registers. For example, if the combinational logic circuits 222, 224, 226 and 228 forming stage 1 to stage 4 take processing times of 11, 13, 16, and 9 nsec, respectively, stage 3 with the longest processing time determines the clock speed of the system (1/(16 nsec) = 62.5 MHz).
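The arithmetic in the example above can be sketched as follows. This is only an illustrative calculation, not part of the described apparatus; the function names `critical_stage` and `max_clock_mhz` are hypothetical, and register setup time and clock skew are assumed to be negligible.

```python
def critical_stage(stage_delays_ns):
    """Return the 1-based number of the stage with the longest processing time."""
    return stage_delays_ns.index(max(stage_delays_ns)) + 1


def max_clock_mhz(stage_delays_ns):
    """Return the highest common clock frequency in MHz.

    Assumes every register receives the same clock (identical frequency
    and phase), so the slowest stage sets the clock period.
    """
    period_ns = max(stage_delays_ns)
    return 1000.0 / period_ns  # 1 / (period in nsec) expressed in MHz


# Processing times of stage 1 to stage 4 from the example above.
delays_ns = [11, 13, 16, 9]
print(critical_stage(delays_ns))  # 3
print(max_clock_mhz(delays_ns))   # 62.5
```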
One reason for the different processing times in the different stages is that a different function is performed in each stage. One of many techniques for minimizing the difference between processing times will be described below with reference to FIG. 3.
FIG. 3 illustrates the configuration of a conventional synchronous pipeline system similar to that of FIG. 2. However, the pipeline system of FIG. 3 further provides for reducing the processing time difference between stages by delaying a clock signal. Referring to FIG. 3, a time delay (Td) 302 delays a clock signal generated from a clock generator 200 by a time period and provides the delayed clock signal to one or more registers, thereby narrowing the difference in processing time between stages. For example, if the synchronous pipeline system illustrated in FIG. 3 has a target operation frequency of 100 MHz (a period Tclk of 10 nsec) and the processing time estimates of the first to fourth stages are 8, 9, 11, and 7 nsec, respectively, stage 3 exceeds the period by 1 nsec while stage 4 has 3 nsec to spare. The clock signal input to the register 208 between stage 3 and stage 4 is therefore artificially delayed by 1 to 3 nsec (herein, the middle value of 2 nsec, by way of example). The artificial delaying of the clock signal input to the output register 208 of stage 3 by Td (2 nsec) prolongs the time available to stage 3 to Tclk+Td and shortens the time available to stage 4 to Tclk-Td. This method of borrowing time from stage 4 for use in stage 3 is called a time borrowing technique. The time borrowing technique is widely adopted for high-performance digital microprocessors and digital system blocks.
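The range of usable delays in the time borrowing example can be sketched as a short calculation. This is a simplified model under stated assumptions, not the method of any particular design tool: the function name `borrow_window_ns` is hypothetical, only the single register between the borrowing stage and the following stage is delayed, and setup/hold times and clock skew are ignored.

```python
def borrow_window_ns(period_ns, stage_delays_ns, borrower):
    """Return (min_td, max_td) for delaying the output register of a stage.

    `borrower` is the 1-based number of the stage that needs extra time.
    Delaying its output register by Td gives that stage Tclk + Td and
    leaves the following stage Tclk - Td. Returns None when no delay
    satisfies both stages.
    """
    if not 1 <= borrower < len(stage_delays_ns):
        raise ValueError("borrower must have a following stage to borrow from")
    need = stage_delays_ns[borrower - 1]   # delay of the borrowing stage
    lender = stage_delays_ns[borrower]     # delay of the following stage
    min_td = max(0, need - period_ns)      # borrower must fit in Tclk + Td
    max_td = period_ns - lender            # lender must fit in Tclk - Td
    if min_td > max_td:
        return None
    return (min_td, max_td)


# 100 MHz target (Tclk = 10 nsec), stage estimates of 8, 9, 11 and 7 nsec.
window = borrow_window_ns(10, [8, 9, 11, 7], borrower=3)
print(window)                       # (1, 3)
print((window[0] + window[1]) / 2)  # 2.0, the middle value used in the text
```

The window collapses to None when the two adjacent stages together need more than two clock periods, which is one way to see why the technique depends on accurate processing time estimates.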
In order to apply the time borrowing technique effectively, the estimate of the processing time of each stage must be accurate. This is also true for other techniques used to narrow a time difference between stages. However, since the techniques for reducing a time difference between stages are applied during chip design, that is, before actual time differences can be measured on the fabricated chip, it is difficult to estimate the processing times accurately. Moreover, as chip fabrication processes become finer, the impact of process tolerances increases rapidly. Also, increased integration leads to more severe electrical coupling from neighboring circuits. As a consequence, it is becoming more difficult to determine the physical characteristics of a chip accurately during chip design. Furthermore, post-fabrication revision regarding performance or power is not supported.
Accordingly, there exists a need for using the processing times of stages based on post-chip fabrication measurements rather than the less accurate estimates calculated during a chip design.