Throughput and latency are the typical indices of the processing performance of digital circuits. Throughput is the amount of data that can be processed per unit time, whereas latency is the time required to complete a predetermined processing. As a related art, a circuit having a pipeline configuration is known as a circuit configuration capable of achieving a high operating frequency and high processing throughput, as disclosed in Japanese Unexamined Patent Application Publication No. 63-201725, titled “signal processing circuit” (Patent literature 1).
FIG. 13 is a block diagram showing the configuration of a pipeline circuit according to a related art, which includes five-stage pipeline registers 110a, 110b, 110c, 110d, and 110e. In FIG. 13, a pipeline circuit 100 processes data input through a signal 113 in a pipelined manner by partial circuits 111a, 111b, 111c, and 111d, and then outputs the data to a signal 114. The signals 113 and 114 are each multi-bit signals.
More specifically, the pipeline circuit 100 includes four-stage pipeline circuits in order to achieve a high operating frequency and high throughput. That is, in the pipeline circuit 100, a circuit that performs data processing is divided into the four partial circuits 111a, 111b, 111c, and 111d by the five-stage pipeline registers 110a, 110b, 110c, 110d, and 110e. The five-stage pipeline registers 110a, 110b, 110c, 110d, and 110e all operate on clock F, which is a high-speed clock signal.
Referring next to a time chart shown in FIG. 14, an operational example of the pipeline circuit 100 according to the related art shown in FIG. 13 will be described. FIG. 14 is a time chart for explaining a timing relation of data processing by the pipeline circuit 100 shown in FIG. 13.
In the time chart shown in FIG. 14, the pipeline circuit 100 receives, at timing T1, data D0 output from a previous circuit (not shown) to the signal 113 at timing T0. Specifically, at timing T1, the pipeline register 110a latches the data D0, which is then output to the partial circuit 111a. Then, the partial circuit 111a performs processing of the data D0.
Next, at timing T2, the pipeline register 110b latches the data D0 processed by the partial circuit 111a, and outputs the data D0 to the partial circuit 111b. Then, the partial circuit 111b performs the processing of the data D0.
Similarly, at timings T3 and T4, the data D0 processed by the partial circuit 111b is processed by the partial circuits 111c and 111d through the pipeline registers 110c and 110d, respectively.
Finally, at timing T5, the data D0 processed by the partial circuit 111d is output to the signal 114 through the pipeline register 110e.
In a similar way, the pipeline circuit 100 receives, at timing T2, data D1 output from the previous circuit (not shown) to the signal 113 at timing T1. Specifically, at timing T2, the pipeline register 110a latches the data D1, which is then output to the partial circuit 111a. Then, the partial circuit 111a performs the processing of the data D1.
Next, at timing T3, the pipeline register 110b latches the data D1 processed by the partial circuit 111a, and outputs the data D1 to the partial circuit 111b. Then, the partial circuit 111b performs the processing of the data D1.
Similarly, at timings T4 and T5, the data D1 processed by the partial circuit 111b is processed by the partial circuits 111c and 111d through the pipeline registers 110c and 110d, respectively.
Finally, at timing T6, the data D1 processed by the partial circuit 111d is output to the signal 114 through the pipeline register 110e.
Likewise, data D2 to D7, output from the previous circuit (not shown) to the signal 113 at timings T2 to T7, are processed by the pipeline circuit 100 and then output to the signal 114 at timings T7 to T12, respectively.
In the example of the time chart shown in FIG. 14, five cycles of clock F elapse from when data is output from the previous circuit to the signal 113 to when the pipeline circuit 100 outputs the processed data to the signal 114 (e.g., the data D0 is input to the pipeline circuit 100 through the signal 113 at timing T0 and is output from the pipeline circuit 100 to the signal 114 at timing T5). In summary, the latency of the data processing of the pipeline circuit 100 is five cycles of clock F.
On the other hand, the pipeline circuit 100 includes the four-stage pipeline circuits of the partial circuits 111a, 111b, 111c, and 111d, and the data processing is achieved by a pipeline operation. Accordingly, even though the latency is five cycles, a new piece of data can be processed in every cycle of clock F. In summary, the throughput of the data processing of the pipeline circuit 100 is 1.0 data/cycle (indicating that one piece of data is processed for every cycle of clock F).
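The latency and throughput figures above can be checked with a small behavioral model. The following Python sketch is an illustration only and is not taken from Patent literature 1: the function name `simulate_pipeline` and its variables are hypothetical, and the partial circuits 111a to 111d are assumed to pass the data label through unchanged so that only the timing is modeled.

```python
NUM_REGISTERS = 5  # pipeline registers 110a to 110e (hypothetical model)

def simulate_pipeline(inputs, num_registers=NUM_REGISTERS):
    """Return {data: (timing on signal 113, timing on signal 114)}.

    inputs[k] is the data placed on signal 113 at timing Tk. At each rising
    edge of clock F (timing T1, T2, ...) every register latches the value
    held by the stage in front of it.
    """
    regs = [None] * num_registers  # regs[0] models 110a, regs[-1] models 110e
    presented = {}                 # data -> timing it appeared on signal 113
    emitted = {}                   # data -> timing it appeared on signal 114
    for edge in range(1, len(inputs) + num_registers + 1):
        on_signal_113 = inputs[edge - 1] if edge - 1 < len(inputs) else None
        if on_signal_113 is not None:
            presented[on_signal_113] = edge - 1
        regs = [on_signal_113] + regs[:-1]  # all registers latch simultaneously
        if regs[-1] is not None:
            emitted[regs[-1]] = edge        # 110e drives signal 114
    return {d: (presented[d], emitted[d]) for d in presented}

timings = simulate_pipeline([f"D{i}" for i in range(8)])
latency = timings["D0"][1] - timings["D0"][0]
out_times = sorted(t_out for _, t_out in timings.values())
throughput = (len(out_times) - 1) / (out_times[-1] - out_times[0])
print(latency, throughput)  # prints: 5 1.0
```

Under these assumptions the model reproduces the timings of FIG. 14: the data D0 appears on the signal 114 at timing T5 and the data D7 at timing T12, with one output in every cycle of clock F.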
Meanwhile, also in a circuit having a pipeline configuration, dynamic frequency scaling (DFS), which lowers the clock frequency to a value just sufficient for the required throughput, is effective in reducing power consumption. However, in the related pipeline circuit, a decrease in the clock frequency reduces the throughput in proportion to the decrease and also increases the latency.
With reference to a time chart shown in FIG. 15, problems of the pipeline circuit according to the related art will be described in detail. FIG. 15 is a time chart for describing a timing relation when the pipeline circuit 100 shown in FIG. 13 is operated with a clock whose frequency is reduced to ¼ of that of clock F. For the sake of clarity, FIG. 15 shows the clock whose frequency is ¼ of that of clock F as clock S. For the sake of comparison, clock F is also shown in addition to clock S.
Even when the clock frequency is reduced from clock F to clock S, the logical operation of the pipeline circuit 100 does not change, and only the timing of the operation is different from a case in which the circuit is operated with clock F.
Specifically, in FIG. 15, the pipeline circuit 100 receives, at timing T4, the data D0 output from the previous circuit (not shown) to the signal 113 at timing T0. More specifically, at timing T4 which is the next rising timing of clock S, the pipeline register 110a latches the data D0, which is then output to the partial circuit 111a. Then, the partial circuit 111a performs the processing of the data D0.
Next, at timing T8 which is the next rising timing of clock S, the pipeline register 110b latches the data D0 processed by the partial circuit 111a, and outputs the data D0 to the partial circuit 111b. Then, the partial circuit 111b performs the processing of the data D0.
Similarly, at timings T12 and T16, the data D0 processed by the partial circuit 111b is processed by the partial circuits 111c and 111d through the pipeline registers 110c and 110d, respectively.
Finally, at timing T20, the data D0 processed by the partial circuit 111d is output to the signal 114 through the pipeline register 110e.
In a similar way, the pipeline circuit 100 receives, at timing T8, the data D1 output from the previous circuit (not shown) to the signal 113 at timing T4. Specifically, at timing T8, which is the next rising timing of clock S, the pipeline register 110a latches the data D1, which is then output to the partial circuit 111a. Then, the partial circuit 111a performs the processing of the data D1.
Next, at timing T12, the pipeline register 110b latches the data D1 processed by the partial circuit 111a, and then outputs the data D1 to the partial circuit 111b. Then, the partial circuit 111b performs the processing of the data D1.
Similarly, at timings T16 and T20, the data D1 processed by the partial circuit 111b is processed by the partial circuits 111c and 111d through the pipeline registers 110c and 110d, respectively.
Finally, at timing T24, the data D1 processed by the partial circuit 111d is output to the signal 114 through the pipeline register 110e.
Likewise, data D2 to D7, output from the previous circuit (not shown) to the signal 113 at timings T8, T12, T16, T20, T24, and T28, are processed by the pipeline circuit 100 and then output to the signal 114 at timings T28, T32, T36, T40, T44, and T48, respectively (not all of them are shown).
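The timing relation described for FIG. 15 can be tabulated with a short Python sketch. It is a hedged illustration rather than part of the related art: the names `DIVIDER`, `REGISTERS`, and `latch_schedule` are hypothetical, and the sketch assumes, as in the description above, that data Dk is placed on the signal 113 at timing T(4k) and that each register latches on successive rising edges of clock S.

```python
DIVIDER = 4  # one cycle of clock S spans four cycles of clock F (assumed ratio)
REGISTERS = ("110a", "110b", "110c", "110d", "110e")

def latch_schedule(k, divider=DIVIDER):
    """Return (timing Dk is placed on signal 113, {register: latch timing}).

    All timings are counted in cycles of clock F. Register 110a latches Dk on
    the first rising edge of clock S after Dk is presented, and each later
    register latches it one clock S cycle after the register before it.
    """
    presented = k * divider
    latches = {name: presented + (stage + 1) * divider
               for stage, name in enumerate(REGISTERS)}
    return presented, latches

for k in range(8):
    presented, latches = latch_schedule(k)
    # The latch timing of 110e is the timing at which Dk reaches signal 114.
    print(f"D{k}: signal 113 at T{presented}, signal 114 at T{latches['110e']}")
```

For the data D0 the sketch gives latch timings T4, T8, T12, T16, and T20, and for the data D7 it gives an output timing of T48, in agreement with the timings listed above.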
In the example shown in FIG. 15, 20 cycles of clock F elapse from when data is output from the previous circuit to the signal 113 to when the pipeline circuit 100 outputs the processed data to the signal 114 (e.g., the data D0 is input to the pipeline circuit 100 through the signal 113 at timing T0, and is output from the pipeline circuit 100 to the signal 114 at timing T20). In short, the latency of the data processing in the pipeline circuit 100 is 20 cycles of clock F.
On the other hand, the pipeline circuit 100 includes the four-stage pipeline circuits of the partial circuits 111a, 111b, 111c, and 111d, and the data processing is achieved by a pipeline operation. Accordingly, even though the latency is five cycles of clock S (20 cycles of clock F), a new piece of data can be processed in every cycle of clock S. In summary, since one piece of data is processed for every four cycles of clock F, the throughput of the data processing of the pipeline circuit 100 is 0.25 data/cycle (indicating that 0.25 pieces of data are processed for each cycle of clock F).
This is because the frequency of clock S is ¼ of that of clock F, and thus the cycle time of clock S is four times that of clock F. Consequently, the throughput of the pipeline circuit 100 operated by clock S is ¼ of that of the pipeline circuit 100 operated by clock F, and the latency of the pipeline circuit 100 operated by clock S is four times that of the pipeline circuit 100 operated by clock F.
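The scaling relation can be summarized in a few lines of Python. This is a minimal sketch under stated assumptions: the function `pipeline_metrics` and its parameters are hypothetical, the clock is assumed to be divided by an integer ratio, and the generalization beyond the ratios 1 and 4 discussed above is an extrapolation of the same reasoning rather than something stated for the related art.

```python
from fractions import Fraction

def pipeline_metrics(num_registers, divider):
    """Latency and throughput of the related pipeline, in units of clock F.

    num_registers: number of pipeline registers data passes through (5 here).
    divider: ratio of the clock F frequency to the operating clock frequency.
    """
    latency_f_cycles = num_registers * divider  # one operating cycle per register
    throughput = Fraction(1, divider)           # one data item per operating cycle
    return latency_f_cycles, throughput

fast = pipeline_metrics(5, 1)  # operated by clock F
slow = pipeline_metrics(5, 4)  # operated by clock S
print(f"clock F: {fast[0]} cycles, {fast[1]} data/cycle")  # clock F: 5 cycles, 1 data/cycle
print(f"clock S: {slow[0]} cycles, {slow[1]} data/cycle")  # clock S: 20 cycles, 1/4 data/cycle
```

In this model the product of latency and throughput stays at five, the number of pipeline registers, so lowering the clock frequency trades both quantities by the same factor.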