As design rules in semiconductor processes become finer, the number of function blocks (also called cores) integrated into a semiconductor integrated circuit keeps increasing, and the operating frequency of each function block keeps rising. However, in such a semiconductor integrated circuit, it has become difficult to perform communication between function blocks at a frequency equal to the operating frequency of each function block, because of the higher-speed clock signal, greater fluctuations and the like. As a result, the communication capability between function blocks degrades relative to the data processing capability of each function block.
One approach to address the above issue is a pipeline technique that raises the operating frequency of a communication circuit (e.g. on-chip bus, on-chip interconnect, on-chip network etc.) between function blocks (see Patent Literature 1).
FIG. 12 is a view showing a configuration of a communication circuit between function blocks by the pipeline technique. In FIG. 12, a communication circuit 100 is a communication circuit that transfers data that is output from a function block A to a function block B. To be more specific, data that is output from the function block A to a signal 113 is input to the communication circuit 100 near the function block A, and the communication circuit 100 transfers the data to near the function block B, and outputs the data to a signal 114. The data that is output to the signal 114 is then input to the function block B. The signal 113 and the signal 114 are signals including a plurality of bits and having the same data width.
All of the function block A, the function block B and the communication circuit 100 operate with a clock F, which is the same high-speed clock. The communication circuit 100 is composed of four stages of pipeline circuits in order to achieve the same operating frequency as the function block A and the function block B. Specifically, in the communication circuit 100, a signal to transfer data is divided into four sub-signals 112a, 112b, 112c and 112d by pipeline registers 110a, 110b, 110c, 110d, 110e. Each of the sub-signals 112a, 112b, 112c and 112d is a signal including a plurality of bits and having the same data width as the signal 113 and the signal 114.
Further, a plurality of buffer circuits 111a, 111b, 111c and 111d for driving each sub-signal are respectively inserted into the sub-signals 112a, 112b, 112c and 112d. Although not shown in FIG. 12, in addition to the buffer circuits, selector circuits, switch circuits and the like for switching a communication path may be included as appropriate.
Next, an example of the operation of the communication circuit 100 is described with reference to FIG. 13. FIG. 13 is an explanatory view showing the timing of data transfer from the function block A to the function block B.
In FIG. 13, data D0 that is output from the function block A to the signal 113 at the timing T0 is input to the communication circuit 100 at the timing T1. Specifically, at the timing T1, the pipeline register 110a latches the data D0 and outputs it to the sub-signal 112a. Likewise, at the timing T2, the pipeline register 110b latches the data D0 and outputs it to the sub-signal 112b. Subsequently, at the timing T3 to T5, the data D0 is sequentially output to the signals 112c, 112d and 114 via the pipeline registers 110c, 110d and 110e in the same manner. Consequently, the data D0 arrives at the function block B at the timing T5.
Likewise, data D1 that is output from the function block A to the signal 113 at the timing T1 is input to the communication circuit 100 at the timing T2. Specifically, at the timing T2, the pipeline register 110a latches the data D1 and outputs it to the sub-signal 112a. Likewise, at the timing T3, the pipeline register 110b latches the data D1 and outputs it to the sub-signal 112b. Subsequently, at the timing T4 to T6, the data D1 is sequentially output to the signals 112c, 112d and 114 via the pipeline registers 110c, 110d and 110e in the same manner. Consequently, the data D1 arrives at the function block B at the timing T6.
After that, in the same manner, data D2 to D7 that are output from the function block A at the timing T2 to T7 arrive at the function block B at the timing T7 to T12. In the example of FIG. 13, it takes five cycles of the clock F until the data output from the function block A arrives at the function block B (for example, the data D0 is output from the function block A at the timing T0 and arrives at the function block B at the timing T5). Thus, the latency of data transfer from the function block A to the function block B is five cycles of the clock F.
On the other hand, the communication circuit 100 is composed of four stages of pipeline circuits, and data transfer is pipelined. Therefore, data can be transferred in every cycle of the clock F even though the latency is five cycles. In other words, the throughput of data transfer from the function block A to the function block B is 1 (which indicates transferring one data item per cycle of the clock F).
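The timing of FIG. 13 can be reproduced with a minimal cycle-level sketch. The function name `simulate_pipeline` and its parameters are hypothetical and introduced only for illustration. The model assumes that every pipeline register latches its input once per cycle of the clock F and that the function block A outputs data Dk at the timing Tk.

```python
def simulate_pipeline(num_data, num_registers=5):
    """Return {k: arrival cycle} for data D0..D(num_data-1).

    Models the pipeline registers 110a..110e as a shift register clocked
    by the clock F. Data Dk is assumed to appear on the signal 113 at Tk.
    """
    regs = [None] * num_registers   # contents of 110a..110e
    on_signal_113 = None            # value driven by the function block A
    arrivals = {}
    cycle = 0
    while len(arrivals) < num_data:
        # Rising edge: each register latches what its predecessor held.
        for i in range(num_registers - 1, 0, -1):
            regs[i] = regs[i - 1]
        regs[0] = on_signal_113
        # The function block A drives the next data item onto the signal 113.
        on_signal_113 = cycle if cycle < num_data else None
        # The last register (110e) drives the signal 114 to the function block B.
        if regs[-1] is not None:
            arrivals[regs[-1]] = cycle
        cycle += 1
    return arrivals


arrivals = simulate_pipeline(8)
latency = arrivals[0] - 0                      # cycles from output to arrival
spacing = {arrivals[k + 1] - arrivals[k] for k in range(7)}
print(arrivals, latency, spacing)
```

Under these assumptions the sketch gives a latency of five cycles and one arrival per cycle, matching the description of D0 arriving at T5 and D7 arriving at T12.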
Another means of improving the communication performance of a communication circuit is a method of expanding the data width of the communication circuit (see Patent Literature 2). An example is a method that expands the data width of the communication circuit to N times (N is a positive integer) the data width of the function block, and makes the communication circuit operate with a clock whose frequency is 1/N of the operating frequency of the function block. Because N times as much data can thereby be transferred per cycle, communication can be done without a decrease in throughput even if the operating frequency is 1/N.
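The throughput argument can be written out as a short worked equation. The symbols are assumptions introduced here for illustration only: W denotes the data width of the function block in bits and f_F denotes the frequency of the function block's clock.

```latex
\underbrace{(N \cdot W)}_{\text{bits per slow cycle}} \times \underbrace{\frac{f_F}{N}}_{\text{slow clock frequency}}
  \;=\; W \cdot f_F
```

The factor N cancels, so the widened circuit carries the same number of bits per unit time as a circuit of width W running at f_F.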
FIG. 14 is a view showing a configuration of a communication circuit between function blocks by the technique of expanding the data width. In FIG. 14, data that is output from a function block A to a signal 163 is input to a communication circuit 150 near the function block A, and the communication circuit 150 transfers the data to near a function block B, and outputs the data to a signal 164. The data that is output to the signal 164 is then input to the function block B.
The communication circuit 150 includes four signals 162a, 162b, 162c and 162d. Each of the signals 162a, 162b, 162c and 162d is a signal including a plurality of bits and having the same data width as the signal 163 and the signal 164. Thus, the communication circuit 150 as a whole is provided with signals with the data width of four times that of the signal 163 and the signal 164. Further, a plurality of buffer circuits 161a, 161b, 161c and 161d for driving each signal are respectively inserted into the signals 162a, 162b, 162c and 162d. 
The communication circuit 150 further includes input data storage circuits 160a and 160b, an input control circuit 166, an output data storage circuit 165, and an output control circuit 167. The function block A and the function block B operate with a clock F, which is a high-speed clock signal. The input data storage circuits 160a and 160b and the output control circuit 167 of the communication circuit 150 also operate with the clock F.
On the other hand, the input control circuit 166 and the output data storage circuit 165 operate with a clock S, which is a low-speed clock signal. The frequency of the clock S is ¼ the frequency of the clock F.
Next, an example of the operation of the communication circuit 150 is described with reference to FIG. 15. FIG. 15 is an explanatory view showing the timing of data transfer from the function block A to the function block B. In FIG. 15, data D0 that is output from the function block A to the signal 163 at the timing T0 is input to the communication circuit 150 at the timing T1. Specifically, the data D0 is stored into the input data storage circuit 160a at the timing T1. Likewise, in the communication circuit 150, data D1 that is output from the function block A to the signal 163 at the timing T1 is additionally stored into the input data storage circuit 160a at the timing T2. Likewise, in the communication circuit 150, data D2 that is output from the function block A to the signal 163 at the timing T2 is additionally stored into the input data storage circuit 160a at the timing T3. Likewise, in the communication circuit 150, data D3 that is output from the function block A to the signal 163 at the timing T3 is additionally stored into the input data storage circuit 160a at the timing T4. As a result, the data D0, D1, D2 and D3 are stored in the input data storage circuit 160a at the timing T4.
After the four data D0, D1, D2 and D3 are all stored in the input data storage circuit 160a, at the timing T6, which is the timing at the next rising edge of the clock S, the input control circuit 166 outputs the stored data D0, D1, D2 and D3 to the signals 162a, 162b, 162c and 162d, respectively.
Then, at the timing T10, which is the timing at the subsequent rising edge of the clock S, the output data storage circuit 165 stores the data D0, D1, D2 and D3 that are output from the input control circuit 166 at the timing T6.
The output control circuit 167 outputs the data D0, out of the data D0, D1, D2 and D3 stored in the output data storage circuit 165, to the signal 164 at the timing T10. Consequently, the data D0 arrives at the function block B at the timing T10. Likewise, the output control circuit 167 outputs the data D1 at the timing T11, outputs the data D2 at the timing T12, and outputs the data D3 at the timing T13, to the signal 164. Consequently, the data D1, D2 and D3 arrive at the function block B at the timing T11, T12 and T13, respectively.
Although the operation for communicating the data D0, D1, D2 and D3 is described above, the same applies to data D4, D5, D6 and D7.
In FIG. 15, data D4 that is output from the function block A to the signal 163 at the timing T4 is input to the communication circuit 150 at the timing T5. Specifically, the data D4 is stored into the input data storage circuit 160b at the timing T5. Likewise, in the communication circuit 150, data D5 that is output from the function block A to the signal 163 at the timing T5 is additionally stored into the input data storage circuit 160b at the timing T6. Likewise, in the communication circuit 150, data D6 that is output from the function block A to the signal 163 at the timing T6 is additionally stored into the input data storage circuit 160b at the timing T7. Likewise, in the communication circuit 150, data D7 that is output from the function block A to the signal 163 at the timing T7 is additionally stored into the input data storage circuit 160b at the timing T8.
As a result, the data D4, D5, D6 and D7 are stored in the input data storage circuit 160b at the timing T8.
After the four data D4, D5, D6 and D7 are all stored in the input data storage circuit 160b, at the timing T10, which is the timing at the next rising edge of the clock S, the input control circuit 166 outputs the stored data D4, D5, D6 and D7 to the signals 162a, 162b, 162c and 162d, respectively.
Then, at the timing T14, which is the timing at the subsequent rising edge of the clock S, the output data storage circuit 165 stores the data D4, D5, D6 and D7 that are output from the input control circuit 166 at the timing T10.
The output control circuit 167 outputs the data D4, out of the data D4, D5, D6 and D7 stored in the output data storage circuit 165, to the signal 164 at the timing T14. Consequently, the data D4 arrives at the function block B at the timing T14.
Likewise, the output control circuit 167 outputs the data D5 at the timing T15, outputs the data D6 at the timing T16, and outputs the data D7 at the timing T17, to the signal 164. Consequently, the data D5, D6 and D7 arrive at the function block B at the timing T15, T16 and T17, respectively.
As described above, because the data width of the communication circuit 150 is four times the data width of the signal 163 that is output from the function block A, four times as much data can be transferred per cycle. Stated differently, provided that data can be transferred in each cycle of the clock S, even when the frequency of the clock S is ¼ the frequency of the clock F, the throughput of data transfer from the function block A to the function block B can be 1 (which indicates transferring one data item per cycle of the clock F).
More generally, the communication circuit shown in FIG. 14 requires N times more signal lines than the communication circuit shown in FIG. 12 because the data width of the communication circuit is set to N times (N is a positive integer) the data width of the function block. On the other hand, the communication circuit shown in FIG. 14 has the advantages that pipeline registers are not necessary and that, because the frequency is low, power consumption is small.
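The arrival times described for FIG. 15 can be checked with a small timing sketch. The function name `wide_bus_arrivals` and the parameters `n` and `s_phase` are hypothetical. The model assumes that the rising edges of the clock S fall at T2, T6, T10 and so on (inferred from the edges at T6, T10 and T14 mentioned above), that data Dk is stored one cycle after it is output, and that the input control circuit launches a group on the first rising edge of the clock S that is not earlier than the cycle in which the group is complete.

```python
def wide_bus_arrivals(num_data, n=4, s_phase=2):
    """Return {k: arrival cycle at the function block B} for data D0..D(num_data-1).

    n       : width multiplier (clock S runs at 1/n of clock F)
    s_phase : cycle of the first rising edge of clock S (assumed T2)
    Assumes num_data is a multiple of n, i.e. only full groups are sent.
    """
    arrivals = {}
    for k in range(num_data):
        group, lane = divmod(k, n)
        # Last item of the group is output at T(group*n + n - 1) and is
        # stored in the input data storage circuit one cycle later.
        group_complete = group * n + n
        # First rising edge of clock S at or after group_complete:
        # the input control circuit drives all n lanes in parallel.
        launch = group_complete + (s_phase - group_complete) % n
        # Next rising edge of clock S: the output data storage circuit captures.
        capture = launch + n
        # The output control circuit serialises one item per cycle of clock F.
        arrivals[k] = capture + lane
    return arrivals


arrivals = wide_bus_arrivals(8)
print(arrivals)
```

With the default parameters, every data item arrives ten cycles of the clock F after it is output (D0 at T10 through D7 at T17), and consecutive items arrive one cycle apart, so the throughput remains 1 even though the wide signals switch only once per cycle of the clock S.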