A software-programmable semiconductor device includes a calculation system using a reconfigurable circuit such as an FPGA (Field Programmable Gate Array). The calculation system using the reconfigurable circuit stores circuit information of the reconfigurable circuit in a storage device such as a memory in advance, and reads the necessary information during system initialization, thereby forming an execution circuit. A calculation system using a dynamic reconfigurable circuit technique holds multiple pieces of such circuit information, and reads the necessary circuit information during operation in accordance with a rule defined by a program, thereby forming an execution circuit.
In general, the FPGA is a device in which the basic operation element is an LUT (Look-Up Table) handling fine-grained processing data (a small number of bits), and the dynamic reconfigurable circuit is a device in which the basic operation element is an operation device handling coarse-grained processing data (a large number of bits). These basic operation elements are connected to each other via programmable wires and switching devices, so that various kinds of arithmetic processings can be executed on a single semiconductor device. Accordingly, such devices are expected to reduce the development cost of dedicated hardware.
The specification of Japanese Patent No. 3528922 describes an array-type processor as a calculation system in which the hardware configuration can be changed by software. This array-type processor is configured such that many small-scale processor elements are arranged in a matrix form, and the hardware configuration can be altered by altering a program. In other words, in this array-type processor, one instruction code is selected for each processor element in accordance with an order defined by the program. The connection relationships between processor elements and the arithmetic processings of operation devices are controlled in accordance with the instruction code. As a result, many operation devices can execute complicated arithmetic processings in parallel.
In general, when complicated arithmetic processings are achieved with processor units in synchronization with a clock, operation devices execute simple operation codes over multiple clock cycles, and complicated arithmetic processings are achieved with a combination of simple calculations. An operation device of a processor unit such as a CPU can usually execute only one operation code in a single clock cycle, and therefore, the operation device consumes multiple clock cycles in accordance with the complexity of the arithmetic processing to be performed. Therefore, when a processor unit such as a CPU executes complicated arithmetic processings at a high speed, the processor unit has to operate at a high clock frequency, which causes a problem in that the power consumption increases.
The latency (delay in the circuit) of an operation device executing an operation supported by the processor unit varies in accordance with the type of operation code. Accordingly, the circuit operates at a clock frequency suited to the most complicated operation circuit (i.e., the one having the largest latency). In this case, when a simple operation code is executed, the operation device performs fewer operations per unit time than it is inherently capable of performing, and the original performance cannot be achieved. Therefore, a processor unit operating at a high clock frequency, such as a CPU, is designed to divide complicated operation codes across multiple clock cycles. Although this method improves the performance, there is a problem in that the power consumption increases due to the increase of the clock frequency.
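The relationship between operation latency and clock period described above can be illustrated with a minimal sketch. The operation names, the latency values, and the function names below are hypothetical and chosen only for illustration; they are not taken from any actual device.

```python
# Hypothetical per-operation latencies in nanoseconds (illustrative values only).
LATENCY_NS = {"and": 1.0, "add": 2.0, "mul": 8.0}


def clock_period_ns(latencies):
    """Single-cycle design: the period must cover the slowest operation."""
    return max(latencies.values())


def utilization(op, latencies):
    """Fraction of the clock period that a given operation actually needs."""
    return latencies[op] / clock_period_ns(latencies)


def multicycle_period_ns(latencies, cycles):
    """Idealized multi-cycle design: each operation is divided evenly over
    cycles[op] clock cycles, so the period only has to cover the longest
    resulting slice."""
    return max(latencies[op] / cycles[op] for op in latencies)


if __name__ == "__main__":
    print(clock_period_ns(LATENCY_NS))        # period set by "mul"
    print(utilization("and", LATENCY_NS))     # share of the period used by "and"
    print(multicycle_period_ns(LATENCY_NS, {"and": 1, "add": 1, "mul": 4}))
```

Under these assumed values, the simple operation uses only a small fraction of a period sized for the slowest operation, while dividing the slowest operation over four cycles shortens the period and therefore raises the clock frequency, which corresponds to the increase in power consumption noted above.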
On the other hand, in a reconfigurable circuit such as an FPGA, multiple operation elements are cascade-connected with programmable wires and switch devices, whereby complicated arithmetic processing can be achieved in a single clock cycle. Likewise, in the array-type processor described in Patent Document 1, multiple processor elements arranged in a matrix form are cascade-connected, so that complicated arithmetic processings can be programmed. As described above, multiple arithmetic processings can be executed in a single clock cycle using multiple operation elements, and therefore, arithmetic processing can be executed at a high speed even with a low clock frequency.
However, in a semiconductor device that achieves complicated arithmetic processings by using software to program the connection relationships between operation elements, the clock frequency is limited by the programmed arithmetic processing that takes the longest execution time.
Even with the same software, the execution time varies depending on how operation codes are mapped to operation devices and on the actual wiring between operation elements, and this increases the complexity of controlling the clock frequency for each program.
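The dependence of the clock frequency on the mapping and wiring can be illustrated with the following minimal sketch. It assumes a simplified delay model in which the delay of a cascade-connected chain is the sum of the operation latencies and the wire delays between adjacent elements; the latency values, wire delays, and function names are hypothetical.

```python
# Hypothetical operation latencies in nanoseconds (illustrative values only).
OP_LATENCY_NS = {"add": 2.0, "shift": 1.0, "and": 1.0}


def chain_delay_ns(ops, wire_delays_ns):
    """Delay of cascade-connected elements evaluated within one clock cycle.

    wire_delays_ns[i] is the assumed wire delay between element i and
    element i + 1, so it has one entry fewer than ops.
    """
    if len(wire_delays_ns) != len(ops) - 1:
        raise ValueError("expected one wire delay per connection")
    return sum(OP_LATENCY_NS[op] for op in ops) + sum(wire_delays_ns)


def min_clock_period_ns(chains):
    """The clock period must cover the slowest programmed chain."""
    return max(chain_delay_ns(ops, wires) for ops, wires in chains)


if __name__ == "__main__":
    ops = ["add", "shift", "and"]
    # Same software, two hypothetical placements with different wire lengths.
    print(chain_delay_ns(ops, [0.5, 0.5]))  # elements placed close together
    print(chain_delay_ns(ops, [1.5, 2.0]))  # elements placed far apart
```

In this model, the same sequence of operation codes yields different chain delays for the two placements, so the usable clock period differs from one mapping result to another even though the software is unchanged.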
Further, even though many operation elements can operate in parallel, it is difficult to use the same operation element over multiple clock cycles, which reduces the efficiency of use of the circuit.
Therefore, there is a need for a semiconductor device whose hardware configuration can be altered by software, in which the clock frequency at which the semiconductor device operates is guaranteed, and which can efficiently execute arithmetic processing formed from a combination of multiple operation codes.