Vector processing circuits are well known as circuits for performing computation processing of data. Vector processing circuits have generally been employed in supercomputers, and handle array-type data stored in a vector register file.
The vector processing circuits include multiple pipeline arithmetic units for performing desired computation processing according to a command, and perform multi-cycle operation at the pipeline arithmetic units. That is to say, a vector processing circuit divides the array data to be processed into multiple pieces of partial data, processes the partial data across multiple cycles, and occupies a pipeline arithmetic unit across those cycles until processing of the entire array data is completed.
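As a rough illustration of the multi-cycle operation described above, the following sketch splits array data into the partial data handled in each cycle. The function name `split_into_cycles` and the parameter `elements_per_cycle` are hypothetical; the number of array elements a pipeline arithmetic unit handles per cycle is not stated here, so the value 4 below is an assumption.

```python
def split_into_cycles(array_data, elements_per_cycle):
    """Return the partial data processed in each cycle, in order.

    elements_per_cycle is an assumed parameter, not a value from the text.
    """
    if elements_per_cycle <= 0:
        raise ValueError("elements_per_cycle must be positive")
    return [
        array_data[start:start + elements_per_cycle]
        for start in range(0, len(array_data), elements_per_cycle)
    ]


# Array data of 16 elements, assuming 4 elements are processed per cycle.
partials = split_into_cycles(list(range(16)), 4)

# Number of cycles during which the pipeline arithmetic unit stays occupied.
occupied_cycles = len(partials)
```

Under these assumptions, the pipeline arithmetic unit remains occupied for as many cycles as there are pieces of partial data.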
The size of array data, i.e., the number of array elements, is specified by the vector length (VL). The number of array elements specified by the vector length makes up one array register. The vector length is specified by a vector length register provided within the vector processing circuit. The size of each array element is assigned according to the data word length that the vector processing circuit handles. The data word length is specified by a command.
FIG. 1 is a diagram illustrating a configuration example of a vector register file for a supercomputer. The vector register file 101 illustrated in FIG. 1 has a structure of one double word (doubleword)×256 entries, and the vector length is 16 (VL=16).
With the example illustrated in FIG. 1, one array element 102 stores 64-bit data. One array register 103 is made up of 16 array elements 102, and the vector register file 101 is made up of 16 array registers 103. Each of the array registers 103 stores array data 104. A physical number (0 to 255) is assigned to each of the array elements, and a logical number (0 to 15) is assigned to each of the array registers. Access to the vector register file 101 is performed by specifying the logical number of an array register; the physical number of an array element is then generated based on the specified logical number and the vector length.
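The generation of physical numbers from a logical number and the vector length can be sketched as follows. This is a minimal sketch assuming that array registers are laid out contiguously, so that the head physical number is simply the logical number multiplied by the vector length; the function names are hypothetical and the actual address generation circuit is not described here.

```python
VECTOR_LENGTH = 16   # VL in the FIG. 1 example
NUM_ENTRIES = 256    # array elements in the vector register file


def head_physical_number(logical_number, vector_length=VECTOR_LENGTH):
    """Physical number of the head array element of an array register.

    Assumes a contiguous layout: head = logical number x vector length.
    """
    num_registers = NUM_ENTRIES // vector_length
    if not 0 <= logical_number < num_registers:
        raise ValueError("logical number out of range")
    return logical_number * vector_length


def element_physical_numbers(logical_number, vector_length=VECTOR_LENGTH):
    """Physical numbers of all array elements of one array register."""
    head = head_physical_number(logical_number, vector_length)
    return list(range(head, head + vector_length))
```

With VL=16, this assumed mapping places logical number 15 at physical numbers 240 to 255, which fills the 256-entry file exactly.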
Also, with the configuration example for a supercomputer in FIG. 1, the size of each array element is assigned with the maximum data word length (e.g., 64 bits) that the vector processing circuit handles as a basic unit. In the event that the data word length is half of that (e.g., 32 bits), a method has been employed wherein only the first-half 32 bits of each array element are used, and the second-half 32 bits are unused. For example, as illustrated in FIG. 1, in the event of the vector processing circuit handling a double precision processing command, the data word length is 64 bits, and all 64 bits of each array element are used. On the other hand, in the event of handling a single precision processing command, the data word length is 32 bits, and accordingly, only the first-half 32 bits of each array element are used.
Incidentally, with the vector processing circuits, at the time of command issuance, determination is made whether or not there is register interference between the preceding command and the subsequent command. This is because, in the event that the array register specified by the preceding command and the array register specified by the subsequent command overlap, the issuance timing of the two commands has to be suitably adjusted so that the processing results (array data) of the preceding command are correctly reflected in the processing of the subsequent command.
Usually, array data to be processed at each of the pipeline arithmetic units may be distinguished by the physical number (or logical number) alone of the head array element making up the corresponding array register. This is because all array data is made up of the same number of array elements, and the multiple array elements making up the corresponding array register are processed as a single unit for one command. For example, with the vector register file 101 illustrated in FIG. 1, the array data of each of the array registers (logical numbers 0 to 15) may be distinguished by the physical number (0, 16, 32, . . . , 240) of the head array element of each of the array registers.
Therefore, when determining whether or not there is register interference between the preceding command and the subsequent command, it has been common to compare the physical number of the head array element of the array register specified by the preceding command, and the physical number of the head array element of the array register specified by the subsequent command.
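The head-number comparison described above can be sketched as follows. This is a minimal sketch, assuming that the register specified by the preceding command is its destination, that the registers specified by the subsequent command are its operands, and that head physical numbers follow the contiguous layout of FIG. 1; the function names are hypothetical.

```python
VECTOR_LENGTH = 16  # VL in the FIG. 1 example


def head_physical_number(logical_number, vector_length=VECTOR_LENGTH):
    """Head physical number, assuming a contiguous register layout."""
    return logical_number * vector_length


def has_register_interference(preceding_logical, subsequent_logicals,
                              vector_length=VECTOR_LENGTH):
    """Compare head physical numbers only, as in the conventional method.

    preceding_logical: array register specified by the preceding command
        (assumed here to be its destination).
    subsequent_logicals: array registers specified by the subsequent
        command (assumed here to be its operands).
    """
    preceding_head = head_physical_number(preceding_logical, vector_length)
    return any(
        head_physical_number(n, vector_length) == preceding_head
        for n in subsequent_logicals
    )
```

A single equality comparison per operand suffices here only because every array register has the same size, so two registers either coincide entirely or do not overlap at all.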
Also, in the event that register interference has been detected between the preceding command and the subsequent command, a technique is known wherein the subsequent command is executed after being delayed by a certain number of cycles until the processing results of the preceding command are written in the register file.
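The delay technique can be illustrated with a simple issue-timing calculation. This is only a sketch under stated assumptions: at most one command is issued per cycle, and `write_latency` (a hypothetical parameter) is the number of cycles from issuance of the preceding command until its results are written in the register file.

```python
def issue_cycle_of_subsequent(preceding_issue_cycle, interferes, write_latency):
    """Cycle at which the subsequent command may be issued.

    Assumes one command can be issued per cycle when nothing interferes.
    write_latency is an assumed value, not taken from the text.
    """
    earliest = preceding_issue_cycle + 1
    if interferes:
        # Wait until the preceding results are written in the register file.
        return max(earliest, preceding_issue_cycle + write_latency)
    return earliest
```

Without interference the subsequent command follows immediately; with interference it waits for the write of the preceding results.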
For example, refer to Japanese Laid-open Patent Publication No. 06-110686, Japanese Laid-open Patent Publication No. 10-124313, and Japanese Examined Patent Application Publication No. 07-086838.
Heretofore, the vector processing circuits have generally been used in high-performance computing fields such as supercomputers, but in recent years, application of the vector processing circuits to signal processing fields such as wireless baseband processing has been studied. In this case, the vector processing circuits are assumed to be used in a processor for a built-in device, such as a DSP (Digital Signal Processor).
Also, in a field such as the above-mentioned wireless baseband processing, commands having different data word lengths are frequently mixed in a program, for example, a half word (Halfword) command of which the data word length is 16 bits, and a word (Word) command of which the data word length is 32 bits.
Even when the data word length differs between the preceding command and the subsequent command, determination has to be made regarding whether or not there is register interference. Therefore, it may be conceived to use the configuration example of the vector register file 101 for a supercomputer illustrated in FIG. 1. In this case, even when commands having different data word lengths are mixed, access to the vector register file is always performed with array registers of the same size as an access unit, and accordingly, determination regarding whether or not there is register interference may be executed by comparing the physical numbers alone of the head array elements of the array registers specified by the commands, regardless of the difference in data word length.
However, in the event of executing a command of which the data word length is a half size (half word command), the second-half portion of each of the array elements is not used, which substantially prevents half of the vector register file 101 from being used, and results in significant waste in the use of the registers. This is a major problem for a processor for a built-in device, on which only a register file with limited capacity is mounted and for which it is difficult to sufficiently expand the entire capacity of the vector register file.
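The waste can be quantified with simple arithmetic. The sketch below assumes, for illustration only, that each array element is sized for a 32-bit word command and that the file holds 256 array elements; the function names and the element count are hypothetical.

```python
def register_utilization(data_word_bits, element_bits):
    """Fraction of each array element actually used by a command."""
    if not 0 < data_word_bits <= element_bits:
        raise ValueError("data word must fit within one array element")
    return data_word_bits / element_bits


def wasted_bits(data_word_bits, element_bits, num_elements):
    """Total unused bits across the array elements of the register file."""
    return (element_bits - data_word_bits) * num_elements


# Half word command (16 bits) on array elements sized for a word (32 bits).
half_word_utilization = register_utilization(16, 32)

# Unused bits, assuming a hypothetical file of 256 array elements.
half_word_waste = wasted_bits(16, 32, 256)
```

Under these assumptions, a half word command uses half of each array element and leaves 4096 bits of the file unused.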
On the other hand, in the event that commands having different data word lengths are mixed, it may also be conceived to avoid the risk of register interference by unconditionally stalling the subsequent command, i.e., delaying its issuance until the processing of the preceding command is completed, regardless of whether or not there is register interference.
However, in this case, pipeline processing is substantially not executed in parallel at the pipeline arithmetic units, and accordingly, the advantage of the vector processing circuit cannot be exploited, and efficiency in command execution deteriorates. This is a major problem for a processor for a built-in device, on which only an arithmetic unit with limited processing capability is mounted and for which it is difficult to sufficiently expand the processing capability of the pipeline arithmetic unit.
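The cost of unconditional stalling can be illustrated with a deliberately simplified cycle count. This is a toy model, not a description of any actual circuit: it assumes every command occupies one pipeline arithmetic unit for the same number of cycles, that no commands interfere, and that issue offsets between commands are negligible. All names and values are hypothetical.

```python
import math


def total_cycles_serialized(num_commands, cycles_per_command):
    """Every command waits for the preceding command to complete."""
    return num_commands * cycles_per_command


def total_cycles_overlapped(num_commands, cycles_per_command, num_units):
    """Independent commands run concurrently on the available units.

    Ignores issue offsets between commands (a simplifying assumption).
    """
    rounds = math.ceil(num_commands / num_units)
    return rounds * cycles_per_command


# Four independent commands, four cycles each, two pipeline arithmetic units.
serialized = total_cycles_serialized(4, 4)
overlapped = total_cycles_overlapped(4, 4, 2)
```

In this model, unconditional stalling takes 16 cycles where overlapped execution takes 8, which is the loss of parallelism described above.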