1. Field of the Invention
The present invention relates to a parallel processor which simultaneously executes a plurality of instructions in one clock cycle.
2. Description of the Related Art
A parallel processor, typified by a superscalar processor or a VLIW (very long instruction word) processor, simultaneously executes a plurality of instructions in one clock cycle. In order to supply the data used by the plurality of instructions to the operational elements and to receive the resulting data from the operational elements, the parallel processor requires a register file that temporarily stores a large number of data items.
For example, in order to execute a single instruction such as addition (a+b=c), subtraction (a−b=c), or the like, three operands a, b, and c are required. Therefore, the register file must be a multi-port register file having three ports (input/output terminals) per instruction.
For example, in a parallel processor that simultaneously executes four instructions, the multi-port register file must have 12 ports. Furthermore, in a parallel processor that simultaneously executes eight instructions, a multi-port register file having 24 ports is required.
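The port counts stated above follow directly from the three operands required per instruction. As a minimal sketch of that arithmetic (the helper name `required_ports` and the constant `PORTS_PER_INSTRUCTION` are hypothetical and used only for illustration):

```python
# Hypothetical illustration: one port per operand, three operands per
# instruction (two sources a, b and one destination c).
PORTS_PER_INSTRUCTION = 3


def required_ports(instructions_per_cycle: int) -> int:
    """Return the port count of a multi-port register file that serves
    the given number of simultaneously executed instructions."""
    if instructions_per_cycle < 0:
        raise ValueError("instructions_per_cycle must be non-negative")
    return PORTS_PER_INSTRUCTION * instructions_per_cycle


if __name__ == "__main__":
    for n in (1, 4, 8):
        print(f"{n} instruction(s) per cycle -> {required_ports(n)} ports")
```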
In general, the required packaging area of a register file having a given storage size increases in proportion to the square of the number of ports (input/output terminals). A parallel processor that includes such a multi-port register file therefore becomes bulky as a whole. As a result, not only does the manufacturing cost rise, but the wiring length also increases, which degrades characteristics of the processor such as its operation speed. For this reason, the number of instructions to be executed simultaneously cannot be easily increased.
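To give a sense of scale, the square-law relationship can be sketched as follows. This assumes purely quadratic scaling with no constant factors or other overheads, and takes a 3-port register file (one instruction per cycle) as the baseline; the helper name `relative_area` is hypothetical.

```python
def relative_area(ports: int, baseline_ports: int = 3) -> float:
    """Return the packaging area relative to a baseline register file of
    the same storage size, assuming area grows as the square of the
    number of ports (an idealized model, not a measured figure)."""
    if ports <= 0 or baseline_ports <= 0:
        raise ValueError("port counts must be positive")
    return (ports / baseline_ports) ** 2


if __name__ == "__main__":
    for ports in (3, 12, 24):
        print(f"{ports} ports -> {relative_area(ports):.0f}x baseline area")
```

Under this idealized model, moving from one instruction per cycle to four or eight instructions per cycle multiplies the register file area by 16 or 64, respectively.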
To solve such a problem, the following measures (a) and (b) have been examined for a register file to be incorporated in a parallel processor.
(a) The number of ports is increased by adopting a plurality of copies of a register file.
(b) A multi-bank register file is adopted.
However, if measure (a) is taken, the number of ports can be increased, but problems such as an increase in packaging area remain unsolved.
In contrast, if measure (b) is taken, the number of ports per bank to/from which data is input/output can be fixed to a minimum. Furthermore, since respective data can be stored in different registers by switching banks, the packaging area can be greatly reduced compared to a conventional multi-port register file.
However, since a parallel processor simultaneously executes a plurality of instructions, the instructions to be executed simultaneously are likely to access registers that belong to an identical bank, and access delay may increase due to bank access contention. Therefore, the number of instructions to be executed simultaneously cannot be easily increased. The frequency of occurrence of bank access contention can be reduced by increasing the number of banks. However, if the number of banks is increased, the required storage size of the whole multi-bank register file increases undesirably.
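Bank access contention can be illustrated with a small sketch. The mapping of a register to a bank by `register % num_banks`, the default of one port per bank, and the helper name `bank_conflicts` are all assumptions made for illustration; the text above does not specify how registers are assigned to banks.

```python
from collections import Counter
from typing import Iterable


def bank_conflicts(registers: Iterable[int], num_banks: int,
                   ports_per_bank: int = 1) -> int:
    """Count the accesses in one cycle that exceed a bank's port capacity.

    Assumption: register r lives in bank (r % num_banks). Every access is
    counted separately, even when two accesses name the same register.
    """
    if num_banks <= 0 or ports_per_bank <= 0:
        raise ValueError("num_banks and ports_per_bank must be positive")
    accesses_per_bank = Counter(r % num_banks for r in registers)
    return sum(max(0, count - ports_per_bank)
               for count in accesses_per_bank.values())


if __name__ == "__main__":
    same_cycle_accesses = [0, 4, 8, 1]  # registers touched in one cycle
    for banks in (4, 8, 16):
        conflicts = bank_conflicts(same_cycle_accesses, banks)
        print(f"{banks} banks -> {conflicts} conflicting access(es)")
```

For this access pattern the number of conflicting accesses falls as banks are added, which reflects the trade-off described above: fewer conflicts at the cost of a larger multi-bank register file.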