Parallel processing of computational instructions has been used as a technique for boosting the performance of processors such as central processing units (CPUs). A plurality of execution units integrated in a processor execute computational operations concurrently. Some such processors employ, for example, reservation stations to make more efficient use of the execution units. With this feature, a plurality of decoded instructions are distributed to the reservation stations associated with the respective execution units, so that the instructions are executed out of order under the control of those reservation stations.
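As a rough illustration of the out-of-order execution described above, the following Python sketch models a much simplified case. It assumes a single pool of waiting instructions rather than one reservation station per execution unit, an unlimited number of execution units, and no register renaming. The names Instr and issue_order are hypothetical and are introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str     # label of the decoded instruction
    dest: str     # destination register
    srcs: tuple   # source registers (two for a typical ALU operation)


def issue_order(instrs, ready_regs):
    """Return instruction names in the order they issue.

    In each cycle, every waiting instruction whose source registers are all
    available issues; its destination register becomes available to the
    instructions considered in the next cycle.
    """
    ready = set(ready_regs)
    waiting = list(instrs)
    order = []
    while waiting:
        # Readiness is evaluated once per cycle, before any result is written.
        issuable = [i for i in waiting if all(s in ready for s in i.srcs)]
        if not issuable:
            raise ValueError("no instruction can issue; an operand is missing")
        for i in issuable:
            order.append(i.name)
            ready.add(i.dest)
        waiting = [i for i in waiting if i not in issuable]
    return order


program = [
    Instr("i0", "r3", ("r1", "r2")),
    Instr("i1", "r4", ("r3", "r1")),  # depends on the result of i0
    Instr("i2", "r5", ("r1", "r2")),  # independent of i0 and i1
]
print(issue_order(program, {"r1", "r2"}))  # ['i0', 'i2', 'i1']
```

In this sketch, i2 issues ahead of i1 although it comes later in program order, because its operands are already available while i1 is still waiting for the result of i0.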
Higher performance is achieved as the parallelism of computation is enhanced by integrating more execution units in the same processor. More execution units, however, mean more ports and wire lines for those execution units to exchange data with a register file. The consequent increase in wiring space leads to longer wires and larger propagation delays of signals between the execution units and the register file, thus degrading the performance of the processor.
Several techniques have been proposed to limit the increase in write paths to the register file. For example, one proposed technique determines, based on the decoding result of an instruction, which execution unit is to execute the instruction and in which register the computational result is to be written. See, for example, the following patent literature:
Japanese Laid-open Patent Publication No. 2004-38751
Japanese Laid-open Patent Publication No. 10-91442
Generally, execution units such as adders, subtractors, and logical operators perform a specific computation on two source values and output one computational result. For this purpose, there are two read paths from the register file to each execution unit, and one write path from each execution unit to the register file. This means that two read paths, but only one write path, are added each time one execution unit is added. In other words, the problem of increased wiring space and propagation delays is more prominent in the read paths than in the write paths.
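The path counts stated above can be tabulated with a short Python sketch. It assumes that every execution unit has exactly two source operands and one result, as in the general case described, and that paths are not shared between units. The function name path_counts is hypothetical.

```python
def path_counts(num_units, reads_per_unit=2, writes_per_unit=1):
    """Return (read_paths, write_paths) between the register file and
    num_units execution units, assuming dedicated paths for each unit."""
    return num_units * reads_per_unit, num_units * writes_per_unit


for n in (1, 2, 4, 8):
    reads, writes = path_counts(n)
    print(f"{n} units: {reads} read paths, {writes} write paths")
```

Under these assumptions the number of read paths is always twice the number of write paths, so each additional execution unit widens the gap by one more path.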