1. Field of the Invention
Apparatuses and methods consistent with the present invention relate to data processing systems and methods, and more particularly, to data processing systems and methods capable of dynamically controlling the number of rotating register files for a software pipelined loop.
2. Description of the Related Art
In general, a loop with a given number of iterations can be completed in fewer cycles when software pipelining is applied, so that different iterations are executed in parallel, than when the iterations are executed sequentially, one after another.
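The cycle savings can be illustrated with a minimal sketch. It assumes a simplified model that the text does not state: every iteration takes a fixed number of cycles (`latency`), and a pipelined loop starts a new iteration every `ii` cycles (the initiation interval). The function names and parameters are hypothetical.

```python
def sequential_cycles(iterations: int, latency: int) -> int:
    """Cycles when each iteration starts only after the previous one finishes."""
    return iterations * latency


def pipelined_cycles(iterations: int, latency: int, ii: int) -> int:
    """Cycles when a new iteration starts every `ii` cycles (assumed model).

    The last iteration starts at cycle (iterations - 1) * ii and then needs
    `latency` cycles to finish.
    """
    if iterations == 0:
        return 0
    return (iterations - 1) * ii + latency


if __name__ == "__main__":
    # 100 iterations of a 4-cycle loop body, one new iteration per cycle.
    print(sequential_cycles(100, 4))    # 400
    print(pipelined_cycles(100, 4, 1))  # 103
```

Under these assumptions the pipelined loop approaches one iteration per `ii` cycles once the pipeline is full.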
However, when software pipelining is applied, the lifetimes of the same variable in different iterations can overlap, which causes a conflict over the register in use. For example, as shown in FIG. 1A, if a value created by OP1 is used by OP2, the value can be passed through a register r13. However, the lifetime of the value stored in the register r13 in the nth iteration of the loop overlaps with its lifetime in the (n+1)th iteration. Accordingly, the value created by OP1 of the (n+1)th iteration is written to the register r13 before OP2 of the nth iteration has used the value created by OP1 of the nth iteration, so that OP2 of the nth iteration uses an incorrect value.
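The conflict can be reproduced with a small simulation. This is a sketch under assumed timing that FIG. 1A may not match exactly: OP1 of iteration i writes r13 at cycle `i * ii`, OP2 of the same iteration reads r13 `use_delay` cycles later, and reads are served before writes within a cycle. All names are hypothetical.

```python
def run_without_renaming(iterations: int, ii: int = 1, use_delay: int = 2) -> dict:
    """Return, per iteration, the value OP2 actually reads from the single register r13."""
    READ, WRITE = 0, 1  # within one cycle, reads are ordered before writes (assumption)
    events = []
    for i in range(iterations):
        events.append((i * ii, WRITE, i))             # OP1 of iteration i writes r13
        events.append((i * ii + use_delay, READ, i))  # OP2 of iteration i reads r13
    events.sort()

    r13 = None
    observed = {}
    for _cycle, kind, iteration in events:
        if kind == WRITE:
            r13 = f"v{iteration}"      # value produced by OP1 of this iteration
        else:
            observed[iteration] = r13  # value consumed by OP2 of this iteration
    return observed


if __name__ == "__main__":
    # Iteration 0 should read "v0" but sees "v1", written by the next iteration.
    print(run_without_renaming(3))  # {0: 'v1', 1: 'v2', 2: 'v2'}
```

With `use_delay` greater than `ii`, every iteration except the last reads the value written by a later iteration, which is the incorrect-value problem described above.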
To solve this problem, register renaming is needed. Register renaming methods include the ‘Modulo Variable Expansion (MVE)’ method, which supports register renaming in software, and methods that support register renaming in hardware by using a rotating register file.
FIG. 1B is a view showing the use of a rotating register file. In FIG. 1B, the sum of a logical register number specified in an instruction and a value (RRB: Rotation Register Base) that is stored in a base register and corresponds to the current iteration count is used as the new register number. Here, the RRB value is incremented or decremented by one for every iteration in a wrap-around manner.
For example, in FIG. 1B, if the RRB is 7 in the nth iteration, the RRB in the (n+1)th iteration becomes 8. Therefore, the logical register r13 is mapped to a physical register r20 in the nth iteration and to a physical register r21 in the (n+1)th iteration. The value created by OP1 of the nth iteration and the value created by OP1 of the (n+1)th iteration are thus written to different physical registers r20 and r21, respectively, and the above-mentioned problem is solved.
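The mapping in this example can be written out as a short sketch. The size of the rotating register file (`num_rotating`, 64 here) is an assumption made only to demonstrate the wrap-around; the text does not give it, and the function name is hypothetical.

```python
def physical_register(logical: int, rrb: int, num_rotating: int) -> int:
    """Map a logical register number to a physical one: (logical + RRB), wrapping around."""
    return (logical + rrb) % num_rotating


if __name__ == "__main__":
    NUM_ROTATING = 64  # assumed size of the rotating register file

    # nth iteration: RRB = 7, so logical r13 maps to physical r20.
    print(physical_register(13, 7, NUM_ROTATING))   # 20
    # (n+1)th iteration: RRB = 8, so logical r13 maps to physical r21.
    print(physical_register(13, 8, NUM_ROTATING))   # 21
    # Wrap-around: 13 + 60 = 73, which wraps to 9 in a 64-entry file.
    print(physical_register(13, 60, NUM_ROTATING))  # 9
```

Because consecutive iterations see different RRB values, the same logical register name in the instruction resolves to different physical registers, so overlapping lifetimes no longer collide.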
Conventionally, however, the number of static registers and the number of rotating registers forming a register file are fixed in hardware at design time. Since the numbers of static and rotating registers needed differ from one program loop to another, the registers required while a loop executes may be insufficient. In this case, system performance is degraded, since spill/fill code is generated to temporarily move the values stored in registers to a memory and then read the values back into the registers.
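A toy model makes the shortfall concrete. It assumes, purely for illustration, that every value a loop needs beyond the fixed number of static or rotating registers must be spilled to memory; the register counts and function names are hypothetical and are not taken from the text.

```python
def spilled_values(fixed_static: int, fixed_rotating: int,
                   need_static: int, need_rotating: int) -> int:
    """Count values that do not fit in a fixed static/rotating partition (toy model)."""
    static_shortfall = max(0, need_static - fixed_static)
    rotating_shortfall = max(0, need_rotating - fixed_rotating)
    return static_shortfall + rotating_shortfall


def fits_in_total(total_registers: int, need_static: int, need_rotating: int) -> bool:
    """Check whether the loop would fit if the partition were not fixed."""
    return need_static + need_rotating <= total_registers


if __name__ == "__main__":
    # A 32-entry register file split 16/16 in hardware (assumed numbers).
    # A loop needing 8 static and 24 rotating registers spills 8 values,
    # even though 32 registers would be enough in total.
    print(spilled_values(16, 16, 8, 24))  # 8
    print(fits_in_total(32, 8, 24))       # True
```

In this model the spills arise from the fixed split rather than from the total register count.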
In particular, in the coarse-grain loop accelerator 40 shown in FIG. 2, not all the data processing cells contain a load/store unit that loads data from the memory 45 or stores data in the memory 45. Accordingly, when spill/fill code is generated for a distributed register file (RF) of a data processing cell 41 that does not contain a load/store unit, the performance of the accelerator 40 is severely degraded, since the data processing cell 41 must load or store data from or into the memory 45 through a data processing cell 42 equipped with a load/store unit.