In a processor, the instructions of a program are stored in an instruction memory. If the program has a repetitive structure in which one or more instructions are used repeatedly, a conditional branch instruction can be used to form a loop program and thereby reduce the required capacity of the instruction memory. However, using the conditional branch instruction incurs an overhead each time the program returns from the loop end to the loop start (or exits the loop).
The overhead is particularly prominent in a pipeline processor. When the conditional branch is taken, the instructions already fetched into the pipeline are invalidated, and the instruction at the branch destination must be newly fetched.
Therefore, processors generally provide a function in which a loop instruction sets a loop start address, a loop end address, and a loop count in registers, and the hardware manages and executes the loop processing without the overhead. This function is called a hardware loop function, a zero overhead loop function, a zero delay loop function, a loop instruction function, or the like. In this specification, these functions are collectively called a hardware loop function unless otherwise specifically stated.
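The difference between the two approaches can be illustrated with a rough cycle count. The following is a minimal sketch under assumptions that do not come from the related art: every instruction takes one cycle, each taken conditional branch additionally discards `flush_penalty` already-fetched instructions, a branch that is not taken costs no penalty, and the loop instruction itself costs one cycle. The function and parameter names are hypothetical.

```python
def cycles_with_branch(body_len: int, iterations: int, flush_penalty: int) -> int:
    """Cycle estimate for a loop closed by a conditional branch (assumed model)."""
    # Each iteration executes the loop body plus one conditional branch instruction.
    executed = iterations * (body_len + 1)
    # The branch is taken on every iteration except the last, and each taken
    # branch is assumed to invalidate flush_penalty instructions in the pipeline.
    flushed = (iterations - 1) * flush_penalty
    return executed + flushed


def cycles_with_hardware_loop(body_len: int, iterations: int) -> int:
    """Cycle estimate for the same loop using a hardware loop function (assumed model)."""
    # One loop instruction up front, then only the loop body on each iteration.
    return 1 + iterations * body_len


branch_cycles = cycles_with_branch(body_len=4, iterations=100, flush_penalty=2)
hw_loop_cycles = cycles_with_hardware_loop(body_len=4, iterations=100)
print(branch_cycles, hw_loop_cycles)
```

Under these assumed numbers, the branch instruction and the pipeline flushes account for the whole difference between the two estimates. Actual figures depend on the pipeline depth and branch handling of the particular processor.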
FIG. 1 shows a configuration of a related processor with the hardware loop function. Details of the related art are disclosed in Patent Literature 1 and Patent Literature 2.
As shown in FIG. 1, processor 1002 of the related art includes program counter 100, instruction memory 200, instruction decoder 300, calculator 400, data memory 500, loop counter 600, and loop controller 700.
Program counter 100 notifies instruction memory 200 and loop controller 700 of an instruction address of an instruction to be issued. Usually, program counter 100 sequentially increments the instruction address and notifies the incremented instruction address. However, if calculator 400 or loop controller 700 sets an instruction address jump destination described later, program counter 100 instead notifies the instruction address of the set jump destination.
Instruction memory 200 fetches an instruction according to the instruction address notified by program counter 100 and issues the fetched instruction to instruction decoder 300.
Instruction decoder 300 decodes the instruction issued by instruction memory 200 and notifies calculator 400 of the decoded instruction as a calculation control signal. If a loop instruction is issued, instruction decoder 300 sets the loop start address and the loop end address to loop controller 700.
Calculator 400 performs various calculations according to the calculation control signal notified by instruction decoder 300. Calculator 400 loads data necessary for the calculations from data memory 500 into an internal register file and uses the data to perform the calculations. Calculator 400 can also store a calculation result in data memory 500. If the calculation control signal obtained by decoding the loop instruction is notified, calculator 400 sets the loop count to loop counter 600. If the calculation control signal obtained by decoding the conditional branch instruction is notified, calculator 400 can branch the program progress by setting the instruction address jump destination to program counter 100 when a state (for example, a register file value or a data transfer completion notification signal notified by DMAC (Direct Memory Access Controller) 3000) coincides with a condition defined by the calculation control signal.
Data memory 500 stores data from calculator 400 and loads data into calculator 400. Data can be transferred between data memory 500 and external memory 4000 outside of the processor through external bus 2000. DMAC 3000 manages the data transfer based on a DMAC setting input from an external device including processor 1002.
If the instruction address notified by program counter 100 coincides with the loop start address set by instruction decoder 300, loop controller 700 notifies loop counter 600 of a decrement signal.
Loop counter 600 handles the loop count set by calculator 400 as an initial value of the loop count value. Every time the decrement signal is notified by loop controller 700, loop counter 600 decrements the loop count value by 1 and notifies loop controller 700 of the decremented loop count value.
If the instruction address notified by program counter 100 coincides with the loop end address set by instruction decoder 300 and if the loop count value notified by loop counter 600 is not 0, loop controller 700 notifies program counter 100 of the loop start address as the instruction address jump destination.
If the instruction address notified by program counter 100 coincides with the loop end address set by instruction decoder 300 and if the loop count value notified by loop counter 600 is 0, loop controller 700 notifies program counter 100 of an instruction address following the loop end address as the instruction address jump destination.
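The cooperation of loop counter 600 and loop controller 700 described above can be sketched in software as follows. This is an illustrative model only, not the circuit of the related art. It assumes that each instruction occupies one address, so that "the instruction address following the loop end address" is `loop_end + 1`, and that the loop count set before the loop is at least 1. The class, function, and parameter names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HardwareLoop:
    """Toy model of loop counter 600 and loop controller 700."""
    loop_start: int  # set by the instruction decoder when the loop instruction is decoded
    loop_end: int    # set by the instruction decoder when the loop instruction is decoded
    count: int       # loop count value, initialized with the loop count from the calculator

    def next_address(self, pc: int) -> int:
        """Return the next instruction address for the program counter."""
        # Loop start address reached: the controller issues the decrement signal.
        if pc == self.loop_start:
            self.count -= 1
        # Loop end address reached: jump back unless the loop count value is 0.
        if pc == self.loop_end:
            if self.count != 0:
                return self.loop_start
            return self.loop_end + 1  # assumption: one address per instruction
        # Otherwise the program counter simply increments.
        return pc + 1


def run(loop: HardwareLoop, entry: int, exit_addr: int) -> List[int]:
    """Record the sequence of instruction addresses issued until exit_addr is reached."""
    trace = []
    pc = entry
    while pc != exit_addr:
        trace.append(pc)
        pc = loop.next_address(pc)
    return trace


# Addresses 2 to 4 form the loop body and the loop count is 3.
trace = run(HardwareLoop(loop_start=2, loop_end=4, count=3), entry=0, exit_addr=6)
print(trace)
```

In this model the loop body at addresses 2 to 4 is issued three times with no branch instruction in the instruction stream, because the decrement occurs at the loop start address and the completion check occurs at the loop end address.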
In this way, according to the related art, the loop count needs to be identified before the loop instruction is issued, and calculator 400 needs to hold the information of the loop count.
However, in some applications, the loop count depends on the amount of data processed in the loop, and the information about that amount is transferred from external memory 4000 to data memory 500 only after the transfer of the data itself has been completed.
In such an application, loop processing cannot be started until both the data processed in the loop and the information about the amount of that data have been transferred to data memory 500. Because all of the data must be held in data memory 500 until the loop starts, this leads to an increase in the required capacity of data memory 500 and an increase in the delay of the process.
FIG. 2 shows a processing flow in processor 1002 of the related art shown in FIG. 1 when the information about the amount of data processed in the loop is transferred after the data itself.
As shown in FIG. 2, the data used for the process in the loop is transferred from external memory 4000 to data memory 500 (step S101), and then the information about the amount of that data is transferred (step S102).
After checking that the transfer of the information about the amount of data has been completed, calculator 400 calculates the loop count based on that amount (step S103). The completion of the transfer can be checked by means of the data transfer completion notification signal notified by DMAC 3000.
Subsequently, if the loop instruction is issued (Yes in step S104), the loop count calculated by calculator 400 is set to loop counter 600 as the initial value of the loop count value. The loop start address and the loop end address are set to loop controller 700 (step S105).
After the issue of the loop instruction, the program proceeds until the instruction address coincides with the loop start address.
If the instruction address coincides with the loop start address (Yes in step S106), loop controller 700 notifies loop counter 600 of the decrement signal, and loop counter 600 reduces the loop count value by 1 (step S107). Subsequently, calculator 400 advances the process in the loop until the instruction address reaches the loop end address (step S108).
If the instruction address coincides with the loop end address (Yes in step S109), loop controller 700 executes a loop completion determination process.
If the loop count value is not 0 (No in step S110), loop controller 700 determines that the loop processing is not completed and notifies program counter 100 of the loop start address as the instruction address jump destination (step S111).
On the other hand, if the loop count value is 0 (Yes in step S110), loop controller 700 determines that the loop processing is completed and notifies program counter 100 of the instruction address that follows the loop end address as the instruction address jump destination (step S112).
In this way, the process in the loop can be executed based on the set loop count.
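The flow of FIG. 2 can be sketched as follows to show why the whole data set must be buffered before the loop starts. This is a minimal illustration under stated assumptions: the transfers are modeled as a list of `("data", values)` events followed by one `("amount", n)` event, the loop count is assumed to be the amount divided by a fixed number of items per iteration, and summing each chunk stands in for the unspecified process in the loop. All names are hypothetical.

```python
def related_art_flow(transfers, items_per_iteration):
    """Model steps S101 to S112 for a stream of transfer events in arrival order."""
    data_memory = []  # stands in for data memory 500
    amount = None

    # S101, S102: all data arrives first, and the amount information arrives last.
    for kind, value in transfers:
        if kind == "data":
            data_memory.extend(value)
        elif kind == "amount":
            amount = value

    # S103: the loop count can only be calculated once the amount is known.
    # Assumption: the amount is a multiple of items_per_iteration.
    loop_count = amount // items_per_iteration
    peak_buffered = len(data_memory)  # everything is buffered before the loop starts

    # S104 to S112: the loop then runs for the precomputed loop count.
    results = []
    for i in range(loop_count):
        chunk = data_memory[i * items_per_iteration:(i + 1) * items_per_iteration]
        results.append(sum(chunk))  # placeholder for the process in the loop

    return loop_count, peak_buffered, results


transfers = [("data", [1, 2, 3, 4]), ("data", [5, 6, 7, 8]), ("amount", 8)]
loop_count, peak_buffered, results = related_art_flow(transfers, items_per_iteration=2)
print(loop_count, peak_buffered, results)
```

In this sketch no iteration can begin until the final `("amount", n)` event has been received, so the peak number of buffered items equals the full data amount, which corresponds to the capacity and delay problem described above.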