1. Field of the Invention
The present invention relates to a processor which can execute a loop program having a loop instruction including a repetitive access to data stored in a memory, and a loop program control device which can execute a control to make a plurality of processors execute, in parallel, respective loops of a loop process of a loop instruction. Further, the present invention is concerned with a multiprocessor system which includes a plurality of processors and a loop program control device as described above.
Recently, it has been required to execute high-speed, high-performance processing in a computer. Such a requirement can be achieved by parallel processing of instructions or the like.
A typical approach to the parallel processing of instructions is a multiprocessor system in which a plurality of processors execute programs in parallel.
2. Description of the Related Art
A description will be given of a conventional single processor system which executes a loop program including a loop instruction including a repetitive access to data stored in a memory, and a conventional multiprocessor system.
As an example of the loop program including the loop instruction, a description will be given of a case where the following program is executed by a conventional single processor system or a conventional multiprocessor system.
The above loop program is a program that includes a loop instruction and repetitive access to data stored in a memory.
The loop program has an initial setting in which an immediate value of 00h (h denotes hexadecimal notation) is written into an AR0 register (instruction #1), and an immediate value of 80h is written into an AR1 register (instruction #2).
Instruction #3 loads data to an R0 register from address 00h indicated by the AR0 register, and increments AR0 after the loading. That is, 04h is written into the AR0 register. The data consists of 32 bits. Instruction #4 loads data to an R1 register from address 04h indicated by the AR0 register, and increments AR0 after the loading. That is, 08h is stored in the AR0 register. The data consists of 32 bits.
Instruction #5 stores the result of an adding operation on R0 and R1 in the R1 register.
Instruction #6 stores data in the R1 register in a memory area indicated by address 80h stored in the AR1 register, and then updates the address by incrementing it. That is, the incremented address in the register AR1 becomes 84h. The data consists of 32 bits.
Instruction #7 jumps execution to label1 and causes instructions #3-#7 to be repeatedly executed until a variable num becomes equal to 4 (num=4). The variable num has an initial value of 0, and is incremented each time the loop instruction LOOP causes a jump.
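The behavior of instructions #1-#7 described above can be sketched as follows. This is only an illustrative model: the function name, the dictionary used as a byte-addressed memory, the sample data, and the 32-bit wraparound of the addition are assumptions, and the loop is taken to run five times (the zeroth-loop execution to the fourth-loop execution).

```python
WORD = 4  # each datum is 32 bits, so addresses advance by 4 bytes


def run_loop_program(memory, loops=5):
    """Model instructions #1-#7; `memory` maps a byte address to a 32-bit word."""
    ar0 = 0x00  # instruction #1: AR0 <- 00h
    ar1 = 0x80  # instruction #2: AR1 <- 80h
    for _ in range(loops):  # instruction #7: repeat instructions #3-#6
        r0 = memory[ar0]  # instruction #3: load R0, then increment AR0
        ar0 += WORD
        r1 = memory[ar0]  # instruction #4: load R1, then increment AR0
        ar0 += WORD
        r1 = (r0 + r1) & 0xFFFFFFFF  # instruction #5 (wraparound is assumed)
        memory[ar1] = r1  # instruction #6: store R1, then increment AR1
        ar1 += WORD
    return ar0, ar1


# Hypothetical read data: the word at address a holds the value a // 4.
memory = {addr: addr // WORD for addr in range(0x00, 0x28, WORD)}
final_ar0, final_ar1 = run_loop_program(memory)
```

With this sample data the read accesses cover word addresses 00h to 24h and the write accesses cover 80h to 90h, which matches the read and write areas described for FIG. 1.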
The above loop program including the loop instruction is executed by the conventional single processor or the conventional multiprocessor system as follows. A data memory space for the loop program is configured as shown in FIG. 1. More particularly, the data memory space includes a read (load) data area related to the zeroth-loop execution of the loop process to the fourth-loop execution thereof, and a write (store) data area. The read data area is accessed by data addresses 0000h-0024h, and the write data area is accessed by data addresses 0080h-0093h.
The loop instruction LOOP is executed by the single processor, as shown in FIG. 2. The single processor system time-serially executes respective loops of the loop process five times (the zeroth-loop execution to the fourth-loop execution). The single processor accesses the memory space shown in FIG. 1 each time a loop of the loop process is executed.
The loop instruction LOOP can also be executed by the multiprocessor system, as shown in FIG. 3. As shown in FIG. 3, the loop process of the loop instruction LOOP is separated into the respective loop processes by a compiler, and the processors execute the respective loops in parallel. In this case, the loops executed by the respective processors are assigned to areas of an instruction memory that are accessible by the processors at the time of compiling. For example, in FIG. 3, processor (0) is involved with the zeroth-loop execution of the loop process, and processor (1) is involved with the first-loop execution thereof. Similarly, processor (2) is involved with the second-loop execution of the loop process, and processor (3) is involved with the third-loop execution thereof. Further, processor (4) is involved with the fourth-loop execution of the loop instruction. The process of the loop instruction is separated into the respective loops by the compiler, and the respective loops are assigned to the processors. Hence, it is not necessary to serially execute the loop processes. Thus, the branch instruction LOOP is not needed.
The conventional multiprocessor system has high performance when the processors respectively execute different programs. However, the conventional multiprocessor system does not have high performance when a single program is segmented and executed.
More particularly, the conventional multiprocessor system employs a scheduling method in order to process the program including the loop instruction in parallel. At the time of compiling, the loop process of the loop instruction of the program is separated into the respective loops, and the processors are respectively scheduled to execute the loops. In other words, the processors are scheduled to be assigned to the respective accessible instruction memory areas. Hence, the conventional multiprocessor system is required to store the program for each of the loops and thus has a huge memory area. This increases the cost in practice.
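The memory cost of the conventional scheduling method can be illustrated with a small sketch. The tuple-based instruction encoding, the mnemonics, and the function name below are hypothetical; the sketch only shows that separating the loop process at compile time yields one copy of the loop body, with fixed pre-computed addresses, for every loop.

```python
def separate_loops(total_loops, read_base=0x00, write_base=0x80, word=4):
    """Emit one straight-line program per loop, as a compiler might do."""
    programs = []
    for k in range(total_loops):
        src = read_base + k * 2 * word  # two words are read per loop
        dst = write_base + k * word     # one word is written per loop
        programs.append([
            ("load", "R0", src),
            ("load", "R1", src + word),
            ("add", "R1", "R0", "R1"),
            ("store", "R1", dst),
        ])
    return programs


programs = separate_loops(5)
# Total instructions held in instruction memory across all processors.
stored_instructions = sum(len(p) for p in programs)
```

In this model the instruction memory holds the four-instruction body once per loop, so its size grows in proportion to the number of loops, whereas a single shared copy of the loop program would not.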
It is a general object of the present invention to eliminate the above disadvantages.
A more specific object of the present invention is to provide a multiprocessor system capable of executing a loop process in a program in parallel by processors without an increased memory area.
Another object of the present invention is to provide a processor and a loop program control device applicable to the above-mentioned multiprocessor system.
The above objects of the present invention are achieved by a processor which can execute a loop program including a loop instruction, the processor comprising: an addressing unit which generates a data address with which data can be read from a memory during execution of the loop instruction, the above data address including information indicative of which loop of a loop process defined by the loop instruction should be executed, the information forming part of the data address. As has been described previously, the prior art multiprocessor system separates a loop process of a loop instruction into respective loops, which are then stored in a memory. Hence, the prior art multiprocessor system needs an extremely large memory space. In contrast, the present invention makes it possible for the processor to recognize which loop of the loop process should be executed. Hence, it is no longer required to separate the loop process of the loop instruction into the respective loop processes defined thereby. The present invention loads the loop program stored in an instruction memory and recognizes which loop of the loop process should be executed. In this case, data to be processed can be obtained by the data address including the information indicative of which loop of the loop process should be executed.
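One way to picture a data address that contains the loop information is to place the loop number in a bit field of the address. The field layout, the parameter names, and the shift widths below are assumptions chosen to match the example memory map (8 bytes read and 4 bytes written per loop); the text does not specify the actual bit layout.

```python
def data_address(base, loop_number, offset, loop_shift):
    """Form an address from a base, a loop-number field, and a byte offset.

    Assumed layout: the loop number sits directly above `loop_shift` offset bits,
    and the base has no bits set in either field.
    """
    if not 0 <= offset < (1 << loop_shift):
        raise ValueError("offset overlaps the loop-number field")
    return base | (loop_number << loop_shift) | offset


# Fourth-loop execution: two reads (offsets 0 and 4) and one write.
read_addresses = [data_address(0x00, 4, off, 3) for off in (0, 4)]
write_address = data_address(0x80, 4, 0, 2)
```

Under these assumptions a processor that is told only "execute loop 4" can derive addresses 20h, 24h, and 90h from the same shared loop program, without a separately compiled copy of the loop body.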
The above objects of the present invention are also achieved by a processor which can execute a loop program including a loop instruction, the processor comprising: an addressing unit which generates a data address with which data can be read from a memory during execution of the loop instruction, the above data address including information indicative of which loop of a loop process defined by the loop instruction should be executed, the information forming part of the data address; and an increment unit which automatically updates the information after the loop is executed, the updated information forming part of the data address so that a next data address can be generated.
The processor may be configured so that: the updated information indicates the number of loops of the loop process that have been executed; the processor further comprises a comparator unit which determines whether the number of loops indicated by the updated information exceeds a given number of loops; and the loop process continues to be executed until the number of loops indicated by the updated information exceeds the given number of loops.
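The interplay of the increment unit and the comparator unit might be modeled as below. The function and parameter names are hypothetical, and the loop body is passed in as a callback purely for illustration.

```python
def run_loops(first_loop, given_number, body):
    """Execute loops until the updated loop information exceeds `given_number`."""
    loop_info = first_loop
    executed = []
    while True:
        body(loop_info)            # execute one loop of the loop process
        executed.append(loop_info)
        loop_info += 1             # increment unit: update the loop information
        if loop_info > given_number:  # comparator unit: stop once exceeded
            break
    return executed


visited = []
executed_loops = run_loops(0, 4, visited.append)
```

With a given number of 4, this model executes the zeroth to the fourth loop and stops when the updated information reaches 5.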
The above objects of the present invention are also achieved by a loop program control device adapted to a multiprocessor system having a master processor and slave processors, the loop program control device comprising: a leading address detection unit which detects a leading address of a loop program when the master processor executes the loop program; a detection unit which detects a total number of loops of a loop process defined by a loop instruction included in the loop program that should be executed; a first notification unit which notifies the processors of the leading address detected by the leading address detection unit; and a second notification unit which notifies each of the processors of information indicating which loop of the loop process should be executed. Hence, it is possible to recognize the number of processors required to execute the parallel processing of the loop program. Each of the processors thus recognized is notified of which one of the loops should be executed, namely, which iteration of the loop process it should execute. Hence, the parallel processing can easily be realized.
The above loop program control device may further comprise: a snooping unit which monitors whether the master and slave processors can execute the loop instruction in parallel; and a loop count unit which counts up or down, each time the second notification unit notifies one of the processors of the information, a count value which is related to a number of loops of the loop process that have been executed. Hence, it is possible to easily identify processors which can be involved with the parallel processing of the loop program.
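The notification flow of the loop program control device can be sketched as follows. The class, the function, the processor names, and the leading address 100h are all hypothetical; the two notifications are folded into a single call only to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    name: str
    notifications: list = field(default_factory=list)

    def notify(self, leading_address, loop_number):
        # Record where the loop program starts and which loop to execute.
        self.notifications.append((leading_address, loop_number))


def dispatch_loops(processors, leading_address, total_loops):
    """Notify one processor per loop and return the loop count value."""
    if total_loops > len(processors):
        raise ValueError("not enough processors for one loop each")
    count = 0  # loop count unit, counting up per notification
    for loop_number, proc in zip(range(total_loops), processors):
        proc.notify(leading_address, loop_number)
        count += 1
    return count


processors = [Processor(f"P{i}") for i in range(5)]
notified = dispatch_loops(processors, leading_address=0x100, total_loops=5)
```

Every processor receives the same leading address, so all of them fetch the same stored loop program and differ only in the loop number they were given.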
The above objects of the present invention are also achieved by a multiprocessor system comprising: processors capable of executing loops of a loop process defined by a loop instruction included in a loop program, each of the processors comprising an addressing unit which generates a data address with which data can be read from a memory during execution of the loop instruction, the above data address including information indicative of which loop of the loop process defined by the loop instruction should be executed; and a loop program control device which controls as many processors as the number of loops of the loop process to be executed, so that those processors execute the respective loops in parallel. As has been described previously, the prior art employs a compiler which separates the loop process of the loop program into respective loops, to which processors are respectively assigned by the scheduling method. Hence, the processors respectively use pre-computed and fixed data addresses. In contrast, the present invention employs the information indicative of which loop of the loop process should be executed by the respective processor. Hence, it is no longer required to separate the loop process of the loop program into the respective loops by a compiler. Hence, it is possible for the processors to access the same loop program stored in a memory.
The above multiprocessor system may be configured so that the loop program control device comprises: a leading address detection unit which detects a leading address of the loop program when one of the processors serving as a master processor executes the loop program; a detection unit which detects a total number of loops of the loop process that should be executed; a first notification unit which notifies the processors including processors serving as slave processors of the leading address detected by the leading address detection unit; and a second notification unit which notifies each of the processors of information indicating which loop of the loop process should be executed. By recognizing the total number of loops of the loop process which should be carried out, it is possible to identify the number of processors which should be involved with the parallel processing of the loop program or instruction. Then, the processors thus determined are supplied with the leading address of the loop program. Further, each of the processors is notified of which loop of the loop process should be handled. Hence, the multiprocessor system can realize parallel processing of the loop program without increasing the memory space.
The above multiprocessor system may be configured so that the loop program control device further comprises: a snooping unit which monitors whether the master and slave processors can execute the loop instruction in parallel; and a loop count unit which counts up or down, each time the second notification unit notifies one of the processors of the information, a count value which is related to a number of loops of the loop process that have been executed.
The multiprocessor system may be configured so that each of the processors comprises an addressing unit which generates a data address with which data can be read from a memory during execution of the loop instruction, the above data address including information indicative of which loop of the loop process defined by the loop instruction should be executed.
The multiprocessor system may be configured so that each of the processors comprises: an addressing unit which generates a data address with which data can be read from a memory during execution of the loop instruction, the above data address including information indicative of which loop of the loop process defined by the loop instruction should be executed; and an increment unit which automatically updates the information after the loop is executed, the updated information forming part of the data address so that a next data address can be generated.
The multiprocessor system may be configured so that: the updated information indicates the number of loops of the loop process that have been executed; the processor further comprises a comparator unit which determines whether the number of loops indicated by the updated information exceeds a given number of loops; and the loop process continues to be executed until the number of loops indicated by the updated information exceeds the given number of loops.
The multiprocessor system may further comprise: a buffer having a memory space accessed by the processors when the processors execute the loop instruction in parallel; and a snooping unit which monitors which loop of the loop process uses data input to the buffer from the memory and monitors which loop of the loop process is being executed for each of the processors, one of the processors which is executing the loop of the loop process which uses the data in the buffer being assigned a right to access the buffer.
The multiprocessor system may further comprise: a buffer having a memory space accessed by the processors when the processors execute the loop instruction in parallel; and a snooping unit which monitors which loop of the loop process is being executed for each of the processors, a right to access the buffer being serially given to the processors in the increasing order of loop numbers of the loops of the loop process which are being executed when the processors commonly use the data stored in the buffer and generate an identical data address.
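The two buffer access policies described above can be sketched as simple selection functions. The function names and the dictionary that maps processor identifiers to the loop numbers they are executing are hypothetical stand-ins for what the snooping unit observes.

```python
def buffer_owner(data_loop, executing):
    """Return the processor executing the loop that uses the buffered data."""
    for proc, loop_number in executing.items():
        if loop_number == data_loop:
            return proc
    return None  # no processor is currently executing that loop


def access_order(executing):
    """Return processors in increasing order of the loop numbers they execute."""
    return [proc for proc, _ in sorted(executing.items(), key=lambda item: item[1])]


executing = {"P2": 2, "P0": 0, "P1": 1}
owner = buffer_owner(1, executing)
order = access_order(executing)
```

The first function corresponds to the case in which the buffered data belongs to one particular loop; the second corresponds to the case in which the processors share the data and generate an identical data address, so that the right to access the buffer is handed out serially.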
The multiprocessor system may be configured so that when a specific one of the processors recognizes the loop instruction while executing the loop program, the processors including the above specific one of the processors execute the loops of the loop process defined by the loop instruction.
The above objects of the present invention are also achieved by a processor which can execute a loop program including a loop instruction, comprising: an addressing unit which generates a data address with which data can be read from a memory during execution of the loop instruction, the above data address including information indicative of which loop of a loop process defined by the loop instruction should be executed, the information forming part of the data address; and a decrement unit which automatically decrements a loop number of the above-mentioned loop after the loop is executed, the decrement unit adding information indicative of the decremented loop number to the data address so that a next data address can automatically be generated.