1. Field of the Invention
The present invention relates to a processor for detecting a memory transfer routine in an instruction sequence and for processing the memory transfer routine in an execution unit for memory transfer.
2. Description of Related Art
A superscalar processor provided with a plurality of execution units capable of parallel operation and an out-of-order execution feature is widely known in the art. The superscalar processor improves processing speed by rearranging instructions according to the dependencies between them and executing independent instructions in parallel.
One of the processes executed by such a processor is a process to move data within a data cache (hereinafter referred to as a memory transfer). A memory transfer is achieved by repeatedly executing load instructions, which read data stored in the data cache into a register, and store instructions, which write the data in the register to the data cache. FIGS. 7A and 7B show examples of instruction sequences, written in a RISC (Reduced Instruction Set Computer) instruction set, that represent a memory transfer.
Lines 1 to 4 in FIG. 7A are load instructions. For example, “lw v0, 0(t1)” in line 1 loads one word (32 bits) of data from the data cache address (t1+0) into the target register v0, where the value stored in the register t1 is the base address and the address offset is 0.
Lines 6 to 9 in FIG. 7A are store instructions. For example, “sw v0, 0(a3)” stores one word of data from the register v0 to the data cache address (a3+0), where the value stored in the register a3 is the base address and the address offset is 0.
Lines 10 and 11 in FIG. 7A are add instructions that increment the register t1, which holds the base address of the data transfer source, and the register a3, which holds the base address of the data transfer destination, so that processing proceeds to the next loop iteration.
Finally, a loop is formed by the subtraction instruction in line 5 and the branch instruction in line 12 of FIG. 7A. The bnez instruction in line 12 is a branch instruction that goes back 12 lines if the value of the register t0 is not equal to 0.
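For illustration only, the loop described above can be modeled in software. The following Python sketch is not a reproduction of FIG. 7A: the register names v1 to v3, the offsets 4, 8 and 12, the decrement of t0 by 1, and the increment of the base addresses by 16 are assumptions chosen to match the description of four one-word loads and stores per iteration. The function name copy_words_unrolled and the dictionary standing in for the data cache are likewise hypothetical.

```python
def copy_words_unrolled(mem, src, dst, iterations):
    """Model the described loop on a byte-addressed memory of 32-bit words.

    mem maps byte addresses (multiples of 4) to word values.
    """
    if iterations < 1:
        # The branch is tested at the end of the body, so the body runs at least once.
        raise ValueError("iterations must be at least 1")
    t1, a3, t0 = src, dst, iterations  # source base, destination base, loop counter
    while True:
        # Lines 1 to 4: load four words, as in "lw v0, 0(t1)".
        v0 = mem[t1 + 0]
        v1 = mem[t1 + 4]   # assumed offset
        v2 = mem[t1 + 8]   # assumed offset
        v3 = mem[t1 + 12]  # assumed offset
        # Line 5: subtraction on the loop counter (decrement by 1 is assumed).
        t0 -= 1
        # Lines 6 to 9: store four words, as in "sw v0, 0(a3)".
        mem[a3 + 0] = v0
        mem[a3 + 4] = v1
        mem[a3 + 8] = v2
        mem[a3 + 12] = v3
        # Lines 10 and 11: advance both base addresses (16 bytes is assumed).
        t1 += 16
        a3 += 16
        # Line 12: bnez t0, repeat while the counter is not zero.
        if t0 == 0:
            break
    return mem


# Copy eight words from address 0x100 to address 0x200 in two iterations.
mem = {0x100 + 4 * i: 0xA0 + i for i in range(8)}
copy_words_unrolled(mem, src=0x100, dst=0x200, iterations=2)
```

In this model every store depends on the load that filled its register, which is the dependency between load and store instructions that the loop carries through each iteration.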
As shown in FIG. 7A, a memory transfer is represented by a combination of a plurality of instructions including load instructions, store instructions, address add instructions, and a branch instruction. An instruction routine representing a memory transfer can be expressed in various ways other than the routine shown in FIG. 7A. For example, FIG. 7B has a different number of load and store instructions from FIG. 7A, but it also represents a memory transfer. The combination of basic instructions used to represent a memory transfer generally depends on the compiler that converts the source code to assembly language.
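The point that routines with different numbers of load and store instructions can represent the same memory transfer may be sketched as follows. This Python model is an assumption-laden illustration, not a rendering of FIG. 7A or FIG. 7B: the function name copy_words, the parameter unroll (the number of load and store instructions per iteration), and the value 2 used for the second routine are all hypothetical.

```python
def copy_words(mem, src, dst, iterations, unroll):
    """Copy iterations * unroll words, using `unroll` loads and stores per pass."""
    if iterations < 1 or unroll < 1:
        raise ValueError("iterations and unroll must be at least 1")
    t1, a3, t0 = src, dst, iterations  # source base, destination base, loop counter
    while True:
        # Load instructions: one word per register.
        regs = [mem[t1 + 4 * i] for i in range(unroll)]
        # Subtraction instruction on the loop counter.
        t0 -= 1
        # Store instructions: write each register to the destination.
        for i, value in enumerate(regs):
            mem[a3 + 4 * i] = value
        # Add instructions: advance both base addresses.
        t1 += 4 * unroll
        a3 += 4 * unroll
        # Branch instruction: repeat while the counter is not zero.
        if t0 == 0:
            break
    return mem


source = {0x100 + 4 * i: 0xB0 + i for i in range(8)}
# Four loads and stores per iteration, two iterations.
four_wide = copy_words(dict(source), 0x100, 0x200, iterations=2, unroll=4)
# Two loads and stores per iteration, four iterations (assumed variant).
two_wide = copy_words(dict(source), 0x100, 0x200, iterations=4, unroll=2)
same_result = four_wide == two_wide
```

Both calls leave memory in the same final state although their instruction sequences differ, which is why a fixed instruction pattern is not sufficient to characterize a memory transfer.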
As described in the foregoing, a memory transfer process is not suited to speed-up by out-of-order execution because the load and store instructions must be executed sequentially. It has therefore been suggested to perform the memory transfer process in an independent execution unit (see, for example, Japanese Unexamined Patent Publication No. 2001-184259 and Japanese Unexamined Patent Publication No. 52-155936).
Note that the conventional technique disclosed in Japanese Unexamined Patent Publication No. 2001-184259 and the like relates to a CISC (Complex Instruction Set Computer) processor. In a CISC processor, a memory transfer is defined as a single complex instruction such as a move instruction; when that complex instruction is decoded at the instruction decode stage, it is issued to an execution unit dedicated to memory transfer.
In a RISC processor, on the other hand, a memory transfer process is represented by a combination of a plurality of instructions, as shown in FIGS. 7A and 7B. Accordingly, unlike the CISC processor disclosed in Japanese Unexamined Patent Publication No. 2001-184259, for example, a RISC processor cannot identify a single memory transfer instruction at the instruction decode stage.
As described so far, we have now discovered that, in a RISC processor, it is difficult to recognize a memory transfer process and to assign it to an execution unit for memory transfer, because the memory transfer process is composed of a combination of a plurality of instructions.