The present invention relates to the structure and operation of rapid-access memory for an arithmetic logic unit (ALU) of a general-purpose or special-purpose computer. In particular, the invention relates to the control of a register file that provides temporary storage of operands for access by instructions being executed by the ALU within a particular task or context. The invention is particularly, though not exclusively, suited for use in a special-purpose digital signal processor having a Reduced Instruction Set Computer (RISC) architecture.
Each time the ALU executes an instruction, it must generally access an operand stored in memory. In addition, the results of many computations are stored, either temporarily or permanently, in memory. Virtually every arithmetic computation performed by the arithmetic logic unit of the computer requires accessing memory. Therefore, the speed at which such memory is accessed is important to the overall speed of operation of the computer.
Register files have been used to permit rapid access to required operands and to provide temporary storage of data during computations. These register files comprise fast-access memory in which data from a portion of the computer's main memory may be stored while a particular task or subroutine is carried out.
Access to a register file is faster than access to main memory partly because the register file has fewer storage locations than the main memory unit. Consequently, the addressing mechanism reads and decodes a much shorter address than would be required to address the main memory unit.
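The relationship between the number of storage locations and the address length can be illustrated with a short calculation. The following is a minimal sketch only; the function name `address_bits` and the sizes used (a 32-register file and a 64K-word main memory) are assumptions chosen for illustration and do not appear in the description above.

```python
def address_bits(locations: int) -> int:
    """Return the minimum number of address bits that can select one of `locations` words."""
    if locations < 1:
        raise ValueError("a memory needs at least one location")
    # (locations - 1) is the largest address; its bit length is the address width.
    return (locations - 1).bit_length()


register_file_bits = address_bits(32)        # hypothetical 32-register file
main_memory_bits = address_bits(64 * 1024)   # hypothetical 64K-word main memory
print(register_file_bits, main_memory_bits)  # prints: 5 16
```

Under these assumed sizes, the register file is addressed with 5 bits while the main memory requires 16, so the decoder for the register file handles a far shorter address.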
Upon beginning a particular task or subroutine, data associated with that subroutine is loaded from the main memory into the register file. Then, when the computer is finished with the subroutine, the data, including data that was changed or added during execution of the task or subroutine, is transferred from the register file back to the main memory. The register file may then be filled with data associated with the next subroutine required by or being executed by the arithmetic logic unit.
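The load, execute, and write-back sequence described above may be sketched as follows. This is an illustrative model under stated assumptions, not a description of any particular hardware: main memory is modeled as a Python list, and the names `run_subroutine` and `increment_all` are hypothetical.

```python
def run_subroutine(memory, base, size, body):
    """Load `size` words at `base` into a register file, run `body`, write back.

    Returns the number of words moved between main memory and the register file.
    """
    regs = memory[base:base + size]      # load the subroutine's data into registers
    body(regs)                           # the subroutine operates only on registers
    memory[base:base + size] = regs      # write changed data back to main memory
    return 2 * size                      # one transfer in, one transfer out per word


def increment_all(regs):
    """Hypothetical subroutine body: add one to every register."""
    for i in range(len(regs)):
        regs[i] += 1


memory = [1, 2, 3, 4, 5, 6, 7, 8]        # hypothetical 8-word main memory
moved = run_subroutine(memory, base=4, size=4, body=increment_all)
print(memory, moved)                     # prints: [1, 2, 3, 4, 6, 7, 8, 9] 8
```

In this model, every change of subroutine costs two transfers per register, one into the register file and one back to main memory.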
This need to transfer data back and forth between the register file and the main memory each time a different subroutine is referenced slows the process of switching between tasks or subroutines. Therefore, some have suggested using a register file divided into two sections so that data associated with two subroutines may be simultaneously stored within the register file. As generally suggested, each section of the divided register file has a fixed size.
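The effect of a register file divided into two fixed-size sections may be sketched as follows. The class name `DividedRegisterFile`, the method `switch_to`, and in particular the replacement policy (reuse an empty section if one exists, otherwise replace the inactive section) are assumptions made for illustration; the description above does not specify how a section is chosen.

```python
class DividedRegisterFile:
    """Register file split into two fixed-size sections, one subroutine per section."""

    def __init__(self, section_size):
        self.section_size = section_size
        self.sections = [[0] * section_size, [0] * section_size]
        self.owner = [None, None]   # main-memory base address held by each section
        self.active = 0             # index of the section currently in use

    def switch_to(self, memory, base):
        """Make the data at `base` the active context; return words transferred."""
        if base in self.owner:                  # already resident: no memory traffic
            self.active = self.owner.index(base)
            return 0
        # Assumed policy: reuse an empty section if any, else the inactive one.
        victim = self.owner.index(None) if None in self.owner else 1 - self.active
        n = self.section_size
        moved = 0
        old = self.owner[victim]
        if old is not None:                     # write the displaced data back
            memory[old:old + n] = self.sections[victim]
            moved += n
        self.sections[victim] = memory[base:base + n]
        moved += n
        self.owner[victim] = base
        self.active = victim
        return moved


memory = list(range(12))                        # hypothetical 12-word main memory
rf = DividedRegisterFile(section_size=4)
costs = [rf.switch_to(memory, base) for base in (0, 4, 0, 4, 8)]
print(costs)                                    # prints: [4, 4, 0, 0, 8]
```

Switching between the two resident subroutines costs no transfers in this model, but a third subroutine still forces a write-back and a reload.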
In a computer of conventional architecture, the transfer of data between the main memory and the register file may be controlled by a memory management unit or other memory control device.
Reduced Instruction Set Computer (RISC) architecture has become prominent as a mechanism to streamline the execution of instructions by a computer processor. In such an environment, the speed of access to memory may be more critical than in computers having a conventional architecture.
A RISC architecture device uses special load and store instructions to move data between the register file and the main memory. In addition, a special register-to-register transfer instruction is used to move operands between registers of the register file.
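The three kinds of data movement just described may be sketched with a small interpreter. The mnemonics `LOAD`, `STORE`, and `MOVE`, the tuple encoding, and the function name `execute` are hypothetical and are used only to illustrate the distinction between memory transfers and register-to-register transfers.

```python
def execute(program, regs, memory):
    """Run a list of (mnemonic, a, b) tuples against `regs` and `memory`."""
    for op, a, b in program:
        if op == "LOAD":        # regs[a] <- memory[b]
            regs[a] = memory[b]
        elif op == "STORE":     # memory[b] <- regs[a]
            memory[b] = regs[a]
        elif op == "MOVE":      # regs[a] <- regs[b], with no memory access
            regs[a] = regs[b]
        else:
            raise ValueError(f"unknown instruction {op!r}")


regs = [0] * 4                  # hypothetical 4-register file
memory = [10, 20, 30, 40]       # hypothetical 4-word main memory
execute([("LOAD", 0, 2), ("MOVE", 1, 0), ("STORE", 1, 3)], regs, memory)
print(regs, memory)             # prints: [30, 30, 0, 0] [10, 20, 30, 30]
```

Only the load and store instructions touch main memory in this sketch; the move instruction operates entirely within the register file.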
A RISC controller or processor may generally execute instructions in accordance with the execution pipe illustrated in FIG. 1a. The four stages of the RISC execution pipe are instruction fetch, operand fetch from registers, execution in the arithmetic logic unit (ALU), and data memory access (read or write).
Instruction fetch occurs in the first stage of the execution pipe, and the required operands are fetched from registers during the second stage. Adder, shifter, and other operations are executed during the third stage; other operations may be executed at the third stage and beyond, depending on the instruction being executed. Data memory access normally occurs during the fourth stage and continues beyond it if necessary to complete the access. The RISC instruction execution pipe may alternatively contain five stages, allowing more time for ALU operations and memory access.
An alternative instruction pipe for multiply operations is illustrated in FIG. 1b. The multiply and add operation may extend from stage 3 into the first part of stage 4, with the accumulate function occurring during the latter part of stage 4.
Generally, an instruction is launched each clock cycle and progresses through the RISC execution pipe at the rate of one stage per instruction cycle. If an operand is not available at the required time, a hardware interlock may hold up execution of the instruction requiring that operand. The hardware interlock may also hold up instruction execution if a preceding instruction is not sufficiently complete.
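The timing behavior described above may be sketched with a simplified model of a four-stage pipe. The model is an assumption made for illustration: each stage holds one instruction at a time, a held instruction waits in the preceding stage, and the interlock is represented only by the earliest cycle in which an instruction's operands can be read. The names `schedule` and `operand_ready` are hypothetical.

```python
STAGES = ("instruction fetch", "operand fetch", "ALU execute", "data memory access")
OPERAND_FETCH = 1  # index of the stage gated by the interlock in this model


def schedule(operand_ready):
    """Return, for each instruction, the cycle in which it enters each stage.

    operand_ready[i] is the earliest cycle in which instruction i may fetch
    its operands (0 means the operands are always available).
    """
    timeline = []
    last = len(STAGES) - 1
    for i, ready in enumerate(operand_ready):
        prev = timeline[-1] if timeline else None
        cycles = []
        for s in range(len(STAGES)):
            # Launch in cycle i, then advance one stage per cycle at best.
            earliest = i if s == 0 else cycles[s - 1] + 1
            if s == OPERAND_FETCH:
                earliest = max(earliest, ready)      # interlock: wait for operand
            if prev is not None:
                # A stage is free once the preceding instruction has moved on.
                freed = prev[s + 1] if s < last else prev[s] + 1
                earliest = max(earliest, freed)
            cycles.append(earliest)
        timeline.append(cycles)
    return timeline


print(schedule([0, 0, 0]))     # prints: [[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
print(schedule([0, 0, 5, 0]))  # prints: [[0, 1, 2, 3], [1, 2, 3, 4], [2, 5, 6, 7], [5, 6, 7, 8]]
```

In the second call, the third instruction cannot fetch its operands until cycle 5, so it is held for two cycles and the instruction behind it is held as well.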