As consumer demand for higher-performance computers increases, the speed of processors must also increase. A processor manipulates and controls the flow of data through a computer, and as the processor speed increases, the computer generally becomes more powerful. One way processor designers increase the processor's speed is through a technique called data forwarding. Data forwarding increases processing speed by providing data to an execution unit of the processor without waiting for the data to first be stored in, and then retrieved from, a memory location, as described in more detail below.
Software applications include programming instructions that are executed by a processor. Many of these instructions, particularly mathematical instructions, include one or more source addresses, an operator, and a destination address. The source addresses are the memory locations where the source data (or operands) are stored. The processor retrieves the source data from memory and provides the data to an execution unit. The execution unit manipulates the source data according to the operator, and the result is stored in memory at the destination address. The transfer of data is coordinated by a control unit within the processor.
The source data is stored in the source address by a previously executed instruction. For example, the destination address of a previously executed mathematical instruction may be the source address of a subsequent instruction. Alternatively, the target address of a previously executed load instruction, instructing the processor to transfer data from a first memory location to the target address, may be the source address of a subsequent instruction. Destination, target, and other return addresses and data are also referred to as result addresses and data.
For example, for the instruction "LOAD [R(y)] → R(a)," R(a) is a memory register, and [R(y)] corresponds to another memory location which may or may not be a register. R(a) is the result address of the load instruction, and [R(y)] is an address containing the result data to be stored in the result address. The processor that executes the load instruction transfers the data from address [R(y)], passes this data to the register set containing register R(a), and stores the data in register R(a).
Suppose that the add instruction "ADD R(a)+R(c) → R(d)" follows the above load instruction in the program code. The operator of this instruction is addition "+", the source addresses are R(a) and R(c), and the destination address is R(d). Addresses R(a), R(c), and R(d) are memory locations in a register set. The processor that executes the add instruction locates the source data in registers R(a) and R(c) and transfers the data to an execution unit within the processor. The execution unit, which may include an arithmetic logic unit (ALU) or floating point unit (FPU), adds the source data values, and the result is stored in register R(d).
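The load and add sequence just described can be sketched in a few lines of Python. This is only an illustrative model, not a description of any particular processor: the dictionaries `registers` and `memory`, the helper names `load` and `add`, and all of the data values are assumptions introduced for the example.

```python
# Hypothetical register set and memory; all contents are illustrative values.
registers = {"R(a)": 0, "R(c)": 5, "R(d)": 0, "R(y)": 100}
memory = {100: 7}  # the location addressed by R(y) holds the data 7


def load(addr_reg, result_reg):
    """LOAD [addr_reg] -> result_reg: copy data from memory into the result register."""
    registers[result_reg] = memory[registers[addr_reg]]


def add(src1, src2, dest):
    """ADD src1 + src2 -> dest: read both sources from the register set, add, store."""
    operand1 = registers[src1]  # source data fetched from the register set
    operand2 = registers[src2]
    registers[dest] = operand1 + operand2  # result stored at the destination address


load("R(y)", "R(a)")         # LOAD [R(y)] -> R(a)
add("R(a)", "R(c)", "R(d)")  # ADD R(a)+R(c) -> R(d)
```

In this model the add instruction reads R(a) from the register set only after the load instruction has written it there, which is the round trip examined next.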
Note the redundancy between the add and load instructions above with respect to the data stored in register R(a). The load instruction takes time to store the result data in result register R(a). The add instruction takes time to transfer this same data from the same register, R(a), and to pass this source data to the execution unit. In a processor that supports data forwarding, the time delay associated with this redundancy is eliminated by providing the result data of the load instruction directly to the execution unit before storing the result data in the result address. Consequently, processing speed is improved. Note that, as used herein, "data R(n)" refers to the data stored at address R(n), "register R(n)" refers to the register having address R(n), and "address R(n)", "memory location R(n)", and "R(n)" refer to the address of R(n).
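As a rough illustration of the redundancy and its removal, the following Python sketch records which register-set accesses lie on the path between the load result and the execution unit, with and without forwarding. The function name `load_then_add`, the `forwarding` flag, and the access bookkeeping are assumptions made for this example; real hardware timing is not modeled.

```python
def load_then_add(loaded_value, registers, forwarding):
    """Model LOAD [R(y)] -> R(a) followed by ADD R(a)+R(c) -> R(d).

    Returns the register-set accesses that the R(a) operand must wait for
    before it reaches the execution unit (hypothetical accounting).
    """
    critical_path = []
    if forwarding:
        # Result data is handed directly to the execution unit...
        operand_a = loaded_value
        # ...and R(a) is still updated, but the add does not wait for it.
        registers["R(a)"] = loaded_value
    else:
        registers["R(a)"] = loaded_value   # store the result data first
        critical_path.append("write R(a)")
        operand_a = registers["R(a)"]      # then retrieve the same data
        critical_path.append("read R(a)")
    operand_c = registers["R(c)"]
    registers["R(d)"] = operand_a + operand_c
    return critical_path


regs_plain = {"R(a)": 0, "R(c)": 5, "R(d)": 0}
regs_fwd = dict(regs_plain)
path_plain = load_then_add(7, regs_plain, forwarding=False)
path_fwd = load_then_add(7, regs_fwd, forwarding=True)
```

Both runs leave the register set in the same final state; only the path taken by the R(a) operand differs.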
Unfortunately, supporting data forwarding requires the use of a number of large comparators to compare the result register addresses of previously issued instructions to the source register addresses of subsequent instructions. If a match is found, the result data is forwarded directly to the execution unit as source data; otherwise, the source data is transferred from the appropriate register bank. Because comparators take up a significant amount of space, the large number of comparators in more complex processors can increase the size and cost of the processor.
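The address comparison can be sketched as follows. This is a minimal model under stated assumptions: the names `select_operand` and `comparators_needed` are hypothetical, in-flight results are represented as a simple list ordered newest first, and the comparator count assumes one comparator for every pairing of an in-flight result address with a source operand, which may not match any given design.

```python
def select_operand(source_addr, in_flight_results, register_bank):
    """Choose where a source operand comes from.

    in_flight_results: list of (result_addr, result_data) pairs for
    previously issued instructions, newest first (assumed ordering).
    """
    for result_addr, result_data in in_flight_results:
        if result_addr == source_addr:  # the job of one address comparator
            return result_data, "forwarded"
    # No match: fall back to the register bank.
    return register_bank[source_addr], "register bank"


def comparators_needed(num_in_flight_results, num_source_operands):
    """Assumed cost model: one comparator per (result, source) pair."""
    return num_in_flight_results * num_source_operands


register_bank = {"R(a)": 0, "R(c)": 5}
in_flight = [("R(a)", 7)]  # the load result has not yet reached R(a)

operand_a = select_operand("R(a)", in_flight, register_bank)
operand_c = select_operand("R(c)", in_flight, register_bank)
```

Under this assumed cost model, the comparator count grows with the product of the number of in-flight results and the number of source operands checked, which suggests why the comparators become a significant area cost in more complex processors.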