Processor designs continue to increase the speed at which applications are executed. Moreover, certain processors, such as RISC (Reduced Instruction Set Computer)-based processors, include a reduced set of instructions that enables these processors to execute faster than processors having a more complex instruction set and a greater number of instructions, such as CISC (Complex Instruction Set Computer)-based processors.
Typically, processors having a reduced instruction set are therefore limited in both the number and the size of their instructions. In particular, the size of the instructions is such that only a limited number of operands can be included within an instruction. For example, such instructions may be limited to a two-operand format, wherein the two operands within the instruction are the two source operands and wherein the register holding one of these source operands also serves as the destination register for the result of the execution. However, in certain instances, it may be advantageous to have a destination register that is different from one of the source registers. In particular, if the value of the source operand stored in the source register that is to be employed as the destination register will be used by future instructions, this value should be preserved rather than overwritten. If the value is overwritten by the result, it will need to be retrieved again and stored in one of the registers. This can lead to pipeline stalls and other delays in the execution of the instructions. Therefore, a number of approaches have been developed to allow for a destination register that is different from one of the source registers.
One technique is to include multiple instructions to achieve a result wherein the destination register is different from one of the source registers. Specifically, a first instruction performs the given operation, such as an addition operation, and is followed by a second instruction comprising a move operation, by which the result is moved from one of the source registers to a different register. Accordingly, the move operation is executed to copy the contents of the source register to a different register. Therefore, either the original source register or the register to which the value was copied can be employed as the destination of a subsequent instruction. However, as described, this approach requires the execution of an additional instruction, thereby slowing down execution of the application that includes such instructions.
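The cost of this technique can be illustrated with a minimal sketch of a two-operand register machine. The `run` helper, the register names and the tuple encoding of instructions are hypothetical and are chosen only for illustration. The sketch places the copy ahead of the operation so that the original source value survives, which is one possible arrangement of the two instructions; either ordering costs one additional instruction.

```python
def run(program, regs):
    """Execute two-operand instructions (op, dst, src) on a copy of regs."""
    regs = dict(regs)
    for op, dst, src in program:
        if op == "add":
            # Two-operand add: dst is both the first source and the destination.
            regs[dst] = regs[dst] + regs[src]
        elif op == "mov":
            # Move: copy the contents of src into dst.
            regs[dst] = regs[src]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


initial = {"r1": 5, "r2": 7, "r3": 0}

# A single two-operand add overwrites the source register r1.
clobbering = [("add", "r1", "r2")]

# Copying r1 into r3 first keeps r1 intact, at the cost of one extra instruction.
preserving = [("mov", "r3", "r1"), ("add", "r3", "r2")]

after_clobber = run(clobbering, initial)
after_preserve = run(preserving, initial)

print(after_clobber)
print(after_preserve)
print(len(preserving) - len(clobbering), "additional instruction executed")
```

In this sketch the second sequence leaves the sum in `r3` while `r1` keeps its original value, but twice as many instructions are executed to obtain it.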
Another typical approach to allow for a destination register that is different from one of the source registers in an instruction is to define a larger instruction format. In particular, the larger instruction format (e.g., a 32-bit format) increases the size of the instruction to allow three operands to be specified. However, such an approach reduces code density, because the same program occupies more memory, and increases the instruction fetch bandwidth required for such instructions.
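The effect on code size and fetch traffic can be shown with simple arithmetic. The sketch below assumes a 16-bit baseline format for the two-operand instructions and a program of 1000 instructions; both figures are hypothetical and serve only to make the comparison concrete, while the 32-bit width is the example given above.

```python
def code_size_bytes(num_instructions, bits_per_instruction):
    """Bytes of instruction memory occupied, and fetched, by a straight-line program."""
    return num_instructions * bits_per_instruction // 8


NUM_INSTRUCTIONS = 1000   # hypothetical program length
NARROW_BITS = 16          # assumed width of the two-operand format
WIDE_BITS = 32            # width of the larger three-operand format

narrow_bytes = code_size_bytes(NUM_INSTRUCTIONS, NARROW_BITS)
wide_bytes = code_size_bytes(NUM_INSTRUCTIONS, WIDE_BITS)

print(narrow_bytes, "bytes in the narrow format")
print(wide_bytes, "bytes in the wide format")
print("growth factor:", wide_bytes / narrow_bytes)
```

Under these assumptions the wider format doubles both the memory occupied by the program and the number of bytes that must be fetched to execute it.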
Another conventional approach to allow for a destination register that is different from one of the source registers in an instruction is to include a prefix instruction that designates a different destination register for the subsequent instruction. In particular, the prefix instruction is not itself executed, but rather serves to inform the decoder and the execution unit (within the processor) that the result is to be stored in a destination register that is different from one of the source registers. Disadvantageously, this approach also slows down execution of the instruction and therefore the execution of the application that includes such an instruction. In particular, the prefix instruction is processed within the critical timing path of the processor, as the prefix instruction must be processed prior to processing the actual instruction.
To help illustrate, such an approach can include a number of instruction registers coupled to a multiplexer. Similarly, the register file (which comprises a number of data registers) is coupled to the multiplexer. In operation, when the decode logic determines that an instruction serves as a prefix to the subsequent instruction, the decode logic controls the multiplexer such that the register addresses of the instruction from the second instruction register are transmitted to the register file, while the operation code of the instruction from the second instruction register is transmitted to the decode logic. Conversely, when the decode logic determines that the instruction is not a prefix to the subsequent instruction, the decode logic controls the multiplexer such that the register addresses of the instruction from the first instruction register are transmitted to the register file, while the operation code of the instruction from the first instruction register is transmitted to the decode logic. Accordingly, a number of gate delays are introduced into the critical timing path of the processor in order to accommodate a prefix instruction, thereby slowing down the execution speed of the processor when processing such instructions.
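The selection described above can be modeled in software as a rough behavioral sketch. The `Instruction` type, the `"prefix"` opcode name and the `decode_stage` function are hypothetical, and the sketch assumes that the prefix check is made on the instruction held in the first instruction register. It does not model how the prefix conveys the alternate destination register, which is not detailed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    opcode: str
    reg_addrs: tuple


PREFIX_OPCODE = "prefix"  # hypothetical opcode marking a prefix instruction


def is_prefix(instr):
    """Decode-logic check: does this instruction act as a prefix?"""
    return instr.opcode == PREFIX_OPCODE


def mux(select_second, first_input, second_input):
    """Two-input multiplexer controlled by the decode logic."""
    return second_input if select_second else first_input


def decode_stage(first_ir, second_ir):
    # The select signal depends on decoding the first instruction register,
    # so the multiplexer cannot settle until that decode has completed.
    select_second = is_prefix(first_ir)
    addrs_to_register_file = mux(select_second, first_ir.reg_addrs, second_ir.reg_addrs)
    opcode_to_decode_logic = mux(select_second, first_ir.opcode, second_ir.opcode)
    return addrs_to_register_file, opcode_to_decode_logic


prefix_instr = Instruction(PREFIX_OPCODE, ("r3",))
add_instr = Instruction("add", ("r1", "r2"))
sub_instr = Instruction("sub", ("r4", "r5"))

with_prefix = decode_stage(prefix_instr, add_instr)
without_prefix = decode_stage(add_instr, sub_instr)

print(with_prefix)
print(without_prefix)
```

The dependency of `select_second` on the prefix check mirrors the added gate delays: in hardware, the register addresses and the operation code pass through the multiplexer only after the decode logic has resolved whether a prefix is present.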