Microprocessors, including general-purpose microprocessors and DSPs, do useful work by being provided with instructions, which the microprocessors execute. The set of instructions that a particular microprocessor is designed to respond to is called its instruction set.
Microprocessor instructions instruct the microprocessor to, e.g., load values in registers; store values from registers to locations in main memory; add, subtract, multiply or divide values stored in registers or other storage locations, or perform Boolean operations on such values, such as OR or compare; shift register contents; and perform a variety of control operations, including testing and setting bits, outputting data to a port, pushing data to a stack and popping stack contents. Microprocessor instructions are typically expressed in mnemonic form. For example, one typical microprocessor instruction is ADD, which is the mnemonic for an addition operation, for example, to add the contents of one register to the contents of another register and place the result in an accumulator or other register.
A common microprocessor instruction is the NOP instruction, pronounced "no op," which instructs the microprocessor to perform no operation in the clock cycle in which it processes the NOP instruction. That is, it is a kind of "placeholder." Programmers use this instruction where, before a particular instruction can be executed, some other event must occur that does not involve executing an instruction in the microprocessor. For example, in a pipelined microprocessor a load operation from memory to a register may precede an instruction, such as an ADD instruction, that operates on the contents of that register, and the load operation may take two or more clock cycles to complete. If the ADD instruction were to immediately follow the load operation in a program, the value to be operated on would not yet be loaded in the register when the ADD instruction executed, and the ADD instruction would fail or would produce an invalid result.
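The hazard just described can be illustrated with a minimal sketch of a pipeline that has no interlocks. The model below is an assumption made purely for illustration and does not describe any particular microprocessor: the function name run_exposed, the register names, the tuple encoding of instructions and the figure of two load delay slots are all hypothetical.

```python
def run_exposed(program, memory, regs, load_delay_slots=2):
    """Run a toy program on a pipeline with no dependency interlocks.

    One instruction issues per cycle. A value fetched by LOAD becomes
    visible in its destination register only after `load_delay_slots`
    further instructions have issued (a hypothetical latency).
    """
    regs = dict(regs)
    pending = []  # (cycle at which the value becomes visible, register, value)
    for cycle, instr in enumerate(program):
        # Commit every load whose latency has elapsed by this cycle.
        still_pending = []
        for ready_cycle, reg, value in pending:
            if ready_cycle <= cycle:
                regs[reg] = value
            else:
                still_pending.append((ready_cycle, reg, value))
        pending = still_pending

        op = instr[0]
        if op == "LOAD":
            _, dst, addr = instr
            pending.append((cycle + load_delay_slots + 1, dst, memory[addr]))
        elif op == "ADD":
            _, dst, a, b = instr
            regs[dst] = regs[a] + regs[b]  # reads whatever is in the registers now
        # "NOP" changes no state.

    # Let any loads still in flight complete after the last instruction.
    for _, reg, value in pending:
        regs[reg] = value
    return regs


memory = {0: 5}
initial = {"r0": 0, "r1": 10, "r2": 0}

# ADD immediately after LOAD: r0 still holds its old value 0, so r2 is 10.
hazard = run_exposed([("LOAD", "r0", 0), ("ADD", "r2", "r0", "r1")],
                     memory, initial)

# Two NOPs between LOAD and ADD: r0 holds 5 by the time ADD issues, so r2 is 15.
padded = run_exposed([("LOAD", "r0", 0), ("NOP",), ("NOP",),
                      ("ADD", "r2", "r0", "r1")], memory, initial)
```

In this model the unpadded program silently produces a stale sum, which corresponds to the "invalid result" case described above.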
Programmers have availed themselves of several expedients to solve this problem. One is to insert a sufficient number of NOP instructions in the program between the load instruction and the ADD instruction, to allow the requisite number of clock cycles to elapse before the ADD instruction is allowed to execute. However, this undesirably increases the size of the program, since every such delay adds instructions that do no useful work.
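The cost in program size can be seen in a short sketch of NOP padding as an assembler or programmer might apply it. The helper pad_with_nops, the tuple encoding of instructions and the two-slot load delay are hypothetical choices made for illustration only.

```python
def pad_with_nops(program, load_delay_slots=2):
    """Insert NOPs so that no instruction reads a register before a
    preceding LOAD into that register has completed.

    Instructions are tuples: ("LOAD", dst, addr) or (op, dst, src, ...).
    """
    padded = []
    visible_at = {}  # register -> index of first instruction that sees the load
    for instr in program:
        op = instr[0]
        if op not in ("LOAD", "NOP"):
            sources = instr[2:]
            needed = max((visible_at.get(s, 0) for s in sources), default=0)
            # Fill the gap with NOPs until the loaded values are visible.
            while len(padded) < needed:
                padded.append(("NOP",))
        padded.append(instr)
        if op == "LOAD":
            visible_at[instr[1]] = len(padded) + load_delay_slots
    return padded


original = [
    ("LOAD", "r0", 0),
    ("ADD", "r2", "r0", "r1"),
    ("LOAD", "r3", 1),
    ("ADD", "r4", "r3", "r2"),
]
expanded = pad_with_nops(original)
nop_count = sum(1 for instr in expanded if instr[0] == "NOP")
```

Under these assumptions the four-instruction program grows to eight instructions, half of which are NOPs.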
Another expedient is to execute a NOP with a loop instruction immediately after the load instruction. However, the loop instruction itself may take more cycles to execute than needed for the delay, resulting in computing inefficiency.
Still another expedient is to design into the registers themselves a function called "dependency interlock." This additional circuitry, associated with one or more registers in the microprocessor, detects the requirement that a register be loaded with a value before the instruction operating on that value can operate, and "locks" the instruction against execution until it is detected that the value has been so loaded. See, e.g., U.S. Pat. No. 5,430,851, which issued to Hiroaki Hirata et al., and was assigned to Matsushita Electric Industrial Co., Ltd. However, this solution is expensive in terms of silicon area. It also adds to the complexity of the microprocessor circuit design and testing. For example, a modern pipelined, superscalar DSP may contain eight or more functional units, and eighteen or more source registers. The complexity involved in interlocking all of those elements is considerable. Further, adding such circuitry may cause speed path problems. Still further, since it is sometimes desirable to use the value already in the register rather than the value being loaded, the interlock must distinguish those cases, which adds to the complexity of the instruction execution and interlock function. And, finally, the addition of dependency interlocks results in a less "exposed" pipeline processor. That is, the internal components of the processor, such as the registers, are less available to the compiler, which is undesirable.
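The behavior of a dependency interlock can be approximated in software by a simple scoreboard that records, for each register, the first cycle at which a pending load is valid. The sketch below is a hypothetical model for illustration only; it is not the circuit of the cited patent, and the function name, instruction encoding and two-slot load delay are assumptions.

```python
def run_with_interlock(program, memory, regs, load_delay_slots=2):
    """Toy model of a pipeline whose registers carry dependency interlocks.

    An ADD that needs a register with a load still in flight is held
    ("locked") until the load completes. Returns the final registers,
    the total cycles used and the number of stall cycles inserted.
    """
    regs = dict(regs)
    valid_at = {}  # register -> first cycle at which its loaded value is valid
    cycle = 0
    stalls = 0
    for instr in program:
        op = instr[0]
        if op == "LOAD":
            _, dst, addr = instr
            # The model writes the value at once; this is safe here only
            # because the interlock below blocks every early read.
            regs[dst] = memory[addr]
            valid_at[dst] = cycle + load_delay_slots + 1
        elif op == "ADD":
            _, dst, a, b = instr
            ready = max(valid_at.get(a, 0), valid_at.get(b, 0))
            if ready > cycle:
                stalls += ready - cycle  # hardware holds the instruction
                cycle = ready
            regs[dst] = regs[a] + regs[b]
        cycle += 1
    return regs, cycle, stalls


result, cycles, stall_cycles = run_with_interlock(
    [("LOAD", "r0", 0), ("ADD", "r2", "r0", "r1")],
    memory={0: 5},
    regs={"r0": 0, "r1": 10, "r2": 0},
)
```

Here the two-instruction program yields the correct sum without any NOPs, but takes four cycles, two of them stalls. The program stays small, and the cost moves into per-register tracking hardware, which is the silicon area and design complexity noted above. The model also never lets an instruction read the old register value, which reflects the reduced exposure of the pipeline.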
The present invention introduces a novel microprocessor instruction that solves these problems.