The present invention relates to computers and, more particularly, to microprocessors. A major objective of the present invention is to provide for more efficient program execution.
Much of modern progress is associated with advances in computer performance. Recent computers typically use one or more microprocessors to execute programmed operations. Each microprocessor design is characterized by the set of instructions it can recognize and execute. The instruction sets of early microprocessors included a relatively small number of simple instructions. Accordingly, many instructions could be required to implement operations such as addition and multiplication. Succeeding generations of microprocessors accommodated more instructions and more complex instructions, thus reducing program length as well as programming time.
To provide for synchronous operation, instructions progress according to fixed-period instruction cycles. Simple instructions can be performed in a single instruction cycle, while more complex instructions may require multiple instruction cycles. Most instructions can be completed before the end of a cycle; the remainder of the cycle is, in a sense, wasted.
This wasted cycle time can be minimized at the microprocessor design stage by selecting a short instruction cycle. However, a shorter instruction cycle increases the number of instructions that must be performed in multiple cycles. There is overhead involved in managing multi-cycle instructions. This overhead, in addition to that associated generally with larger instruction sets, results in increased microprocessor complexity and size. The weight of industry opinion is that these increases in size and complexity more than offset the advantages of adding more multi-cycle instructions to the instruction sets of microprocessors.
Increasingly, processors are designed as "reduced instruction-set computers" (RISC). In the RISC approach, a relatively small set of instructions, preferably single-cycle instructions, is used. This approach takes better advantage of integrated circuit real estate and generally improves processor throughput. Disadvantageously, the number of instructions required to implement an operation is increased. However, compilers have been developed that can generate suitable instructions from a high-level programming language. This relieves the programmer of the burden of generating the long program code required by the small instruction set.
Preferably, all or most instructions are executed within a single instruction cycle. This minimizes the circuitry required to manage instructions of varying length. A disadvantage is that the instruction cycle must be matched to the longest single-cycle instruction. Instructions that could be executed in less time still consume an entire cycle. Overall processor throughput is thus closely tied to the time required to perform the longest single-cycle instruction.
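The relationship between the longest single-cycle instruction and the time wasted by shorter instructions can be illustrated with a minimal sketch. The instruction names and latencies below are hypothetical values chosen only for illustration; they do not describe any particular microprocessor.

```python
# Hypothetical per-instruction latencies in nanoseconds (illustrative assumptions,
# not figures taken from any real design).
LATENCY_NS = {"and": 3, "shift": 2, "add": 6, "load": 8, "multiply": 10}


def cycle_time(latencies):
    """The fixed cycle must be at least as long as the slowest single-cycle instruction."""
    return max(latencies.values())


def wasted_time(latencies):
    """Idle time left at the end of the cycle for each instruction."""
    period = cycle_time(latencies)
    return {name: period - t for name, t in latencies.items()}


period = cycle_time(LATENCY_NS)
waste = wasted_time(LATENCY_NS)

for name, idle in waste.items():
    print(f"{name}: idle {idle} ns of a {period} ns cycle")
```

Under these assumed latencies, every instruction other than the slowest one leaves part of its cycle unused, which is the waste the preceding paragraph describes.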
In some cases, a microprocessor architect can choose between: 1) executing an operation using a single instruction to save cycles; and 2) executing an operation using multiple instructions so that a shorter instruction cycle can be used. Shift and zero detection are two relatively short operations that can be optionally combined with various other operations, e.g., arithmetic and logic operations. Shift is used, for example, in conjunction with addition to facilitate multiplications; zero detection is used as a branch condition, for example, to avoid a subsequent division by zero.
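As an illustration of how shift is used in conjunction with addition to facilitate multiplication, the following is a minimal sketch of the conventional shift-and-add method for unsigned operands. The function name `shift_add_multiply` and the default 8-bit operand width are assumptions made for this example only.

```python
def shift_add_multiply(a: int, b: int, width: int = 8) -> int:
    """Multiply two unsigned `width`-bit integers using only shifts and additions.

    For each set bit i of the multiplier b, the multiplicand a shifted left
    by i positions is added to the running product.
    """
    mask = (1 << (2 * width)) - 1  # the product of two width-bit values fits in 2*width bits
    product = 0
    for i in range(width):
        if (b >> i) & 1:  # examine bit i of the multiplier
            product = (product + (a << i)) & mask
    return product


print(shift_add_multiply(13, 11))
```

Each loop iteration pairs one shift with one conditional addition, which is why an instruction combining the two operations is attractive for this kind of code.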
Because of the frequency of its use, multiplication plus zero detection can define a useful single instruction. Multiplication is a relatively long instruction, but zero detection can be achieved in a relatively short time. For example, the bits of a number can be NORed together so that a high output indicates a zero product while a low output indicates a non-zero product. Even though the additional time required for the zero detection is short, it can have a large impact on throughput if the instruction cycle is lengthened to permit its execution within a single multiplication cycle. In that case, the time required for zero detection is added to all instructions whether or not they involve a zero detection. The alternative is to perform the multiplication and the zero detection as separate instructions. However, this is wasteful because entire cycles must be devoted to the zero detections, which should only consume a fraction of a cycle.
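The NOR-based zero detection and the throughput tradeoff described above can be sketched as follows. The helper names, the 8-bit width, and all timing and instruction-count figures are hypothetical assumptions for illustration; the timing model also assumes that multiplication is the longest single-cycle instruction.

```python
def is_zero_nor(value: int, width: int = 8) -> int:
    """Return 1 if the low `width` bits of value are all 0, else 0 (a wide NOR)."""
    any_bit_set = 0
    for i in range(width):
        any_bit_set |= (value >> i) & 1  # OR all of the bits together
    return any_bit_set ^ 1  # invert the OR to obtain the NOR


def total_time(n_mul_zd, n_other, t_mul, t_zd, fused):
    """Total run time under a simple model (hypothetical, for illustration).

    n_mul_zd: number of multiplications that are followed by a zero detection
    n_other:  number of all other instructions
    t_mul:    multiplication time, assumed to set the cycle when not fused
    t_zd:     additional time needed for zero detection
    fused:    True if multiply and zero detection share one lengthened cycle
    """
    if fused:
        period = t_mul + t_zd          # every instruction pays for the longer cycle
        cycles = n_mul_zd + n_other
    else:
        period = t_mul                 # shorter cycle for every instruction
        cycles = 2 * n_mul_zd + n_other  # zero detection takes a cycle of its own
    return period * cycles


# Two assumed instruction mixes of 1000 instructions each.
few_fused = total_time(50, 950, 10, 1, fused=True)
few_split = total_time(50, 950, 10, 1, fused=False)
many_fused = total_time(200, 800, 10, 1, fused=True)
many_split = total_time(200, 800, 10, 1, fused=False)
```

With these assumed numbers, the combined instruction is slower overall when few multiplications need zero detection and faster when many do, so neither choice is best for every instruction mix.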
Likewise, shifts are frequently used with data processing operations such as addition, subtraction, AND, XOR, and others. Combining shift with these operations in a single instruction increases the throughput of such instructions. However, the longer instruction cycle required increases the execution time of other instructions that do not involve a shift. What is needed is an approach that minimizes the practical tradeoffs between the one-cycle and the two-cycle implementations of such combinations of operations.