1. Field of the Invention
The present invention generally relates to reduced instruction set computers (RISC) and, more particularly, to enhancing the performance of RISC processors while employing little additional hardware. Two examples of RISC technology are presented in detail in the articles (1) "The 801 Minicomputer," by George Radin and (2) "RISC I: A Reduced Instruction Set VLSI Computer," by Patterson and Sequin. The complete bibliographic information for these two articles is presented below.
In the semiconductor industry, current developments indicate that very large-scale integration (VLSI) offers microprocessor designers two avenues to choose from. The first is to develop increasingly complex microprocessors, in which complexity is built into the hardware as more function is moved from software into hardware. The second is to develop increasingly fast processors that perform simple functions, leaving software to implement most of the function. The two articles mentioned above advocate the second approach.
The argument for the first approach is that VLSI circuits of greater complexity let designers replace expensive software with less expensive hardware. Hardware solutions also execute faster. Hardware implementations of software functions allow programmers to develop high-level language programs that are concise, efficient and easier to write, compile and debug.
The drawbacks of the first approach are that increasing complexity leads to longer design times, a greater possibility of design errors and diverse implementations. This class of computers is referred to as complex instruction set computing (CISC) systems.
A unique approach to system architecture has been realized by following the second approach; i.e., a RISC system. The heart of this design is its CPU. The design of the system allows the user to use the major functions of the CPU. The organization differs from the CISC systems.
Mid-range central processing units (CPUs) are generally designed as microprocessors emulating the architecture of the CPU. This requires each CPU instruction to map to several microprocessor instructions. The number of microprocessor instructions necessary to execute each CPU instruction varies, depending on the power of the underlying microprocessor, the complexity of the CPU architecture and the application. For instance, an IBM S/370 model 168 will require three to six cycles per S/370 instruction.
Different application types have diverse instruction usages. For instance, a computer aided design application will use floating point instructions, and a check processing application will use decimal arithmetic. In most applications, however, the most frequently used instructions are similar. These instructions tend to be the simpler functions, such as load, store, branch, compare, integer arithmetic and logic shifting. These same functions are generally available on the microprocessor.
To better exploit the available functions, the primitive instruction set designed for the primitive reduced instruction set machine (PRISM) system can be directly executed by hardware. Every primitive instruction takes exactly one machine cycle. Complex functions are implemented in "microcode" similar to CISC implementations. This means they are implemented by software subroutines executing the primitive instruction set.
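The relationship between one-cycle primitives and complex functions built from them can be illustrated with a minimal sketch. The sketch is not taken from the PRISM system: the class `PrimitiveMachine`, its methods, the 32-bit register width and the choice of a shift-and-add multiply are all assumptions made for illustration. Each primitive is charged exactly one cycle, and the complex function is an ordinary subroutine that calls only primitives.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit registers


class PrimitiveMachine:
    """Hypothetical machine in which every primitive costs exactly one cycle."""

    def __init__(self):
        self.cycles = 0

    def _tick(self, value):
        # Charge one machine cycle for the primitive that produced `value`.
        self.cycles += 1
        return value

    def add(self, a, b):
        return self._tick((a + b) & MASK)

    def and_(self, a, b):
        return self._tick(a & b)

    def shift_left(self, a, n):
        return self._tick((a << n) & MASK)

    def shift_right(self, a, n):
        return self._tick(a >> n)

    def branch_if_nonzero(self, a):
        # Compare-and-branch primitive; returns True when the branch is taken.
        return self._tick(a != 0)


def multiply(machine, x, y):
    """Unsigned multiply written as a subroutine of primitives (shift-and-add)."""
    product = 0
    while machine.branch_if_nonzero(y):
        if machine.branch_if_nonzero(machine.and_(y, 1)):
            product = machine.add(product, x)
        x = machine.shift_left(x, 1)
        y = machine.shift_right(y, 1)
    return product


if __name__ == "__main__":
    m = PrimitiveMachine()
    print(multiply(m, 6, 7), "computed in", m.cycles, "primitive cycles")
```

In this sketch the cost of the complex function is simply the number of primitives it executes, which is why the choice of primitives matters for the cycle counts discussed here.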
In a CISC implementation, the architect decides in advance which functions will be used most frequently. For example, the decimal multiply function will reside in control storage while the interrupt handlers will be in main memory. With an instruction cache, recent usage dictates which functions will be available quickly.
This approach provides worst case capabilities equivalent to a moderately priced CPU in which the complex instructions have been microprogrammed. However, by choosing the primitive instructions with the compiler in mind, far fewer cycles are actually required.
The information presented above is intended to present the architecture of the RISC processor. For more detailed information, related applications and issued patents include:
(1) U.S. Pat. No. 4,589,087 issued May 13, 1986, to M. A. Auslander, J. Cocke, H. T. Hao, P. W. Markstein, and G. Radin for "Condition Register Architecture For A Primitive Instruction Set Machine."
(2) U.S. Pat. No. 4,589,065 issued May 13, 1986, to M. A. Auslander, J. Cocke, H. Hao, P. W. Markstein and G. Radin for "Mechanism for Implementing One Machine Cycle Executable Trap Instructions in a Primitive Instruction Set Computing System."
(3) U.S. patent application Ser. No. 509,734, now abandoned, entitled "Mechanism for Implementing One Machine Cycle Executable Branch-On-Bit-In-Any-Register Instructions in a Primitive Instruction Set Computing System," by M. A. Auslander, H. Hao, P. W. Markstein, G. Radin and W. S. Worley.
(4) U.S. Pat. No. 4,569,016 issued Feb. 4, 1986, to H. Hao, P. W. Markstein and G. Radin for "Mechanism for Implementing One Machine Cycle Executable Mask and Rotate Instructions in a Primitive Instruction Set Computing System."
(5) U.S. patent application Ser. No. 566,925, entitled "Internal Bus Architecture for a Primitive Instruction Set Machine," by J. Cocke, D. Fiske, L. Pereira and G. Radin.
2. Description of the Prior Art
The technology of the RISC computer is presented in two articles. These are:
(1) "The 801 Minicomputer," by George Radin, published in ACM SIGPLAN NOTICES, Vol. 17, No. 4, Apr. 1982, pages 39-47.
(2) "RISC I: A Reduced Instruction Set VLSI Computer," by Patterson and Sequin, in the IEEE 8th Annual Symposium on Architecture Conference Proceedings of May 12-14, 1981, pages 443-449.
The RISC computer is an instruction driven digital computer. This type of computer manipulates data according to a user's specification. The user's specifications are organized into a program consisting of groups of instructions.
The program is processed by a compiler to create an object deck. The object deck is linked with a set of other object decks to create an executable module that is in machine language. Machine language is the information that the particular hardware recognizes as instructions for it to execute.
The earliest compilers were principally concerned with translating the language in which the user developed the application into machine language. As compilers became more sophisticated, they began to use optimization techniques to allow programs to execute faster and more efficiently. As optimization techniques became more refined, they increasingly took the target architecture into account.
Until the RISC computer, there was always one drawback for compilers: the machine architecture was designed to optimize machine language instructions. With the advent of the RISC machine, the compiler was taken into account as the machine was designed. The RISC machine runs optimally with compiled procedures. The instructions that are generated by the compiler are designed to be executed sequentially, one or more at a time, to carry out the operation the user defined.
A typical data flow in a RISC processor consists of two fundamental execution units, the Arithmetic/Logical Unit (ALU) and the Rotate (shift) Unit. Instructions are executed sequentially by sharing output ports and using one of the units at a time. Most instructions only use one of the execution units.
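The sequential use of the two execution units can be sketched as follows. This is an illustrative model only: the operation names, the register names, the 32-bit width and the function `execute` are hypothetical and do not describe any particular RISC data flow. Each instruction is routed to exactly one unit, and the units are never used in the same step.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit registers


def rotate_left(a, n):
    """Rotate a 32-bit value left by n bit positions."""
    n %= 32
    return ((a << n) | (a >> (32 - n))) & MASK


# Operations assumed to belong to the Arithmetic/Logical Unit.
ALU_OPS = {
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
}

# Operations assumed to belong to the Rotate (shift) Unit.
ROTATE_OPS = {
    "rotl": rotate_left,
    "shl": lambda a, n: (a << n) & MASK,
    "shr": lambda a, n: a >> n,
}


def execute(program, regs):
    """Run instructions one at a time; return which unit each one used."""
    trace = []
    for op, dst, src_a, src_b in program:
        if op in ALU_OPS:
            unit, result = "ALU", ALU_OPS[op](regs[src_a], regs[src_b])
        elif op in ROTATE_OPS:
            unit, result = "ROTATE", ROTATE_OPS[op](regs[src_a], regs[src_b])
        else:
            raise ValueError(f"unknown operation: {op}")
        regs[dst] = result  # both units write back through the same path
        trace.append(unit)
    return trace


if __name__ == "__main__":
    regs = {"r1": 3, "r2": 4, "r3": 0}
    program = [("add", "r3", "r1", "r2"), ("shl", "r3", "r3", "r1")]
    print(execute(program, regs), regs)
```

The returned trace makes the point of the paragraph visible: at every step one unit is busy and the other is idle.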
Some RISC systems have branch prediction capability. In a branch prediction system, an instruction is fetched from storage and predecoded to look for branch instructions. If the instruction is a branch, the branch is processed. If not, the instruction is sent on to the processor. The processor never sees a branch instruction.
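The predecode step described above can be sketched as a fetch stage that consumes branches itself and forwards only non-branch instructions. The sketch is a simplification under stated assumptions: instructions are modeled as tuples, only unconditional branches of the hypothetical form `("branch", target)` are handled, and the `limit` parameter is an added safeguard against endless loops rather than a feature of any real fetch unit.

```python
def prefetch(storage, start=0, limit=100):
    """Yield only non-branch instructions; branches are resolved during fetch."""
    pc = start
    fetched = 0
    while 0 <= pc < len(storage) and fetched < limit:
        instr = storage[pc]
        fetched += 1
        if instr[0] == "branch":
            # Predecode found a branch: redirect the fetch address here,
            # so the processor never receives the branch instruction.
            pc = instr[1]
            continue
        yield instr
        pc += 1


if __name__ == "__main__":
    storage = [("add",), ("branch", 3), ("sub",), ("load",)]
    for instr in prefetch(storage):
        print(instr)
```

In this example the processor side of the generator receives the add and the load; the branch and the skipped subtract never reach it.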
It is known that high performance can be achieved by duplicating computational units each performing identical operations in synchronism. This art is primarily used in scientific vector processors and is very costly. The principles and methods of such art are taught, for example, in U.S. Pat. No. 3,346,851 to James E. Thornton and Seymour R. Cray.
It is further known that some functional units of work can be separated into independent, distinct units to permit different operations to be performed on the same information at the same time. This is important because many operations, such as checks and comparisons on work in progress, lend themselves to specialization. By dividing out this work, it is possible to perform these specialized operations at the same time as other operations are occurring, rather than degrading the performance of the processor by carrying them out at a separate time. This art is presented more completely in U.S. Pat. No. 3,969,702 to Giancarlo Tessera.
It is also known that an instruction pipeline can be employed to process instructions with a time offset between successive instructions. The offset is an integral multiple of the cycle time of the functional units which execute the instructions. The offset is matched to instructions that use two storage accesses per execution, where each access requires one cycle. This art is presented more completely in U.S. Pat. No. 3,840,861 to Gene M. Amdahl, Glen D. Grant and Robert M. Maier.
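The timing of such an offset pipeline can be worked through with a short sketch. The function name and the four-cycle instruction latency used in the example are assumptions for illustration and are not taken from the cited patent; the two-cycle offset corresponds to two storage accesses of one cycle each.

```python
def pipeline_schedule(n_instructions, latency_cycles, offset_cycles):
    """Return (start, finish) cycle pairs for instructions issued at a fixed offset."""
    if not isinstance(offset_cycles, int) or offset_cycles < 1:
        raise ValueError("offset must be a positive whole number of cycles")
    return [
        (i * offset_cycles, i * offset_cycles + latency_cycles)
        for i in range(n_instructions)
    ]


if __name__ == "__main__":
    # Assumed example: three instructions, four cycles each, two-cycle offset.
    schedule = pipeline_schedule(3, latency_cycles=4, offset_cycles=2)
    print(schedule)
    print("pipelined total:", schedule[-1][1], "cycles")
    print("sequential total:", 3 * 4, "cycles")
```

Under these assumed numbers the last instruction finishes at cycle 8, compared with 12 cycles if each instruction waited for the previous one to complete.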
A number of instruction processing techniques are known in prior art systems; however, there is a need for the improvement of the cost/performance ratio for RISC processor systems.