In its simplest form, a digital computer consists of five functionally independent main parts: input, memory, arithmetic and logic, output, and control units. In the traditional organization of a so-called `single store` or `stored program` computer, activity within the computer is governed by a program consisting of a set of instructions stored in main memory. An instruction specifies an operation to be performed by the computer. Individual instructions are brought from main memory into the processor, which decodes and executes them. In addition to the machine-level instructions, numerical data used as operands for the instructions are also stored in the same main memory space, hence the name `single store` computer.
In modern computers, the arithmetic and logic unit (ALU) and its associated control circuitry are frequently grouped together to form a so-called central processing unit, or CPU, which can be implemented as a monolithic semiconductor device. A CPU usually contains, in addition to the ALU, a collection of one-word storage locations, called registers, used to hold instructions or data during program execution. Some CPU registers are general-purpose storage locations that can hold any intermediate values the CPU might need to save temporarily. Other registers have specifically defined uses. For example, a CPU traditionally has a register called the program counter (PC) which holds the address in the main memory space of the next (or currently executing) instruction. In addition, a CPU will typically have an instruction register (IR) which holds the instruction fetched from memory that is currently being executed by the computer. A CPU will also usually have additional special-purpose registers, such as a memory address register and a memory data register, which are defined to contain, respectively, the main memory address of a word to be written or read by the CPU, and a data value to be stored in or received from memory. Each of the functional units of a CPU is able to pass data to and from other units via one or more CPU buses, and transfers between the units are initiated and controlled by the control unit. The control unit directs the sequence of individual steps required to execute a machine-level instruction by asserting control signals that initiate transfers of data from one register to another, between main memory and registers, or between registers and the ALU, via a CPU bus.
Often, a single machine-level instruction involves several CPU register transfers, each initiated by the assertion of one or more control signals. In the typical case of a simple memory read, for example, the control unit would cause the following steps to be taken. First, after the machine-level instruction requesting the memory read has been read from memory, the address of the desired memory location would be loaded into the memory address register via a CPU bus; the control unit accomplishes this by asserting a signal which tells the memory address register to latch the data currently on the CPU bus. Once this is done, the control unit would initiate a read at the memory location specified in the memory address register by asserting a different control signal. Next, when the desired location has been read, the memory unit would make the value available on the CPU bus, and the control unit would assert another control signal causing the contents of the CPU bus to be latched into the memory data register. From there, the value would subsequently be available, via the CPU bus, to any of the other functional units which require it.
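The memory read sequence described above can be sketched in Python as three control signal assertions acting on a shared bus. This is only an illustrative model: the class name `Cpu` and the signal names `MAR_LATCH`, `MEM_READ`, and `MDR_LATCH` are hypothetical and do not come from any particular processor.

```python
class Cpu:
    """Toy CPU with one shared bus, a MAR and an MDR (all names hypothetical)."""

    def __init__(self, memory):
        self.memory = memory   # main memory, modelled as a list of words
        self.bus = 0           # value currently driven onto the CPU bus
        self.mar = 0           # memory address register
        self.mdr = 0           # memory data register
        self.trace = []        # control signals asserted, in order

    def assert_signal(self, name):
        """Model the effect of the control unit asserting one control signal."""
        self.trace.append(name)
        if name == "MAR_LATCH":      # MAR latches whatever is on the bus
            self.mar = self.bus
        elif name == "MEM_READ":     # memory drives the addressed word onto the bus
            self.bus = self.memory[self.mar]
        elif name == "MDR_LATCH":    # MDR latches the word returned by memory
            self.mdr = self.bus
        else:
            raise ValueError(f"unknown control signal: {name}")

    def memory_read(self, address):
        """Perform the three-step read: address to MAR, read, data to MDR."""
        self.bus = address           # the address is placed on the CPU bus
        for signal in ("MAR_LATCH", "MEM_READ", "MDR_LATCH"):
            self.assert_signal(signal)
        return self.mdr


cpu = Cpu(memory=[10, 20, 30, 40])
value = cpu.memory_read(2)
print(value, cpu.trace)
```

Each call to `assert_signal` stands for one register transfer, so the length of `trace` gives a rough count of the control steps a single read requires in this model.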
Typically, machine-level instructions consist of at least two parts: an operation code or `opcode` which specifies which operation is to be performed, and one or more address fields which specify where the data values (operands) used for that operation can be found. A computer executes one machine-level instruction in what may be called an instruction cycle, and an instruction cycle usually includes two parts. During the first part, called the fetch cycle, the opcode of the next instruction (the instruction pointed to by the PC) is loaded into the CPU's instruction register. Once the opcode is in the IR, the second part, called the execute cycle, is performed (often each of these parts will require more than one state time or clock cycle). In the execute cycle, the controller interprets the opcode and executes the instruction by sending out the appropriate sequence of control signals to cause data to be transferred or to cause an arithmetic or logical operation to be performed by the ALU. In many cases, the execution cycle of an instruction will include the fetching of operands specified by that instruction. For example, for a simple `ADD` instruction, the location of two values to be added together must be specified in the instruction, and these two values must be fetched and brought into CPU registers so that the ALU will have access to them.
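As a minimal sketch of the instruction cycle just described, the following Python loop alternates a fetch step and an execute step over a single memory holding both instructions and data. The tuple encoding of instructions and the three-address form of `ADD` (two source addresses and a destination address) are assumptions made for illustration only.

```python
def run(memory):
    """Run a toy stored-program machine until HALT and return the memory."""
    pc = 0                             # program counter
    while True:
        # Fetch cycle: load the instruction addressed by the PC into the IR.
        ir = memory[pc]
        pc += 1
        # Execute cycle: decode the opcode and carry out the operation.
        opcode = ir[0]
        if opcode == "ADD":
            _, src_a, src_b, dest = ir
            a = memory[src_a]          # operand fetch
            b = memory[src_b]          # operand fetch
            memory[dest] = a + b       # ALU operation, then result store
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Instructions and data share one memory (the single store organization).
program = [
    ("ADD", 3, 4, 5),   # word 0: memory[5] = memory[3] + memory[4]
    ("HALT",),          # word 1
    None,               # word 2: unused
    7,                  # word 3: data
    35,                 # word 4: data
    0,                  # word 5: result
]
result = run(program)
print(result[5])
```

The two operand fetches inside the `ADD` branch correspond to the operand fetching that the execute cycle of such an instruction would include.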
The collection of machine-level instructions that a CPU supports is called its instruction set. Within the computer, instructions are represented as sequences of bits arranged in memory words, identifying the constituent elements of the instruction. In addition to the opcode and the source operands (operands used as inputs for the specified operation), an instruction might also identify the desired destination of the operation's result, as well as the location of the next instruction to be executed. This suggests that a plurality of operand fields may exist for a given instruction. The arrangement of fields in an instruction is known as the instruction format. Depending upon how much information must be specified, an instruction can be one or more memory words long. Some simple instructions, like `HALT`, might require no operands, and can be represented in a single memory word containing only the opcode for that instruction. Other instructions, called unary instructions, might require the specification of a single operand. Unary instructions, such as `BRANCH`, `INCREMENT`, or `CLEAR` can be contained in a single memory word containing both the opcode and the operand specifier, or in two words, one for the opcode and one for the operand specifier. Likewise, a binary instruction, like `ADD` or `SUBTRACT`, requires the specification of two operands, and can be contained either in a single word which contains the opcode and the identity of both operands, or in two or more words. Frequently, a CPU's instruction set will include instructions of varying formats. In such a case, the CPU controller must decode an instruction's opcode, and then fetch additional words as required by the format associated with that particular instruction.
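To make the notion of an instruction format concrete, the sketch below packs an opcode and two operand specifiers into a single 16-bit word and unpacks them again. The field widths (4-bit opcode, two 6-bit operand fields) and the opcode value chosen for `ADD` are hypothetical and serve only to illustrate a single-word binary instruction.

```python
OPCODE_BITS = 4                          # assumed width of the opcode field
OPERAND_BITS = 6                         # assumed width of each operand field
OPERAND_MASK = (1 << OPERAND_BITS) - 1


def encode(opcode, op_a=0, op_b=0):
    """Pack an opcode and two operand specifiers into one 16-bit word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode does not fit in the opcode field")
    for operand in (op_a, op_b):
        if not 0 <= operand <= OPERAND_MASK:
            raise ValueError("operand does not fit in its field")
    return (opcode << (2 * OPERAND_BITS)) | (op_a << OPERAND_BITS) | op_b


def decode(word):
    """Split a 16-bit word back into (opcode, operand A, operand B)."""
    opcode = word >> (2 * OPERAND_BITS)
    op_a = (word >> OPERAND_BITS) & OPERAND_MASK
    op_b = word & OPERAND_MASK
    return opcode, op_a, op_b


ADD = 0b0011                             # hypothetical opcode value
word = encode(ADD, 5, 9)
print(format(word, "016b"), decode(word))
```

The range checks in `encode` show why a fixed field width limits what a single-word format can express; an operand specifier too large for its field would have to be carried in an additional instruction word.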
In order to reference a large range of locations in the specification of operands, the instruction sets of most computers support a variety of different methods, called addressing modes, for indicating the effective address of operands in a machine-level instruction. The simplest method of specifying an operand in a machine-level instruction is immediate or literal addressing, in which the value of the operand is included in the machine-level instruction itself. This mode is useful in defining constants and initial values of variables, and no main memory references are needed to obtain the operand. Likewise, a register addressing mode requires no main memory accesses, since the operand is stored in a CPU register which is specified in the machine-level instruction. Another simple form of operand specification is called direct addressing, wherein the main memory address of the operand is supplied in the instruction. Direct addressing requires only one memory reference to obtain the operand, but is limited to specifying addresses which are small enough to fit in the address field of the instruction, which can be smaller than one memory word. Indirect addressing modes overcome this limitation by providing in the instruction an address of a word in main memory which in turn contains a full-length address of the operand. Two main memory accesses are required to obtain an operand specified in an indirect addressing mode. Other addressing modes commonly supported by processing units include register indirect addressing, in which a CPU register specified in the instruction contains the address in main memory of the desired operand, and various kinds of displacement addressing, in which an explicit offset value contained in the instruction is used as a displacement from a base address held in a register or memory location identified in the instruction; the base address is added to the offset value to obtain the address of the operand.
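The addressing modes described above can be compared in a short Python sketch that resolves an operand and counts the main memory accesses each mode needs. The function name `fetch_operand`, the mode names, and the choice of a base register for displacement addressing are assumptions made for illustration.

```python
def fetch_operand(mode, field, memory, registers, base_reg=None):
    """Return (operand value, main memory accesses) for one operand specifier."""
    if mode == "immediate":            # operand is in the instruction itself
        return field, 0
    if mode == "register":             # operand is in a CPU register
        return registers[field], 0
    if mode == "direct":               # field is the operand's memory address
        return memory[field], 1
    if mode == "indirect":             # field addresses a word holding the address
        pointer = memory[field]
        return memory[pointer], 2
    if mode == "register_indirect":    # a register holds the operand's address
        return memory[registers[field]], 1
    if mode == "displacement":         # base register contents plus offset field
        return memory[registers[base_reg] + field], 1
    raise ValueError(f"unknown addressing mode: {mode}")


memory = [0] * 16
memory[3] = 12                         # word 3 holds a pointer to word 12
memory[12] = 99                        # word 12 holds the operand
registers = [0, 12, 10]

results = {
    "immediate": fetch_operand("immediate", 7, memory, registers),
    "register": fetch_operand("register", 1, memory, registers),
    "direct": fetch_operand("direct", 12, memory, registers),
    "indirect": fetch_operand("indirect", 3, memory, registers),
    "register_indirect": fetch_operand("register_indirect", 1, memory, registers),
    "displacement": fetch_operand("displacement", 2, memory, registers, base_reg=2),
}
for mode, (value, accesses) in results.items():
    print(f"{mode}: value={value}, memory accesses={accesses}")
```

In this model only the indirect mode needs two main memory accesses, while the immediate and register modes need none, matching the access counts given in the text.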
Since the control unit is responsible for initiating all activity within a CPU, acceptable system performance depends upon efficient operation of the control unit. Two basic strategies exist for implementation of a control unit. The first, known as a hardwired controller implementation, involves the use of combinatorial logical hardware that produces the appropriate sequence of output control signals in response to a particular opcode. The primary inputs to a hardwired controller are the instruction register and a clock. The combinatorial logic in the hardwired controller identifies (decodes) the unique opcode associated with every instruction, and asserts the appropriate sequence of output control signals necessary to accomplish the requested task. Clearly, even for a moderately sized instruction set, the controller must contain a large amount of logic hardware for distinguishing between the many instruction formats that may be used, the many different operations that can be performed, and for asserting the correct control signals during the various phases of each of the various instruction cycles.
A simpler approach to the design of processor control units is known as a microprogrammed implementation. In a microprogrammed controller, each instruction opcode indicates an address in a special CPU memory where CPU control words called microinstructions are stored. Each microinstruction contains information about which control lines to assert when that microinstruction is executed by the controller. A single machine-level instruction thus corresponds to a specific sequence of one or more microinstructions to be executed (a microprogram routine). Microinstructions typically reside in a dedicated area of CPU read-only memory (ROM) called the micro-store. Upon receiving an instruction's opcode in the IR, a microprogrammed controller would derive from the opcode an entry point into a microprogram routine in the micro-store. Starting at this entry point, microinstructions are read one at a time from the micro-store, just as machine-level instructions are fetched from main memory, and used by the controller to determine which control signals to assert. In what is called a fully horizontal microinstruction format, each bit of a microinstruction corresponds to a single control signal, so that a one in a certain bit position corresponds to the assertion of that control signal, and a zero in the same bit position corresponds to the de-assertion of that control signal. Alternatively, with a so-called fully vertical microinstruction format, the bits of the microinstruction must themselves be decoded to determine which control signals to assert. In the fully horizontal format, n bits are required to represent the state of n control signals, while in the fully vertical format, n bits can be used to represent the state of 2^n control signals. A diagonal microinstruction format represents a compromise between the speed of the fully horizontal format and the small microinstruction size of the fully vertical format.
In a diagonal format, the bits of a microinstruction are grouped into fields, where each field corresponds to the control signals associated with a particular system resource (main memory, ALU, CPU bus, CPU registers, and so on). A field of k bits can be used to specify up to 2^k control signals associated with a particular resource.
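The three microinstruction formats can be contrasted with a small Python decoder for each. The signal names, the field layout, and the convention that a field value of zero asserts nothing are hypothetical choices made for this sketch; the vertical decoder shown selects exactly one signal per microinstruction.

```python
SIGNALS = ["MAR_LATCH", "MEM_READ", "MDR_LATCH", "ALU_ADD"]


def decode_horizontal(word):
    """One bit per control signal: bit i set means SIGNALS[i] is asserted."""
    return [name for i, name in enumerate(SIGNALS) if word & (1 << i)]


def decode_vertical(word):
    """The whole word is an encoded index selecting a single control signal."""
    return [SIGNALS[word]]


# Diagonal format: one encoded field per resource, lowest-order field first.
# Each entry is (resource, field width in bits, signal selected by each code).
FIELDS = [
    ("memory", 2, [None, "MEM_READ", "MEM_WRITE"]),
    ("alu", 2, [None, "ALU_ADD", "ALU_SUB", "ALU_AND"]),
]


def decode_diagonal(word):
    """Decode each resource field separately; code 0 asserts nothing."""
    asserted = []
    shift = 0
    for _resource, width, table in FIELDS:
        code = (word >> shift) & ((1 << width) - 1)
        shift += width
        if code < len(table) and table[code] is not None:
            asserted.append(table[code])
    return asserted


print(decode_horizontal(0b0101))
print(decode_vertical(1))
print(decode_diagonal(0b0101))
```

The horizontal decoder needs no lookup at all, which reflects its speed advantage, while the diagonal decoder can still assert one signal per resource in the same microinstruction using fewer bits than one per signal.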
A computer is often characterized by the instruction set supported by its processing unit. The instruction set determines many of the functions performed by the CPU and thus has a significant effect on the implementation of the CPU and the overall performance of the computer system. In practice, at least two differing approaches to the design of instruction sets exist in the art. One, referred to as a complex instruction set, attempts to make the machine-level instructions more compatible with instructions of higher-level programming languages. Complex instruction sets typically include a large number of different instructions, ranging from simple operations such as MOVE, to more complex operations, such as vector manipulation, matrix operations, and sophisticated arithmetic functions. The instructions in a complex instruction set computer (CISC) usually support a wide range of operand addressing modes, and have widely varying formats. The use of complex instruction sets is based on the reasoning that since a complex instruction would do more than a simple instruction, fewer instructions would be needed for a given program, thus reducing the number of memory fetches involved in the execution of a program. The increasing availability of faster, larger ROMs further suggested that implementing complicated software functions in larger micro-code routines would result in faster, easier-to-use computers, and lower costs for software development.
An alternative approach to instruction set design employs a reduced instruction set. Reduced instruction set computer (RISC) architecture refers to a broad design philosophy characterized by an instruction set composed of a relatively small number of simple instructions. The machine-level instructions in a RISC computer typically execute in one clock cycle, have fixed instruction formats, and support only a few simple addressing modes. In order to take full advantage of the speed of RISC instruction execution, RISC computers customarily employ hardwired controllers instead of microprogrammed controllers.
The general-purpose instruction set most commonly implemented in computers represents a compromise between the smaller program size and larger controller micro-store size of CISC architectures and the larger program size and smaller controller size of RISC systems. The wide variety of instruction formats and addressing modes in CISC systems makes instruction decoding and operand fetching slower and more involved, while the simpler operations and minimal addressing modes of RISC systems require relatively longer programs to accomplish similar tasks.
In a microprogrammed computer, the complexity of a machine-level instruction set clearly has an impact on the size of the micro-store containing the microinstructions to support the instruction set. The size of the micro-store depends not only on the number of instructions supported, but also on the complexity of these operations and the range of addressing modes available to the instructions. The performance characteristics of the computer system are also affected by these factors, since complex instructions involve more CPU register transfers than simpler instructions and thus require more microinstructions and more clock cycles to complete. Furthermore, as the instruction set's complement of addressing modes increases, a larger amount of microcode is required for operand address calculation.
Microcode processing of operand specifiers involves dispatching to a microprogram routine which will perform the actions necessary to identify the location of an operand, and then taking the appropriate action based on the use of the operand in the given instruction. In lower performance systems, the processing of all specifier types may be done by a single general microprogram routine which resolves differences in operand specification by using conditional branching statements in the microprogram routine. Higher performance systems, on the other hand, may provide a dedicated microprogram routine for each possible combination of operand specifiers used with a particular machine-level instruction. This dedicated routine method allows the system to achieve maximum performance by eliminating the need for slow conditional branching within the microprogram. The general purpose routine method, on the other hand, requires much less micro-store memory space than the dedicated routine method, but is also slower.
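The micro-store cost of the dedicated routine method can be illustrated by enumerating one entry point per combination of operand specifiers. In this sketch the set of four addressing modes, the routine naming scheme, and the function `build_dedicated_dispatch` are all hypothetical.

```python
from itertools import product

# Assumed set of addressing modes supported for every operand.
MODES = ["immediate", "register", "direct", "indirect"]


def build_dedicated_dispatch(opcode, operand_count):
    """Map each (opcode, mode, mode, ...) combination to its own routine name."""
    return {
        (opcode, *combo): f"{opcode}_" + "_".join(combo)
        for combo in product(MODES, repeat=operand_count)
    }


table = build_dedicated_dispatch("ADD", 2)
print(len(table))
print(table[("ADD", "register", "direct")])
```

Under these assumptions a two-operand instruction needs one dedicated routine for every pairing of modes, whereas the general routine method would use a single routine that branches on each operand's mode, which is the space-for-speed trade-off the paragraph describes.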
The present invention is aimed at overcoming some of the undesirable features associated with the implementation of complex instruction sets which support a variety of operand addressing modes. Specifically, this invention suggests a strategy for reducing the micro-store memory dedicated to operand specifier processing, while also reducing the size and the time spent by a processor in obtaining the necessary operands specified in complex instructions.