1. Field of the Invention
The present invention relates to the management of the program-visible machine state of computers, and more particularly, to a computer register file system and method adapted to handle exceptions which prematurely overwrite register file contents.
2. Related Art
A more detailed description of some of the basic concepts discussed in this application is found in a number of references, including Mike Johnson, Superscalar Microprocessor Design (Prentice-Hall, Inc., Englewood Cliffs, N.J., 1991) and John L. Hennessy et al., Computer Architecture: A Quantitative Approach (Morgan Kaufmann Publishers, Inc., San Mateo, Calif., 1990). Johnson's text, particularly Chapter 5, provides an excellent discussion of register file exception handling.
Supporting exception handling, and in particular precise interrupts, presents a complicated set of problems for the computer architect. For example, the result of a particular instruction cannot be written to a central processing unit's (CPU's) register file, or to any other part of the program-visible machine state, until it has been determined that the instruction will not signal any exceptions. Otherwise, the instruction will have an effect on the visible state of the machine after the exception is signaled. (The terms CPU, computer and processor will be used interchangeably throughout this document.)
Historically, this problem has been circumvented by increasing the number of processor pipeline stages (pipeline depth) so that the write does not occur until after the latest exception is determined. However, this reduces the allowable degree of instruction interlocking and/or increases the amount of by-pass circuitry required, either of which typically degrades overall performance.
The concept of a "history buffer" is described by J. E. Smith et al. ("Implementation of Precise Interrupts in Pipelined Processors", Proceedings of the 12th Annual International Symposium on Computer Architecture (June 1985), pp. 36-44) as a means for implementing precise interrupts in a pipelined scalar processor with out-of-order completion. In this approach, the register file contains the program-visible state of the machine, and the history buffer stores items of the in-order state which have been superseded by items of lookahead state (i.e., it contains old values that have been replaced by new values; hence the name history buffer).
The history buffer is managed as a circular buffer. Each entry in the history buffer is assigned an entry number. There are n entries in the history buffer, where n corresponds to the length of the longest functional unit pipeline. A head and a tail tag are used to identify the head of the buffer, and the entry in the buffer reserved for the instruction, respectively. Entries between the head and tail are considered valid.
At issue time, each history buffer entry is loaded with: (1) the value of the register file prior to the issuing of the instruction, and control information including: (2) a destination register of the result, (3) the program counter, and (4) either an exception bit or a validity bit, depending on whether an exception is generated at the time of issue.
A result shift register is used in conjunction with the history buffer to manage various machine control signals, including a reorder tag which is required to properly restore the state of the machine due to out-of-order completion. The result shift register includes entries for the functional unit that will be supplying the result and the destination register of the result. The result shift register is operated as a first-in first-out (FIFO) queue.
Results on a result bus from the processor's functional unit(s) are written directly into the register file when an instruction completes. Exception reports come back as an instruction completes and are written into the history buffer. The exception reports are guided to the proper history buffer entry through the use of tags found in the result shift register. When the history buffer contains an element at the head that is known to have finished without exceptions, the history buffer entry is no longer needed and that buffer location can be re-used (the head pointer is incremented). The history buffer can be shorter than the maximum number of pipeline stages. If all history buffer entries are used (the buffer is too small), issue must be blocked until an entry becomes available. Hence, history buffers are made long enough so that this seldom happens.
When an exception condition arrives at the head of the history buffer, the buffer is held, instruction issue is immediately halted, and the processor waits until pipeline activity completes. The active buffer entries are then emptied from tail to head, and the history values are loaded back into their original registers. The program counter value found at the head of the history buffer is the precise program counter.
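The history buffer behavior described above can be illustrated with a short software model. This is only a minimal sketch under stated assumptions, not the hardware of Smith et al.: the class and method names (`HistoryBuffer`, `issue`, `complete`, `retire`, `rollback`) are hypothetical, the result shift register is replaced by a direct reference to the entry, and the pipeline is assumed to have drained before a rollback begins.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class HistoryEntry:
    dest: int                 # destination register number
    old_value: int            # register contents read at issue time
    pc: int                   # program counter of the instruction
    done: bool = False        # set when the instruction completes
    exception: bool = False   # set if completion reported an exception


class HistoryBuffer:
    """Hypothetical model: left end of the deque is the head (oldest entry)."""

    def __init__(self, regs, size):
        self.regs = regs
        self.size = size
        self.entries = deque()

    def issue(self, dest, pc):
        # If every entry is in use, issue must be blocked.
        if len(self.entries) == self.size:
            return None
        entry = HistoryEntry(dest, self.regs[dest], pc)
        self.entries.append(entry)
        return entry

    def complete(self, entry, result, exception=False):
        # Results go straight to the register file; completion may be
        # out of program order. Exception reports go to the buffer entry.
        if not exception:
            self.regs[entry.dest] = result
        entry.done = True
        entry.exception = exception

    def retire(self):
        # Free finished, exception-free entries at the head. If an
        # exception reaches the head, roll back and return the precise PC.
        while self.entries and self.entries[0].done:
            if self.entries[0].exception:
                return self.rollback()
            self.entries.popleft()
        return None

    def rollback(self):
        # Assumes all pipeline activity has already completed.
        precise_pc = self.entries[0].pc
        while self.entries:
            entry = self.entries.pop()          # empty from tail to head
            self.regs[entry.dest] = entry.old_value
        return precise_pc


regs = [0, 10, 20, 30]
hb = HistoryBuffer(regs, size=4)
a = hb.issue(dest=1, pc=100)
b = hb.issue(dest=2, pc=104)
c = hb.issue(dest=3, pc=108)
hb.complete(c, 33)                    # completes out of order
hb.complete(a, 11)
hb.complete(b, 0, exception=True)     # the second instruction faults
precise_pc = hb.retire()
```

In this example the first instruction retires normally, so its result remains in the register file, while the faulting instruction and the younger instruction that completed ahead of it are undone.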
The extra hardware required by this method is in the form of a large buffer to contain the history information. Also, the register file must have three read ports, since the destination value as well as the source operands must be read at issue time.
In view of the foregoing, it is clear that a simplified backup system is required to handle exceptions.
The present invention is directed to a register file backup queue system and method for use with a computer which processes instructions to generate results which thereby change the visible state of the computer. The computer has a register file with a plurality of addressable locations for storing data. The backup system of the present invention is adapted to return the visible state of the computer to a previous state if an instruction generates an exception. The backup system utilizes less overhead so as to provide easier register file backup than a comparable software or hardware device.
The present invention sequentially stores, in program order in a result tag queue, address information corresponding to destination locations in the register file where instruction results are to be stored.
From the result tag queue, a first portion of the address information is transferred to the register file and a second portion of address information is transferred to a backup queue for backup storage of the register file contents.
The backup queue also receives and stores further information corresponding to the contents of one or more destination locations in the register file before that destination location is changed according to said second portion of said address information.
The present invention transfers said further information from said backup queue back to the register file locations according to said second portion of said address information stored in said backup queue if an instruction exception is generated.
Before an instruction is retired, the value of any program-visible state that an instruction may modify (including, but not limited to, the prior value of the register file destination register) is read such that all instructions up to and including the previous instruction have taken effect prior to the read. The resulting data are placed in the backup queue that, in effect, "remembers" the program-visible state of the processor exactly prior to any given "uncommitted instruction", and thus can be used to nullify the effect of any instruction that causes an exception. (An "uncommitted instruction" is defined by Hennessy et al. as an instruction that may cause an exception at some future time.)
The present invention thus provides a mechanism by which interrupts can be supported for exceptions that are signaled after the result is written and without out-of-order completion. Design complexity is minimally increased, in that the pipeline depth of the processor does not need to be increased to handle the late-exception case. This approach is easier to "tack on" to an existing design (e.g., in the case that an enhancement makes the late-exception case possible where it was not possible before) than increasing the pipeline depth. In some configurations, overall performance is not significantly impacted, except in the case that an exception occurs.
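The interaction of the result tag queue, the register file and the backup queue summarized above can be modeled in a few lines of software. The following is a hedged sketch only, not the claimed implementation: the names `BackupQueueRegisterFile`, `dispatch`, `write_result` and `undo` are hypothetical, and the model assumes that the backup queue depth equals the number of instructions that may still be uncommitted, so that an entry pushed out of a full queue belongs to an instruction that can no longer signal an exception.

```python
from collections import deque


class BackupQueueRegisterFile:
    """Hypothetical software model of a register file with a backup queue."""

    def __init__(self, initial_regs, depth):
        self.regs = list(initial_regs)
        # Destination addresses of issued instructions, in program order.
        self.tag_queue = deque()
        # (address, prior contents) pairs, oldest first. With maxlen set,
        # appending to a full deque discards the oldest pair, which this
        # sketch treats as belonging to a committed instruction.
        self.backup = deque(maxlen=depth)

    def dispatch(self, dest):
        # Record where the instruction's result will be written.
        self.tag_queue.append(dest)

    def write_result(self, value):
        # The oldest tag supplies the write address to the register file
        # and, together with the prior contents, to the backup queue.
        dest = self.tag_queue.popleft()
        self.backup.append((dest, self.regs[dest]))  # read before overwrite
        self.regs[dest] = value

    def undo(self, count):
        # Nullify the most recent `count` writes, newest first, by copying
        # the saved contents back to the saved addresses.
        for _ in range(min(count, len(self.backup))):
            dest, old_value = self.backup.pop()
            self.regs[dest] = old_value


rf = BackupQueueRegisterFile([5, 6, 7, 8], depth=2)
for dest in (1, 2, 1):
    rf.dispatch(dest)
rf.write_result(10)   # register 1: 6 -> 10
rf.write_result(20)   # register 2: 7 -> 20
rf.write_result(30)   # register 1: 10 -> 30; the oldest backup pair is dropped
after_writes = list(rf.regs)
rf.undo(1)            # the last instruction signals a late exception
after_undo = list(rf.regs)
```

Because the prior contents are read immediately before each overwrite, every saved value reflects all earlier instructions, which is what allows a single undo step to restore the state exactly as it was before the faulting instruction.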
The foregoing and other features and advantages of the present invention will be apparent from the following more particular description of the preferred embodiments of the invention, as illustrated in the accompanying drawings.