Processors used in digital computer systems process data in accordance with a series of instructions which comprise an instruction stream. Typically, each instruction will contain one or more elements, which can include, for example, an operation code that identifies the particular operation to be performed, one or more source operand specifier(s) which identify storage locations in the respective digital computer system which contain the data, or "operands," on which the operation is to be performed, and a destination identifier specifier which identifies a storage location in the system in which the processed data is to be stored. Some types of instructions may only comprise operation codes, particularly if they are provided to control the sequence of operations in the instruction stream, while other types of instructions also include the source and destination identifier specifiers.
Typically a processor performs a series of general phases in connection with processing of an individual instruction including:
(i) decoding the instruction's operation code to determine the type of operation to be performed, and coincidentally to identify the number of operands, if any;
(ii) if the instruction requires operands, retrieving the operands from the storage locations identified by the instruction;
(iii) performing the operation required by the instruction; and
(iv) storing the result in the storage location identified by the instruction.
A processor could perform the above-identified phases in series for each successive instruction in the instruction stream, with very little if any overlap. In that case, for example, the processor would begin decoding the operation code of instruction (s+1) of the instruction stream (phase (i) above) only after the result generated for instruction (s) of the instruction stream (phase (iv) above) has been stored.
Each of the phases (i) through (iv) can generally be performed by different circuit elements in a processor, and so so-called "pipelined" processors were developed in which each of the phases could be performed concurrently with the other phases, but for different instructions in the instruction stream. See, for example, Peter M. Kogge, The Architecture Of Pipelined Computers (McGraw-Hill Book Company, 1981) (hereinafter, "Kogge"). A pipelined processor may execute successive instructions in an instruction stream, in successive phases (i) through (iv), such that, while, for example, the processor is storing the result of instruction (s) (phase (iv) above), it will concurrently be
(a) performing the operation required by instruction (s+1) in connection with its operands (phase (iii) above),
(b) retrieving the operands required by instruction (s+2) (phase (ii) above), and
(c) decoding the operation code of instruction (s+3) (phase (i) above).
It will be appreciated that, if a sequence of four instructions (s) through (s+3) in an instruction stream can be processed concurrently in this manner, the instructions can be processed in seven time steps (four time steps for instruction (s), plus one additional time step for each of the three instructions that follow it) rather than the sixteen time steps that would be necessary in a non-pipelined processor. In many cases, sequences of instructions from an instruction stream can be executed in this manner, which can lead to substantially reduced processing time and increased throughput. As described in Kogge, processors having a variety of other, more complex pipelines have been developed.
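The time-step comparison above can be illustrated with a short sketch. It assumes, as the four-instruction example implies, that each phase takes exactly one time step and that no instruction ever stalls; the function names and the `PHASES` tuple are hypothetical and serve only for illustration.

```python
# Illustrative model of the four phases (i) through (iv); names are hypothetical.
PHASES = ("decode", "retrieve operands", "perform operation", "store result")


def sequential_steps(num_instructions, num_phases=len(PHASES)):
    """Time steps when every instruction finishes all phases before the next starts."""
    return num_instructions * num_phases


def pipelined_steps(num_instructions, num_phases=len(PHASES)):
    """Time steps when a new instruction enters the pipeline on every step."""
    if num_instructions == 0:
        return 0
    return num_phases + num_instructions - 1


def pipeline_schedule(num_instructions, num_phases=len(PHASES)):
    """Return a list indexed by time step; each entry maps an instruction
    index (0 for instruction (s), 1 for (s+1), ...) to its phase index."""
    schedule = []
    for step in range(pipelined_steps(num_instructions, num_phases)):
        active = {}
        for instr in range(num_instructions):
            phase = step - instr  # instruction k enters the pipeline at step k
            if 0 <= phase < num_phases:
                active[instr] = phase
        schedule.append(active)
    return schedule
```

Under these assumptions, `sequential_steps(4)` gives sixteen and `pipelined_steps(4)` gives seven, and the fourth entry of `pipeline_schedule(4)` shows instruction (s) storing its result while instructions (s+1), (s+2) and (s+3) occupy phases (iii), (ii) and (i) respectively.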
Problems can arise, however, in connection with pipelined execution of instructions. During processing, unusual conditions variously known as "faults," "traps" or "exceptions" (generally "exceptions") may be encountered which need to be handled. A number of types of exceptions may arise during processing. The specific types of exceptions are generally determined by the particular architecture which defines the operation of a particular type of processor; the types of exceptions that are provided for one particular type of microprocessor, which is constructed in accordance with the SPARC Version 9 architecture, are described in SPARC International, Inc. [David L. Weaver and Tom Germond (eds)], The SPARC Architecture Manual Version 9 (Prentice-Hall, 1994) (hereinafter referred to as "the SPARC Architecture Manual, Version 9"), chapter 7.
When a processor detects an exception in connection with processing of an instruction, it calls an exception handler to process the exception, that is, to perform selected operations as required by the exceptional condition. Two general methodologies have been developed for handling exception conditions. In one methodology, which is representative of computers whose processors follow a "precise" exception handling model, if an exception condition is detected during processing of an instruction, the exception handler is invoked immediately following operations performed for the instruction. On the other hand, in a second methodology, which is representative of processors whose architectures specify a "delayed" exception handling model, if an exception is detected during processing of an instruction, the exception handler is not invoked until some point after the processor has sequenced to processing an instruction after the instruction for which the exception was indicated. Some architectures make use of both the precise and delayed exception handling models for different types of exceptions.
In both methodologies, the processor will generally need to retain certain exception status information, perhaps for some time, after the instruction for which an exception condition is detected so that, when the exception handler is invoked, it has the information which it needs to process the exception. One benefit of handling exception conditions according to the precise exception handling model is that, since the exception handler is processed immediately after the processing of the instruction which gave rise to the exception condition, the exception status information needed by the exception handler will be directly available and need not be saved beyond processing for the program instruction which gave rise to the exception condition.
With exceptions handled according to the delayed exception handling model, however, the processor will generally need to ensure that certain exception status information be retained, perhaps for some time, after the floating point instruction for which an exception condition is detected so that, if the exception handler is eventually invoked, it has the information which it needs to process the exception. The instruction which causes an exceptional condition which is handled according to the delayed exception handling model casts a "trap shadow" over subsequent instructions in the instruction stream, the trap shadow representing the time from the execution phase of an instruction to the point by which, if an exception condition were to occur in connection with processing of the instruction, the exception condition would actually have occurred. Otherwise stated, if an exception condition is detected in connection with processing of an instruction, the exception condition will have been detected by the end of the trap shadow. Different types of instructions can have trap shadows of different lengths, but generally the length of an instruction's trap shadow can be predicted a priori based on the type of processing operation to be performed while the instruction is being executed.
To accommodate pipelining in such an environment, the programmer (in the case of a program written in assembly language) or compiler (in the case of a program written in a high-level language), will need to ensure that instructions in the instruction stream which follow an instruction which may cause a trap will obey certain restrictions, which will allow the trap handler, when called, to identify the particular instruction which gave rise to the exception condition, handle the exception condition by emulating the instruction or otherwise compensate for the exception condition, and then re-execute the remaining instructions of the trap shadow. The restrictions include such rules as:
(i) no instruction in the trap shadow may enable data to be stored in a register that is read in response to another instruction in the trap shadow; that is, no instruction in the trap shadow may have a destination identifier which identifies a register that is also identified by the source operand of another instruction in the trap shadow;
(ii) no two instructions in the trap shadow may enable data to be written to the same register; that is, no two instructions in the trap shadow may have destination identifiers which identify the same register; and
(iii) no instruction in the trap shadow may enable a jump, branch, or similar discontinuity in the processing of the series of instructions in the instruction stream.
Generally, when the computer begins processing the exception handler, it will store the state of the computer at the point in time at which the exception handler begins, which will, in a pipelined computer, reflect at least some operations that are performed in executing instructions in the various stages of the pipeline after the instruction that gave rise to the exception condition and before the exception handler is called. When the exception handler finishes handling the exception, it will effectively retrace the operations performed in processing the instructions in the pipeline so that it will be able to place the computer in an appropriate state so that it can resume processing the instructions in the instruction stream. These restrictions will ensure that the exception handler will be able to return the computer to the appropriate state.
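By way of illustration only, restrictions (i) through (iii) above lend themselves to a mechanical check of the kind a compiler might perform. The following is a minimal sketch, not a description of any actual compiler; the `Instr` record, its field names, and the `shadow_violations` function are hypothetical, and the list passed in is assumed to contain exactly the instructions lying within one trap shadow.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record; field names are illustrative only."""
    opcode: str
    dest: Optional[str] = None       # destination register, if any
    sources: Tuple[str, ...] = ()    # source operand registers
    is_discontinuity: bool = False   # jump, branch, or similar


def shadow_violations(shadow: List[Instr]) -> List[str]:
    """Return one message per violated restriction among the instructions
    assumed to lie within a single trap shadow."""
    violations = []
    for a_idx, a in enumerate(shadow):
        # Restriction (iii): no discontinuity inside the trap shadow.
        if a.is_discontinuity:
            violations.append(f"(iii) {a.opcode} is a discontinuity")
        if a.dest is None:
            continue
        for b_idx, b in enumerate(shadow):
            if a_idx == b_idx:
                continue
            # Restriction (i): a destination register must not be a source
            # operand of another instruction in the shadow.
            if a.dest in b.sources:
                violations.append(
                    f"(i) {a.dest} written by {a.opcode}, read by {b.opcode}")
            # Restriction (ii): no two instructions may share a destination;
            # each pair is reported once.
            if a_idx < b_idx and b.dest == a.dest:
                violations.append(
                    f"(ii) {a.dest} written by {a.opcode} and {b.opcode}")
    return violations
```

An empty result indicates that, under the stated assumptions, the instructions in the shadow observe all three restrictions; each returned message begins with the number of the restriction that is violated.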
The third restriction, that is, that no instruction in the trap shadow may enable a discontinuity in the processing of the series of instructions in the instruction stream, requires that an assembly language programmer (if the program is being written in the assembly language of the particular processor on which the program is to be processed) or the compiler (if the program is being written in a high-level language) ensure that trap shadows of instructions that might cause exceptions are cut off before an instruction is executed which can result in a discontinuity in instruction processing. This can present a particular problem in connection with loops which are commonly used. In a loop, a particular instruction sequence can be executed a number of times, the number depending on detection of a selected processing condition. Generally, the end of the instruction sequence is a conditional branch instruction which determines whether the processing condition has occurred. If the processing condition has not occurred, the conditional branch instruction enables a branch to occur which enables processing beginning with the start of the instruction sequence. On the other hand, if the processing condition has occurred, the conditional branch instruction enables the next instruction in the instruction stream after the loop to be processed. Loops are commonly encountered in an instruction stream, and the third restriction can result in a significant problem, particularly in short loops where an instruction's trap shadow may extend to and beyond the conditional branch instruction.
To accommodate the restrictions noted above, particularly the third restriction, some architectures provide for a particular instruction, called a "trap barrier" instruction, which serves to cut off trap shadows. The programmer or the compiler can insert the trap barrier instruction into the instruction stream at some point after respective instructions which can result in exceptional conditions, and prior to an instruction which causes a discontinuity in processing of the instructions in the instruction stream, to ensure that, in particular, the third restriction above is observed. Problems arise, however, in connection with use of the trap barrier instruction. The trap barrier instruction effectively stalls processing of instructions in the pipeline after the trap barrier instruction, including the instruction which causes the discontinuity as well as instructions subsequent thereto, until the processor has completed processing of the instructions preceding the trap barrier instruction, at least to a point at which it (that is, the processor) can determine whether an exceptional condition will be detected in connection with processing of instructions preceding the trap barrier instruction, that is, until the end of the trap shadows of all of the instructions preceding the trap barrier instruction in the pipeline. Since each trap barrier instruction in the instruction stream "stalls" the pipeline whether or not an exceptional condition is detected, it can result in a significant decrease in the processing performance that may otherwise be provided by the pipelining of the processor, particularly in connection with loops and other discontinuities in the instruction stream.
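The cost of the stall in a short loop can be estimated with a simplified model. The sketch below is illustrative only and rests on assumptions not drawn from any particular architecture: one instruction is issued per time step, a trap shadow is measured in time steps from the issue of its instruction, and a single trap barrier is placed immediately after the loop body and before the conditional branch. The function names are hypothetical.

```python
def barrier_stall_steps(issue_step, shadow_length, barrier_step):
    """Time steps a trap barrier reached at barrier_step must wait for the
    trap shadow of an instruction issued at issue_step to end."""
    shadow_end = issue_step + shadow_length
    return max(0, shadow_end - barrier_step)


def loop_stall_steps(shadow_lengths, iterations):
    """Total stall over a loop whose k-th body instruction has a trap shadow
    of shadow_lengths[k] time steps (0 if it cannot cause an exception).

    The barrier is assumed to follow the last body instruction, so it is
    reached at time step len(shadow_lengths) of each iteration, and it
    waits for the longest outstanding shadow.
    """
    barrier_step = len(shadow_lengths)
    per_iteration = max(
        (barrier_stall_steps(k, length, barrier_step)
         for k, length in enumerate(shadow_lengths)),
        default=0,
    )
    # The stall is paid on every iteration, whether or not an exception occurs.
    return per_iteration * iterations
```

In this model, a three-instruction loop body whose second instruction has a six-step trap shadow, `loop_stall_steps([0, 6, 0], 1000)`, stalls four time steps on each of the 1000 iterations, whereas a ten-instruction body whose first instruction has the same shadow, `loop_stall_steps([6] + [0] * 9, 1000)`, does not stall at all, which is consistent with the observation that the penalty falls most heavily on short loops.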