The present invention generally relates to the execution of instructions in computer systems and more particularly to a trampoline mechanism for effecting control flow change in a computer system to emulate a branch, such as a trampoline mechanism employed for recovery of an exception caused by advanced or speculatively executed instructions.
Computer systems include at least one processor and memory. The memory stores program instructions, data, and an operating system. The program instructions can include a compiler for compiling application programs. The operating system controls the processor and the memory for system operations and for executing the program instructions.
A "basic block" is a contiguous set of instructions bounded by branches and/or branch targets, containing no branches or branch targets. This implies that if any instruction in a basic block is executed, all instructions in the basic block will be executed, i.e., the instructions contained within any basic block are executed on an all-or-nothing basis. The instructions within a basic block are enabled for execution when control is passed to the basic block by an earlier branch targeting the basic block ("targeting" as used here includes both explicit targeting via a taken branch as well as implicit targeting via a not taken branch). The foregoing implies that if control is passed to a basic block, then all instructions in the basic block must be executed; if control is not passed to the basic block, then no instructions in the basic block are executed.
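As an illustration of the definition above, the following Python sketch partitions a small instruction list into basic blocks. The instruction encoding (tuples of an opcode and an optional target index) and the function name `split_basic_blocks` are assumptions made for this example only. The sketch follows the common compiler convention in which a branch is the last instruction of the block it terminates.

```python
def split_basic_blocks(instrs):
    """Partition a list of (opcode, target) tuples into basic blocks.

    A new block starts at index 0, at every branch target, and at the
    instruction that follows a branch (the not-taken path).
    """
    leaders = {0}
    for i, (op, target) in enumerate(instrs):
        if op == "br":
            leaders.add(target)          # explicit targeting via a taken branch
            if i + 1 < len(instrs):
                leaders.add(i + 1)       # implicit targeting via a not-taken branch
    ordered = sorted(leaders)
    blocks = []
    for start, end in zip(ordered, ordered[1:] + [len(instrs)]):
        blocks.append(list(range(start, end)))
    return blocks


# Hypothetical six-instruction program; comments give each index.
program = [
    ("add", None),  # 0
    ("br", 4),      # 1: conditional branch to index 4
    ("mul", None),  # 2
    ("sub", None),  # 3
    ("ld", None),   # 4: branch target
    ("st", None),   # 5
]
blocks = split_basic_blocks(program)
```

Each inner list holds the instruction indices of one block; once control enters a block, every index in that list is executed.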
Thus, control flow between basic blocks in computer programs is typically effected with branches. A branch typically evaluates some condition and conditionally changes control flow to a new basic block, based on the outcome of the condition.
In some cases, a branch is very unlikely to be taken, such as when the branch occurs as a result of an error condition which generally does not arise during normal program execution. For such unlikely cases, branches are still typically employed to effect control flow change to a special basic block to handle the error condition.
An alternative approach that has been employed for handling error conditions is for the error condition to cause an interruption. The interruption generally invokes a different control flow mechanism and redirects execution to a special program referred to as a generic interruption handler, which is typically part of an operating system. Typically, the generic interruption handler handles the error condition in one of two ways depending on the type of error condition.
In the first scenario, the generic interruption handler responds to the error condition and resolves the problem associated with the error condition. The program which caused the error condition is then restarted at the point where the program encountered the error. In this first scenario, the error condition is generic across programs (e.g., a translation look-aside buffer (TLB) translation miss error condition). Therefore, the resolution of the error condition is performed entirely by the generic interruption handler. The generic interruption handler resolves the error condition in the same way no matter what function the program that encountered the error condition performs.
In the second scenario, the generic interruption handler determines that the program that encountered the error condition is to handle the error condition. Typically, the program provides some information to the generic interruption handler to indicate that the program itself is to handle the error condition. In addition, the program typically indicates which portion of the program should be invoked upon an error condition. After the generic interruption handler determines that the generic interruption handler cannot deal with the error condition itself and that the program that encountered the error condition has indicated that the program is to deal with the error condition, the generic interruption handler typically builds a data structure and invokes the program's own specific handler. The data structure built by the generic interruption handler is typically very large, because the generic interruption handler does not have sufficient information to determine which data items are required to handle the error condition. The error condition is then handled by the program directly. After the program completes the handling of the error condition, the operating system is invoked again to restart the program at the point where the error was encountered. This approach for handling an error condition is referred to as a signal handler approach and the generic interruption handler is referred to as a signal handler.
In summary, when changes in control flow are effected based on conditions which are relatively infrequent and which require specialized handling (i.e., not generic handling), there are two basic approaches that have been employed to effect control flow. In the first approach, a branch is made to the specialized handler portion of the program. In the second approach, a generic signal handler, which is typically part of the operating system, is invoked via an interruption to control flow, and the generic signal handler builds a data structure and invokes the specialized handler portion of the program.
These two basic approaches have generally worked well for simple scalar processors. Either approach generally involves some instruction, such as a conditional branch or some other instruction that causes an interruption based on a condition. In earlier computer systems, the overhead for conditional branch instructions was similar to the overhead for other instructions that cause an interruption based on a condition.
The overhead for branch instructions, however, has increased significantly with present day computer systems and is projected to increase further with future computer systems. In many computer systems, some costly branch mispredictions are avoided when programs indicate that the branch is unlikely and the processor factors this into branch prediction. Nevertheless, as computer systems become wider, the need to execute multiple branches each cycle increases. Executing multiple branches each cycle requires additional resources and/or complicates branch prediction, which decreases the accuracy of the prediction. Therefore, the effective cost of a branch instruction for a relatively rare condition is increasing.
On the other hand, optimizations are now being employed to effectively treat the most-likely path of a program as a single unit. Less-likely paths of the program are treated similarly to error conditions. One advantage of this optimization technique is that it permits much better scheduling of hardware resources on the most-likely program path. This kind of less-likely path "error condition" is infrequent, but not nearly as infrequent as the traditional kind of error condition handled by the signal handler approach. The signal handler approach involves a tremendous amount of overhead and is only effective when the likelihood of the error is not merely small, but minuscule.
In view of the above, there is a need for an improved mechanism for changing control flow without branches for infrequent, but not ultimately rare, situations. A computer system employing such an improved control flow change mechanism is desired to permit better code scheduling, to avoid misprediction problems, and to avoid complications resulting from multiple branches per cycle. Furthermore, there is a desire that such an improved control flow change mechanism be much faster, when invoked, than the high overhead signal handler approach, so that the improved control flow change mechanism's performance cost is not significant.
The act of executing, or specifying the execution of, an instruction before control has been passed to the instruction is called "speculation." Speculation performed by the processor at program runtime is called "dynamic speculation" while speculation specified by the compiler is called "static speculation." Dynamic speculation is known in the prior art. While the vast majority of the prior art is not based on, and does not refer to, static speculation, recently some references to static speculation have begun to surface.
Two instructions are called "independent" when one does not require the result of the other; when one instruction does require the result of the other the instructions are called "dependent." Independent instructions may be executed in parallel while dependent instructions must be executed in serial fashion. Program performance is improved by identifying independent instructions and executing as many of them in parallel as possible. Experience indicates that more independent instructions can be found by searching across multiple basic blocks than can be found by searching only within individual basic blocks. However, simultaneously executing instructions from multiple basic blocks requires speculation.
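The distinction between dependent and independent instructions can be sketched as follows. The dictionary representation of an instruction (one destination register, a tuple of source registers) and the helper `depends_on` are hypothetical, and the check covers only the case described above, where one instruction requires the result of another.

```python
def depends_on(later, earlier):
    """Return True if `later` reads the register that `earlier` writes."""
    return earlier["dst"] in later["srcs"]


i1 = {"dst": "r1", "srcs": ("r2", "r3")}   # r1 = r2 + r3
i2 = {"dst": "r4", "srcs": ("r1", "r5")}   # r4 = r1 * r5, needs the result of i1
i3 = {"dst": "r6", "srcs": ("r7", "r8")}   # r6 = r7 - r8, unrelated to i1

# i2 must wait for i1; i1 and i3 could be executed in parallel.
i2_dependent_on_i1 = depends_on(i2, i1)
i1_i3_independent = not depends_on(i3, i1) and not depends_on(i1, i3)
```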
Identifying and scheduling independent instructions, and thereby increasing performance, is one of the primary tasks of compilers and processors. The trend in compiler and processor design has been to increase the scope of the search for independent instructions in each successive generation. In prior art instruction sets, an instruction that may generate an exception cannot be speculated by the compiler since, if the instruction causes an exception, the program may exhibit erroneous behavior. This restricts the useful scope of the compiler's search for independent instructions and makes it necessary for speculation to be performed at program runtime by the processor via dynamic speculation. However, dynamic speculation entails a significant amount of hardware complexity that increases exponentially with the number of basic blocks over which dynamic speculation is applied; this places a practical limit on the scope of dynamic speculation. By contrast, the scope over which the compiler can search for independent instructions is much larger, potentially the entire program. Furthermore, once the compiler has been designed to perform static speculation across a single basic block boundary, very little additional complexity is incurred by statically speculating across several basic block boundaries.
If static speculation is to be undertaken, then several problems must be solved, one of the most important of which is the handling of exceptional conditions encountered by statically speculated instructions. Since, as noted above, exceptions on speculative instructions cannot be delivered at the time of execution of the instructions, a compiler-visible mechanism is desired to defer the delivery of the exceptions until control is passed to the basic block from which the instructions were speculated (known as the "originating basic block"). Mechanisms that perform a similar function exist in the prior art for deferring and later delivering exceptions on dynamically speculated instructions. However, by definition the mechanisms are not visible to the compiler and therefore cannot be manipulated by the compiler into playing a role in compiler-directed speculation. No known method or apparatus for deferring and later delivering fatal and non-fatal exceptions on statically speculated instructions has been enabled in the prior art. Limited forms of static speculation do exist in the prior art, however: (1) the forms do not involve deferral and later recovery of exceptional conditions, and (2) the forms do not enable static speculation over the breadth and scope of the present invention.
Therefore, when undertaking static speculation, there is a need in the art for a mechanism to handle exceptions on speculative instructions such that any side effects of the speculative instructions are not visible to the programmer. Further, the mechanism should apply to as many forms of static speculation as possible.
There is also a need for a mechanism to achieve higher performance in computer systems by enabling execution of as many independent instructions in parallel as possible. This is desirable even when there is a possibility that a second instruction, as well as a calculation dependent thereon, may operate upon data that can be dependent upon the execution of a first instruction.
There is also a need for an improved mechanism in computer systems for recovery of an exception caused by advanced or speculatively executed instructions which includes an improved mechanism for effecting control flow change which does not involve branches to allow for better code scheduling, to avoid misprediction problems, and to avoid complications resulting from multiple branches per cycle. Such an improved recovery mechanism is desired which is relatively fast, when invoked, so that it improves performance of the computer system's recovery from the exception caused by advanced or speculatively executed instructions.
The present invention provides a method and a computer system including a memory and a processor. The memory stores a program and an interruption handler. The program has instructions including a trampoline check instruction and a special handler for handling an interruption. The processor executes the program and the interruption handler. The processor includes an instruction pointer indicating a memory location of a current executing instruction. The processor executes the trampoline check instruction which tests a condition and if the condition is true, causes the interruption and supplies an address displacement. The interruption handler responds to the interruption and restarts execution of the program at a restart point indicating a memory location of the special handler. The restart point is equal to a sum of the address displacement and a value of the instruction pointer at the time of the interruption.
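The control flow described above can be modeled in a few lines of Python. This is only a software analogy under stated assumptions: a Python exception stands in for the hardware interruption, and the names `TrampolineInterruption`, `trampoline_check`, `interruption_handler`, and `run`, the example addresses, and the 16-byte instruction size are all hypothetical.

```python
class TrampolineInterruption(Exception):
    """Stands in for the interruption raised by a trampoline check."""

    def __init__(self, displacement):
        super().__init__(displacement)
        self.displacement = displacement


def trampoline_check(condition, displacement):
    """Test a condition; if true, interrupt and supply the displacement."""
    if condition:
        raise TrampolineInterruption(displacement)


def interruption_handler(ip_at_interruption, interruption):
    """Return the restart point: instruction pointer plus displacement."""
    return ip_at_interruption + interruption.displacement


def run(condition):
    ip = 0x1000                # hypothetical address of the check instruction
    instruction_size = 16      # hypothetical size, used for the fall-through
    try:
        trampoline_check(condition, displacement=0x240)
    except TrampolineInterruption as intr:
        # Execution restarts at the special handler.
        return interruption_handler(ip, intr)
    # Condition false: normal control flow continues.
    return ip + instruction_size
```

With these example values, a true condition restarts execution at `0x1000 + 0x240`, while a false condition simply falls through to the next instruction.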
In one embodiment, the interruption handler adds the address displacement to the value of the instruction pointer at the time of the interruption to obtain the restart point in response to the interruption. In an alternative embodiment, the processor includes hardware for adding the address displacement to the value of the instruction pointer at the time of the interruption to obtain the restart point and an interruption control register for capturing the restart point. In the alternative embodiment, the interruption handler obtains the restart point from the interruption control register in response to the interruption.
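The two embodiments differ only in where the addition is performed. The sketch below models that difference; the `ControlRegisters` class, its field names, and the function names are assumptions made for illustration and do not correspond to any actual register layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlRegisters:
    """Hypothetical state captured by the processor at the interruption."""
    ip_at_interruption: int
    displacement: Optional[int] = None    # captured when software does the add
    restart_point: Optional[int] = None   # captured when hardware does the add


def interrupt_software_add(ip, displacement):
    # First embodiment: the processor captures only the raw displacement.
    return ControlRegisters(ip_at_interruption=ip, displacement=displacement)


def interrupt_hardware_add(ip, displacement):
    # Alternative embodiment: hardware adds and captures the restart point.
    return ControlRegisters(ip_at_interruption=ip,
                            restart_point=ip + displacement)


def handler_restart_point(regs):
    """Interruption handler: obtain the restart point in either embodiment."""
    if regs.restart_point is not None:
        return regs.restart_point                       # read it directly
    return regs.ip_at_interruption + regs.displacement  # compute in software


sw = handler_restart_point(interrupt_software_add(0x2000, 0x80))
hw = handler_restart_point(interrupt_hardware_add(0x2000, 0x80))
```

Both paths produce the same restart point; the hardware variant saves the handler one addition at the cost of an adder in the processor.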
In one embodiment, the processor includes an interruption control register for capturing the address displacement. In one embodiment, the memory stores an interruption vector table including an interruption vector for supplying information related to the interruption. The interruption vector is preferably not shared with other interruptions so that the interruption handler does not require additional decoding software to determine the cause of the interruption. In one embodiment, the memory stores an operating system for controlling the processor and the memory and the interruption handler is part of the operating system.
In one embodiment, if the condition is false, normal control flow of the program is continued.
In one embodiment, the processor executes the special handler of the program to handle the interruption. After the special handler handles the interruption, the processor executes a branch instruction to branch back to a portion of the program that was executing at the time of the interruption.
In one embodiment, the program instructions further include a store instruction and a load instruction that is scheduled before the store instruction. The condition is true if the store instruction and the load instruction access a common location in the memory. In one embodiment, the program instructions further include at least one calculation instruction that is dependent on data read by the load instruction. The at least one calculation instruction is scheduled ahead of the store instruction. The special handler comprises recovery code including code for re-execution of the load instruction and the at least one calculation instruction.
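The load/store embodiment above can be sketched as follows. The dictionary used as memory, the function name `run_with_advanced_load`, and the `+ 1` calculation are hypothetical. The sketch also models the trampoline check as an ordinary `if`, whereas the described mechanism would raise an interruption rather than execute a branch.

```python
def run_with_advanced_load(memory, load_addr, store_addr, store_value):
    """Execute a load hoisted above a store, then check and recover."""
    # Load scheduled before the store, plus a calculation that depends on it.
    loaded = memory[load_addr]
    result = loaded + 1

    # The store that preceded the load in original program order.
    memory[store_addr] = store_value

    # Trampoline check: true if the load and the store accessed a common
    # memory location, in which case the early load read stale data.
    recovered = False
    if load_addr == store_addr:
        # Recovery code (the special handler): re-execute the load and the
        # dependent calculation, now observing the stored value.
        loaded = memory[load_addr]
        result = loaded + 1
        recovered = True
    return result, recovered


no_conflict = run_with_advanced_load({0: 10, 1: 20}, load_addr=0,
                                     store_addr=1, store_value=99)
conflict = run_with_advanced_load({0: 10, 1: 20}, load_addr=0,
                                  store_addr=0, store_value=99)
```

In the common case the addresses differ and the recovery code never runs, so the hoisted load and calculation cost nothing extra on the likely path.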
In one embodiment, the program instructions further include a first instruction and a second instruction. The second instruction is scheduled ahead of the first instruction. The condition is true if the second instruction operates upon data that is dependent upon the execution of the first instruction. The special handler comprises recovery code including code for re-execution of the second instruction.
In one embodiment, the program instructions further include at least one instruction marked as speculative. The condition is true if integrity of execution of the at least one instruction marked as speculative is not verified. The special handler comprises recovery code including code for re-execution of the at least one instruction marked as speculative.
In one embodiment, the program instructions are organized in a plurality of basic blocks, each basic block including a set of contiguous instructions. The program instructions include a first instruction that is associated with a first basic block and is capable of generating an exception during execution of the program. The first instruction is scheduled outside of the first basic block and ahead of at least one instruction that precedes the first basic block. The condition is true if the first instruction generated an exception. In one embodiment, the trampoline check instruction is scheduled within the first basic block. The special handler comprises recovery code including code for re-execution of the first instruction.
In one embodiment, the program instructions further include a first speculative instruction that is capable of experiencing an instruction exception condition during execution of the first speculative instruction. The first speculative instruction defers signaling an instruction exception when the instruction exception condition is initially detected and completes execution without signaling the instruction exception. The condition is true if the instruction exception was detected during execution of the first speculative instruction. The special handler comprises recovery code including code for re-execution of the first speculative instruction.
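A software analogy of deferred exception signaling is sketched below. The `DEFERRED` token, the function names, and the use of integer division as the excepting operation are all assumptions chosen for illustration; a Python `if` again stands in for the trampoline check.

```python
DEFERRED = object()  # hypothetical token recording a deferred exception


def speculative_divide(a, b):
    """Speculative form: complete without signaling, record the condition."""
    try:
        return a // b
    except ZeroDivisionError:
        return DEFERRED


def compute(a, b, take_path):
    # The division is executed early, before it is known whether the
    # block it came from will be reached.
    value = speculative_divide(a, b)

    if not take_path:
        # The originating block is never reached, so a deferred
        # exception is never delivered.
        return None

    # Trampoline check in the originating block: true if an exception
    # condition was detected during the speculative execution.
    if value is DEFERRED:
        # Recovery code: re-execute non-speculatively, which now
        # signals the exception at the correct point in the program.
        value = a // b
    return value
```

For example, `compute(1, 0, False)` returns `None` without raising, while `compute(1, 0, True)` raises `ZeroDivisionError` only once the originating block is actually entered.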
The computer system of the present invention includes a trampoline mechanism for changing control flow to emulate a branch instruction. The trampoline mechanism permits better code scheduling, avoids branch misprediction problems, and avoids complications resulting from multiple branches per cycle. The trampoline mechanism according to the present invention is much faster, when invoked, than the high overhead signal handler approach, so the trampoline mechanism's performance cost is not significant. Therefore, the trampoline mechanism according to the present invention is quite useful for handling conditions which are infrequently true, but are not almost never true. Moreover, the trampoline mechanism according to the present invention can be employed for recovery of an exception caused by advanced or speculatively executed instructions.