This invention relates generally to the field of computer software and, more particularly, to a fast conditional thunk utility employing an assembler-level direct-branch thunk technique.
A conditional thunk layer is a software operation that assists in an associated function call, typically by making a decision that affects the function call. In general, a conditional thunk layer immediately precedes its associated function call and makes a decision based on the status of a condition. The result of this decision affects the associated function call, typically by determining which version of the function to call.
For example, a software system may include more than one function that may be called to produce a desired result. In this case, the thunk layer is a precursor operation that determines which function to call. In another example, the software system includes a function that may be called with more than one set of arguments to produce a desired result. In this case, the thunk layer is a precursor operation that determines which arguments to include in the function call.
A first conventional approach implemented in the OFFICE 97 suite of programs, distributed by MICROSOFT Corporation, performs conditional thunk decisions through a "C-level" conditional thunk function. This type of thunk layer is referred to as a "thunk function" to indicate that the thunk decision is implemented through a "C-level" function call. Because the thunk function immediately precedes its associated application program interface (API) function call, this approach results in a duplication of function calls.
Moreover, the added thunk function call alters the parameter passing utility, such as the stack, from the precise condition desired for the ensuing API function call. For this reason, the parameter passing utility must be returned to the condition desired for the API function call after the thunk function call. For a non-thunked function call requiring "X" instructions, this duplication of stack conditioning burdens the processor with "2X+2" instructions for a conditionally-thunked version of the function call. The OFFICE 97 solution checks the thunk condition for each API function called and, thus, is useful in circumstances in which the thunk condition can vary relatively frequently while the client program is running.
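As a rough illustration of the C-level thunk function approach described above, the following sketch shows a thunk that checks its condition on every call and then makes a second, ordinary function call. All identifiers (`g_thunk_needed`, `api_native`, `api_thunked`, `api_thunk`) and the 16-bit argument conversion are hypothetical and are not taken from the OFFICE 97 code.

```c
#include <assert.h>

/* Hypothetical thunk condition; in practice it would reflect host state. */
int g_thunk_needed = 0;

/* Non-thunked version of a hypothetical API function. */
int api_native(int a, int b) { return a + b; }

/* Thunked version: converts its arguments, then forwards them.
 * Truncation to 16 bits is only an example of such a conversion. */
int api_thunked(int a, int b) { return api_native(a & 0xFFFF, b & 0xFFFF); }

/* C-level thunk function: the caller sets up (a, b) for this call, and
 * the thunk sets them up again for whichever version it selects. */
int api_thunk(int a, int b)
{
    assert(g_thunk_needed == 0 || g_thunk_needed == 1);
    return g_thunk_needed ? api_thunked(a, b) : api_native(a, b);
}
```

In this sketch the arguments are prepared once for `api_thunk` and again for the selected version, which corresponds to the duplicated stack conditioning attributed to this approach.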
A second conventional approach implemented in the ACCESS 97 program, also distributed by MICROSOFT Corporation, performs conditional thunk decisions by loading a look-up table with a pointer to a memory address for each conditionally-thunked API function call. Each entry in the table points to the appropriate thunk code for the function call. That is, the thunk conditions are checked at start up or when the ACCESS 97 program is loaded, and the look-up table is populated with the appropriate pointers for the API function calls. The look-up table may be subsequently refreshed as needed to account for changes in the thunk conditions. Thus, the ACCESS 97 solution is typically used in circumstances in which the thunk conditions can be expected to change less frequently than the look-up table will be referenced. For example, the ACCESS 97 approach is particularly advantageous when the thunk conditions are invariant, or vary relatively infrequently, while the client program is running.
Because the look-up table is populated with pointers to API functions, this type of branch is referred to as an "unconditional indirect" branch. The branch is "unconditional" because the look-up table contains only one possible target location for the jump. Thus, a condition does not have to be checked to select among alternative target locations for the jump at the time the branch is encountered. The branch is "indirect," however, because the look-up table must be referenced to determine the target location for the jump. In other words, a pointer must be "indirected" or "de-referenced" to determine the branch target.
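The look-up table approach can be sketched in C with an array of function pointers that is filled once and then de-referenced on each call. This is only an illustrative model. The table layout, the function names, and the single `thunk_needed` flag are assumptions and do not reflect the actual ACCESS 97 implementation.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*api_fn)(int);

/* Hypothetical indices for two conditionally-thunked API functions. */
enum { API_OPEN, API_CLOSE, API_COUNT };

/* Hypothetical thunked and non-thunked versions of each function. */
int open_native(int h)   { return h + 1; }
int open_thunked(int h)  { return h + 1001; }
int close_native(int h)  { return h - 1; }
int close_thunked(int h) { return h - 1001; }

/* Look-up table held in ordinary data memory. */
api_fn g_lookup[API_COUNT];

/* Check the condition once (for example at start up) and store pointers.
 * Calling this again refreshes the table after a condition change. */
void populate_lookup(int thunk_needed)
{
    g_lookup[API_OPEN]  = thunk_needed ? open_thunked  : open_native;
    g_lookup[API_CLOSE] = thunk_needed ? close_thunked : close_native;
}

/* Unconditional indirect branch: no condition is tested here, but the
 * pointer must be read from the table before the target is known. */
int call_api(int id, int arg)
{
    assert(id >= 0 && id < API_COUNT && g_lookup[id] != NULL);
    return g_lookup[id](arg);
}
```

A caller would invoke `populate_lookup` once and then route every API call through `call_api`, so the condition check is paid only when the table is filled.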
The ACCESS 97 approach encounters drawbacks because typical processors, such as the INTEL PENTIUM PRO processor and related models, handle certain indirect branches inefficiently. More specifically, the processor cannot always determine where the branch will lead (i.e., the instruction at the target address of the indirect branch) until it reads the entry in the look-up table, which must be loaded from memory. This is because the processor utilizes a Branch Target Buffer (BTB) that includes only 512 entries. Although the processor may reference the BTB to correctly predict the target address for an encountered branch, the BTB only stores the target addresses for the most recent 512 branches encountered. The processor has no mechanism to correctly predict the target address for an indirect unconditional branch that is not among the most recent 512 branches encountered.
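The capacity limit described above can be illustrated with a toy software model of a 512-entry buffer that remembers the most recently encountered branches. This is a deliberate simplification based only on the description in the text. It is not a model of the actual PENTIUM PRO hardware, and every name in it (`btb_entry`, `btb_predict`, `btb_record`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BTB_ENTRIES 512

typedef struct {
    uint32_t branch;  /* address of the branch instruction */
    uint32_t target;  /* last known target of that branch  */
    uint64_t stamp;   /* recency counter, larger is newer  */
    int      valid;
} btb_entry;

btb_entry g_btb[BTB_ENTRIES];
uint64_t  g_clock = 0;

/* Returns 1 and writes *target when the branch is tracked, else 0. */
int btb_predict(uint32_t branch, uint32_t *target)
{
    assert(target != NULL);
    for (int i = 0; i < BTB_ENTRIES; i++) {
        if (g_btb[i].valid && g_btb[i].branch == branch) {
            g_btb[i].stamp = ++g_clock;   /* refresh recency on a hit */
            *target = g_btb[i].target;
            return 1;
        }
    }
    return 0;
}

/* Records a resolved branch, evicting the least recently used entry
 * when all 512 slots are occupied. */
void btb_record(uint32_t branch, uint32_t target)
{
    int slot = -1;
    for (int i = 0; i < BTB_ENTRIES; i++) {         /* already tracked? */
        if (g_btb[i].valid && g_btb[i].branch == branch) { slot = i; break; }
    }
    if (slot < 0) {                                  /* free or oldest slot */
        slot = 0;
        for (int i = 0; i < BTB_ENTRIES; i++) {
            if (!g_btb[i].valid) { slot = i; break; }
            if (g_btb[i].stamp < g_btb[slot].stamp) slot = i;
        }
    }
    g_btb[slot] = (btb_entry){ branch, target, ++g_clock, 1 };
}
```

In this model, a program that cycles through more than 512 distinct branches keeps evicting entries before they are reused, so `btb_predict` misses for those branches.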
For relatively large program modules, such as MICROSOFT OFFICE, branches not reflected in the BTB may be encountered frequently. For these branches, the look-up table must be referenced to determine the correct target address for the branch. While the entry is being read, the processor typically pre-processes future instructions using a prediction of the next instruction. As it turns out, this prediction is almost always wrong for branches that are not reflected in the BTB. This is because the next instruction will be located at the address from the look-up table, which has not yet been read. The processor's pre-processing algorithm, on the other hand, usually predicts the next instruction to be the instruction immediately following the indirect branch instruction. This prediction is almost always incorrect because, if this were not the case, the look-up operation to determine the target address of the indirect branch would not have been necessary in the first place. The time lost between an incorrect prediction, which causes the processor to perform unwanted instructions, and the successful execution of the instruction at the target location is sometimes referred to as a "processor stall."
Thus, there is a need for a conditional thunk methodology that avoids duplicative stack manipulation, such as that produced by using a "C-level" function call to implement a conditional thunk routine. There is a further need in the art for a conditional thunk methodology that avoids an indirect branch, such as a reference to an address in a look-up table stored in system memory, to determine the target address of a conditional thunk routine.
The present invention meets the needs described above in a fast conditional thunk utility. The improved conditional thunk layer is referred to as a "utility" to indicate that, unlike certain previous conditional thunk layers, it is not implemented using a "C-level" function call. Nor is the conditional thunk layer implemented using a look-up table stored in system memory. Instead, the invention implements a conditional thunk utility through an assembler-level direct-branch technique.
The fast conditional thunk utility is typically employed when a client program, such as an application program, accesses an application program interface (API) exposed by a host program, such as the operating system. The primary advantage of the invention is a significant improvement in the processing speed or performance of the conditional thunk decision. Another advantage is a significant reduction in the size of the code required to implement the conditional thunk layer, which reduces the memory requirement of the client program.
In a first configuration, referred to as a "condition-check" alternative, the conditional thunk utility performs a condition check followed by a direct-branch jump. Because the condition-check methodology is implemented using an assembler-level direct-branch technique, the conditional thunk utility does not utilize the parameter passing utility, typically the stack, to queue the arguments of a function call. Thus, the stack is not altered from its desired condition just prior to executing the API function call. The condition-check alternative checks the thunk condition for each function call and, for this reason, may be used when the thunk condition can vary relatively frequently while the host computer system is running.
In a second configuration, referred to as a "jump-table" alternative, the conditional thunk utility performs an assembler-level jump table check followed by a direct jump to a target address. The jump-table alternative retrieves the target address for the conditional thunk decision from the instruction cache (I-cache). As a result, this approach avoids incorrect prediction of pre-processing instructions executed while a jump address is retrieved from system memory, which caused processor stalls in previous conditional thunk layers. Because the thunk conditions are checked and the jump table is configured in advance, the jump-table alternative does not require a thunk condition check before each jump. For this reason, the jump-table alternative may be used for thunk conditions that are invariant, or vary relatively infrequently, while the client program is running.
Generally described, the invention is a method for processing a function call in which a client program generates a function call associated with a conditionally thunked function to a host program. In connection with the function call, the client program pushes parameters onto a parameter passing utility. Without altering the condition of the parameter passing utility, the client program invokes an assembler-level direct-branch conditional thunk utility that selects a version of the function based on a condition of a host computer system. The client program then calls the selected version of the function. The host program reads the parameters from the parameter passing utility and performs the selected version of the function using the parameters read from the parameter passing utility.
According to an aspect of the invention, the conditional thunk utility selects a version of the function by checking a thunk condition. If the thunk condition is satisfied, the conditional thunk utility jumps to an address corresponding to a thunked version of the function. Alternatively, if the thunk condition is not satisfied, the conditional thunk utility jumps to an address corresponding to a non-thunked version of the function.
According to another aspect of the invention, before the client program receives a conditionally thunked function call, it checks a plurality of thunk conditions to obtain a thunk condition result for a number of functions. The client program then loads an assembler-level jump table with a target address for each function. In this jump table, the value of the target address associated with a particular function depends on the thunk condition result for that function. Upon invocation by the client program, the conditional thunk utility jumps to a previously-selected version of a called function by obtaining a target address associated with the function from the jump table located in an instruction cache memory. The conditional thunk utility then jumps to that target address.
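One way to picture loading an assembler-level jump table is to write one direct jump instruction per function into a table of code bytes. The sketch below assumes the 32-bit x86 near-jump encoding (opcode `E9` followed by a little-endian 32-bit displacement measured from the end of the instruction). The text above does not specify an encoding, so this choice, the 5-byte slot layout, and all names (`thunk_choice`, `encode_jmp_rel32`, `load_jump_table`) are assumptions for illustration. The code only fills a byte buffer. Placing the bytes in executable memory is outside its scope.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_SLOT_SIZE 5   /* opcode E9 plus a 32-bit displacement */

/* Per-function result of the thunk condition check (hypothetical). */
typedef struct {
    uint32_t native_addr;   /* address of the non-thunked version */
    uint32_t thunked_addr;  /* address of the thunked version     */
    int      thunk_needed;  /* thunk condition result             */
} thunk_choice;

/* Writes "jmp target" into slot, assuming the slot runs at slot_addr. */
void encode_jmp_rel32(uint8_t *slot, uint32_t slot_addr, uint32_t target)
{
    /* Displacement is relative to the next instruction (mod 2^32). */
    uint32_t rel = target - (slot_addr + JMP_SLOT_SIZE);
    slot[0] = 0xE9;
    slot[1] = (uint8_t)(rel);
    slot[2] = (uint8_t)(rel >> 8);
    slot[3] = (uint8_t)(rel >> 16);
    slot[4] = (uint8_t)(rel >> 24);
}

/* Fills one slot per function with a direct jump to the version
 * selected by that function's thunk condition result. */
void load_jump_table(uint8_t *table, uint32_t table_addr,
                     const thunk_choice *choices, size_t count)
{
    assert(table != NULL && choices != NULL);
    for (size_t i = 0; i < count; i++) {
        uint32_t slot_addr = table_addr + (uint32_t)(i * JMP_SLOT_SIZE);
        uint32_t target = choices[i].thunk_needed ? choices[i].thunked_addr
                                                  : choices[i].native_addr;
        encode_jmp_rel32(table + i * JMP_SLOT_SIZE, slot_addr, target);
    }
}
```

Because each slot holds an instruction with its target embedded in it, the target is fetched as part of the instruction stream rather than read from a separate data table.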
In addition, checking a thunk condition may include determining whether the host computer system includes a predefined type of processor. Alternatively, checking a thunk condition may include determining whether the host computer system includes a predefined type of operating system. Or checking a thunk condition may include determining the status of a dynamic parameter of the host computer system, such as the status of the dynamic parameter that indicates whether a predefined dynamic link library (DLL) is resident in a random access memory component of the host computer system. In general, the thunk decision may be based on a wide variety of conditions, such as the type of printer installed, the settings of a printer or display driver, the type or state of a particular object, the state of user-defined program settings, whether certain program modules are installed or loaded in system memory, the keyboard language being used, and so forth.
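The kinds of thunk conditions listed above can be modeled as predicates over a snapshot of host state. The following is a minimal sketch under stated assumptions. The `host_state` fields, the enumerator names, and the convention that a true predicate means "use the thunked version" are all hypothetical and are not defined by the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical processor and operating system categories. */
typedef enum { CPU_TYPE_A, CPU_TYPE_B } cpu_type;
typedef enum { OS_TYPE_A, OS_TYPE_B } os_type;

/* Hypothetical snapshot of the host computer system. */
typedef struct {
    cpu_type cpu;           /* static: type of processor            */
    os_type  os;            /* static: type of operating system     */
    bool     dll_resident;  /* dynamic: is a given DLL loaded now?  */
} host_state;

/* A thunk condition is a predicate over the host state. */
typedef bool (*thunk_condition)(const host_state *);

bool cond_cpu_is_type_b(const host_state *h) { return h->cpu == CPU_TYPE_B; }
bool cond_os_is_type_b(const host_state *h)  { return h->os == OS_TYPE_B; }
bool cond_dll_resident(const host_state *h)  { return h->dll_resident; }

/* Evaluates one condition; by assumption, true selects the thunked version. */
bool thunk_required(thunk_condition cond, const host_state *h)
{
    assert(cond != NULL && h != NULL);
    return cond(h);
}
```

Static conditions such as the processor type would typically be evaluated once, while a dynamic condition such as `cond_dll_resident` may need to be re-evaluated as the host state changes.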
That the invention improves over the drawbacks of prior conditional thunk layers and how it accomplishes the advantages described above will become apparent from the following detailed description of the exemplary embodiments and the appended drawings and claims.