Code instrumentation is a method for analyzing and evaluating program code performance. In one approach to code instrumentation, new instructions (or probe code) are added to the program, and, consequently, the original code in the program is changed and/or relocated. Some examples of probe code include adding values to a register, moving the content of one register to another register, moving the address of some data to some registers, etc. The changed and/or relocated code is referred to as instrumented code or, more generally, as an instrumented process. For purposes of the present discussion, instrumented code is one type of dynamically generated code. Although the following discussion explicitly recites and discusses code instrumentation, such discussion and examples are for illustration only. That is, the following discussion also applies to various other types of dynamically generated code.
One specific type of code instrumentation is referred to as dynamic binary instrumentation. Dynamic binary instrumentation allows program instructions to be changed on-the-fly. Measurements such as basic-block coverage and function invocation counting can be accurately determined using dynamic binary instrumentation. Additionally, dynamic binary instrumentation, as opposed to static instrumentation, is performed at run-time of a program and only instruments those parts of an executable that are actually executed. This minimizes the overhead imposed by the instrumentation process itself. Furthermore, performance analysis tools based on dynamic binary instrumentation require no special preparation of an executable such as, for example, a modified build or link process.
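As an illustration only, the run-time, on-demand nature of dynamic instrumentation can be sketched with a toy model. The sketch below is not binary instrumentation: the block names, the instruction strings, and the `run_instrumented` helper are all hypothetical, and a Python dictionary stands in for an executable. It shows only that a probe is added to a basic block the first time that block is executed, so code that is never reached is never instrumented.

```python
from collections import Counter

# Hypothetical toy program: each basic block maps to a list of instruction
# strings and the name of its successor block (None ends the program).
PROGRAM = {
    "entry": (["load r1", "cmp r1, 0"], "loop"),
    "loop": (["add r2, r1"], "exit"),
    "cold": (["call error_handler"], "exit"),  # never reached in this run
    "exit": (["ret"], None),
}


def run_instrumented(program, start):
    """Execute the toy program, instrumenting each block on first use."""
    code_cache = {}     # instrumented copies, created lazily
    counts = Counter()  # basic-block execution counts gathered by the probes
    block = start
    while block is not None:
        if block not in code_cache:
            instrs, successor = program[block]
            # Prepend a counting probe; the original block is left untouched.
            code_cache[block] = ([f"probe count {block}"] + instrs, successor)
        instrs, successor = code_cache[block]
        for ins in instrs:
            if ins.startswith("probe count "):
                counts[ins.split()[-1]] += 1
            # All other instructions are no-ops in this toy model.
        block = successor
    return counts, code_cache


counts, cache = run_instrumented(PROGRAM, "entry")
```

In this run the block named `cold` is never executed, so no instrumented copy of it is ever created, which mirrors the reduced overhead described above.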
Unfortunately, dynamic binary instrumentation does have some disadvantages associated therewith. For example, because the binary code of a program is modified when using dynamic binary instrumentation methods, all interactions with the processor and operating system, such as a program's cache and paging behavior, may change significantly. As a result, dynamic binary instrumentation is considered to be intrusive. Also, due to the additional instructions introduced by dynamic binary instrumentation, process execution time can increase by anywhere from a small amount to multiples of the run time of the non-instrumented process.
In one approach, dynamic binary instrumentation is performed in an in-line manner. That is, probe code is inserted into a code stream of interest. As a result, existing code must be relocated to new memory space because of the increase in size of the original code stream due to the addition of probe code instructions. As compared to out-of-line approaches, an in-line approach leads to more compact code, less intrusion, and better performance. That is, in a typical out-of-line approach, a function's entry point is instrumented with a long branch to a trampoline that executes the displaced instruction plus additional code related to the instrumentation taking place. In the in-line approach, such long branching to the trampoline is avoided. However, an in-line strategy does have drawbacks. For example, the insertion of probe code changes the relative offsets in a code stream and requires lookup (e.g. in a translation table) of indirect branches whose targets cannot be determined by the instrumentor. Also, combining different instrumentations and probe code is not as easy as it is in certain out-of-line approaches. One drawback associated with in-line instrumented processes is particularly troublesome. Namely, in some instances it is desirable or necessary to reverse the dynamic binary in-line instrumentation operation, i.e., to undo the instrumentation and revert back to executing the original code. For example, “undoing” the instrumentation (i.e. uninstrumenting a process) is useful when an application is to be measured for only a part of its total runtime.
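The offset problem described above can be sketched as follows. This is a minimal toy model under stated assumptions: instructions are Python tuples, addresses are list indices, and `instrument_inline` and `resolve_indirect` are hypothetical helper names that do not correspond to any real instrumentor. Direct branches are patched when the code is rewritten, whereas an indirect branch target, which is known only at run time, is mapped through the translation table.

```python
# Hypothetical original code stream; the list index plays the role of an address.
ORIGINAL = [
    ("nop",),    # 0
    ("jmp", 3),  # 1: direct branch whose target is visible to the instrumentor
    ("nop",),    # 2
    ("ret",),    # 3
]


def instrument_inline(code, probe_points):
    """Insert a probe before each index in probe_points.

    Returns the relocated code and a table mapping old indices to new ones.
    """
    new_code = []
    translation = {}
    for old_index, ins in enumerate(code):
        # A branch to old_index must now land on the probe (if any).
        translation[old_index] = len(new_code)
        if old_index in probe_points:
            new_code.append(("probe", old_index))
        new_code.append(ins)
    # Direct branch targets are known at rewrite time, so patch them now.
    patched = [
        ("jmp", translation[ins[1]]) if ins[0] == "jmp" else ins
        for ins in new_code
    ]
    return patched, translation


def resolve_indirect(translation, original_target):
    """Map a run-time (indirect) branch target into the instrumented code."""
    return translation[original_target]


instrumented, table = instrument_inline(ORIGINAL, probe_points={0, 3})
```

Here two inserted probes shift every later instruction, so the direct branch that originally targeted index 3 is rewritten to target index 4, and an indirect branch that computes the original target 3 at run time is redirected through `resolve_indirect`.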
During some uninstrumentation operations, it is often necessary to unwind the call stack. Furthermore, in certain architectures such as, for example, an IA-64 architecture by Intel Corporation of Santa Clara, Calif., the runtime architecture uses unwind information to perform the task of unwinding. As mentioned above, during in-line instrumentation, the insertion of probe code changes the relative offsets in a code stream. As a result, unwind descriptors that were generated by the compiler for the original function may not match the instrumented function to be unwound due to the insertion of the probe code. Therefore, in prior approaches, the unwind descriptors for the instrumented function must either be updated or new unwind descriptors must be generated. In one effective attempt to resolve this issue, pseudo-modules are created. These pseudo-modules contain data about the dynamically generated code (e.g. the instrumented code) and the corresponding unwind information. The pseudo-modules are utilized by the software component seeking to register an instrumented function along with its unwind information. This registration, enabled by the pseudo-modules, in a centralized place allows easy and effective synchronization and eliminates the need to update unwind tables. Furthermore, in one such effective approach, an application program interface invocation code sequence is coupled to the dynamically generated code. The application program interface invocation code sequence operates in conjunction with the application program interface to facilitate the use of the pseudo-modules during registration of the unwind information.
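The centralized registration described above can be pictured with the following sketch. It is an assumption-laden illustration rather than the actual pseudo-module mechanism or application program interface: `PseudoModule`, `UnwindRegistry`, and their fields are hypothetical names, a string stands in for real unwind descriptors, and registered address ranges are assumed not to overlap. The point is only that an unwinder can look up the unwind information for an instruction pointer inside dynamically generated code without the original unwind tables being modified.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class PseudoModule:
    """Hypothetical record describing one region of dynamically generated code."""
    start: int        # first address of the generated code
    end: int          # one past the last address (exclusive)
    unwind_info: str  # stand-in for the region's unwind descriptors


class UnwindRegistry:
    """Central registry kept sorted by start address for fast lookup."""

    def __init__(self):
        self._starts = []
        self._modules = []

    def register(self, module):
        # Keep both lists sorted by start address (ranges assumed disjoint).
        i = bisect.bisect_left(self._starts, module.start)
        self._starts.insert(i, module.start)
        self._modules.insert(i, module)

    def lookup(self, ip):
        """Return the pseudo-module covering ip, or None if none is registered."""
        i = bisect.bisect_right(self._starts, ip) - 1
        if i >= 0 and ip < self._modules[i].end:
            return self._modules[i]
        return None


registry = UnwindRegistry()
registry.register(PseudoModule(0x2000, 0x2040, "unwind info for instrumented foo"))
registry.register(PseudoModule(0x1000, 0x1020, "unwind info for instrumented bar"))
found = registry.lookup(0x2010)
missing = registry.lookup(0x1800)
```

An address that falls in no registered range, such as `0x1800` here, yields `None`, in which case an unwinder would fall back to the unwind information of the original, statically compiled modules.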
The above-described approach does have certain drawbacks associated therewith. Specifically, such an approach is performed for all dynamically generated code and its corresponding unwind information. Hence, considerable overhead is introduced by the above-described approach.
Thus, a need exists for a method and system which reduces the amount of overhead associated with the registration of dynamically generated code and its corresponding unwind information.