Binary executable programs are “instrumented” or “profiled” to analyze program performance. The performance data gathered can be used to determine which source code would benefit most from optimization. For example, if a particular function is called within a program loop and the loop is a hot spot during execution, it may be desirable to inline the function within the loop rather than invoke it through a function call.
For instrumentation to be useful, the semantics of the instrumented code must be the same as the semantics of the original, un-instrumented code. However, because instrumentation involves the insertion of probe instructions at various locations in the executable program, program semantics may be changed unless further precautions are taken.
In the IA-64 instruction set architecture, developed jointly by Hewlett-Packard and Intel, instructions are grouped together in bundles by a scheduler within the compiler. Grouping instructions into bundles allows the multiple functional units of an instruction processing unit to be used efficiently, thereby maximizing instruction-level parallelism. The instructions in a bundle are dispersed to the functional units to be executed in parallel. On the IA-64 architecture, each bundle has three slots, each holding one instruction. A template field associated with each bundle restricts the type of instruction that can be issued from each slot. It will be appreciated that other very long instruction word (VLIW) architectures also group multiple instructions into units, which may be smaller or larger than the bundles described herein.
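The relationship between a bundle's three slots and its template can be sketched as follows. This is a minimal illustration only: the `Instr` class and `pick_template` function are hypothetical names, the template list is a simplified subset of the IA-64 templates, and details such as stop bits and instructions that may issue from more than one unit type are ignored.

```python
from dataclasses import dataclass

# Simplified subset of IA-64 bundle templates, one unit letter per slot.
# M = memory, I = integer, F = floating point, B = branch.
TEMPLATES = ("MII", "MMI", "MFI", "MIB", "MBB", "BBB", "MMB", "MFB")


@dataclass(frozen=True)
class Instr:
    name: str  # illustrative mnemonic, not checked against any ISA
    unit: str  # unit type the instruction requires: "M", "I", "F" or "B"


def pick_template(instrs):
    """Return the template matching three instructions in slot order, or None."""
    if len(instrs) != 3:
        raise ValueError("a bundle holds exactly three instructions")
    wanted = "".join(i.unit for i in instrs)
    return wanted if wanted in TEMPLATES else None


# A load, an add and a branch-call fit the MIB template.
ok = pick_template([Instr("ld8", "M"), Instr("add", "I"), Instr("br.call", "B")])
# Two integer operations followed by a load match no template in this subset.
bad = pick_template([Instr("add", "I"), Instr("sub", "I"), Instr("ld8", "M")])
```

In this sketch a scheduler that cannot find a matching template would have to reorder the instructions or pad the bundle with no-ops, which is one reason inserting a probe can change how neighboring instructions are bundled.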
The IA-64 architecture includes a predicated branch-call instruction, which can be placed in any slot of a three-slot bundle. The state of the predicate controls whether program control is transferred to the target address of the branch-call instruction. If the branch is taken, control returns, upon completion of the code at the target, to the bundle that follows the bundle containing the branch-call instruction. Thus, any instructions that follow the branch-call instruction within its bundle are skipped if the branch is taken, and are executed if the branch is not taken. These instructions are referred to as “call-shadow” instructions. If probe instructions are inserted in the proximity of a predicated branch-call instruction, the bundling of the branch-call and call-shadow instructions may change. That is, the branch-call instruction may reside in one bundle and the call-shadow instruction(s) in a following bundle. In this case, if the branch is taken, the call-shadow instruction(s) will be executed upon return from the call target, even though they were skipped in the original code. Thus, the instrumented code will be semantically different from the un-instrumented code unless additional steps are taken.
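The call-shadow problem described above can be reproduced with a toy model of bundle execution. The sketch below is an assumption-laden illustration, not a model of real hardware: the `run` and `visible` functions, the instruction names, and the placement of the probe are all hypothetical, and templates are not enforced.

```python
def run(bundles, predicate):
    """Return the names of the instructions executed, in order.

    Each bundle is a list of up to three (kind, name) tuples, where kind is
    "op" for an ordinary instruction or "call" for a predicated branch-call.
    """
    trace = []
    for bundle in bundles:
        for kind, name in bundle:
            if kind == "call":
                if predicate:
                    trace.append(name)  # branch taken: the callee runs
                    break               # return resumes at the NEXT bundle
                continue                # not taken: fall through to the shadow
            trace.append(name)
    return trace


def visible(trace):
    """Drop probes and padding so only the original program's work remains."""
    return [name for name in trace if name not in ("probe", "nop")]


# Original code: the shadow instruction shares a bundle with the branch-call.
original = [
    [("op", "ld"), ("call", "call foo"), ("op", "shadow")],
    [("op", "next")],
]

# Naive instrumentation: a probe pushes the shadow into a bundle of its own.
instrumented = [
    [("op", "ld"), ("op", "probe"), ("call", "call foo")],
    [("op", "shadow"), ("op", "nop"), ("op", "nop")],
    [("op", "next")],
]

taken_original = visible(run(original, True))
taken_instrumented = visible(run(instrumented, True))
not_taken_same = visible(run(original, False)) == visible(run(instrumented, False))
```

With the predicate false, both versions execute the same original instructions. With the predicate true, the original skips `shadow` while the naively instrumented version executes it after the call returns, which is the semantic difference at issue.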
A method and apparatus that address the aforementioned problems, as well as other related problems, are therefore desirable.