For performance monitoring, profiling, and many other reasons, it is useful to walk the call stack of a program, process, or thread (hereafter “program”).
An executing program will have many functions, procedures, or methods (hereafter, “procedures”) that are called in order to carry out the program's purposes. As the program executes, procedures call other procedures, and at any given time of execution, a call stack contains state information about a call chain. Since the same procedure may be called from multiple other procedures, it is valuable in performance monitoring to determine which procedure called a procedure.
For example, the program (i.e., “Main( )”) shown in Table 1 contains four procedures: NewRecord( ), EditName( ), EditAddress( ), and UpdateRecord( ). In this program, EditName( ) can be called from two different places in Main( ), namely NewRecord( ) and UpdateRecord( ). While monitoring Main( ), it can be useful to know whether EditName( ) is called more often from NewRecord( ) or UpdateRecord( ).
TABLE 1
Main( )
 NewRecord( )
  EditName( )
  EditAddress( )
 UpdateRecord( )
  EditName( )
While monitoring Main( ), it may be determined that EditName( ) is taking too much time. This information is useful in determining which procedures need to be further optimized. In order to determine which procedures are calling EditName( ), the call stack can be walked upon entering EditName( ).
Table 2 is an example of a call chain from Main( ) to EditName( ). The call chain shows that Main( ) called UpdateRecord( ), and UpdateRecord( ) called EditName( ). It is not unusual for a call chain to have tens or hundreds of procedures.
TABLE 2
Enter Main( )
 Enter UpdateRecord( )
  Enter EditName( )
In limited cases, a profiler can be used to assemble information about call chains. For example, if the source code is available, a compiler can be used to insert code that reports call chain information. In another case, if debug information is available for a procedure, a binary injection tool can be used to instrument that procedure: code is injected into the binary image of the procedure, and the injected code reports when the procedure is entered and/or exited. For example, if all the procedures in Main( ) are instrumented, an executing call chain could appear as shown in Table 3.
TABLE 3
Enter Main( )
 Enter UpdateRecord( )
  Enter EditName( )
  Exit EditName( )
 Exit UpdateRecord( )
Exit Main( )
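As an illustration only, entry and exit reports like those in Table 3 could be turned into caller counts with a few lines of Python. This is a minimal sketch, not part of any instrumentation tool described here; the event tuples, the `trace` list, and the function name `count_callers` are hypothetical.

```python
from collections import Counter

def count_callers(events, target):
    """Count the immediate caller each time `target` is entered.

    `events` is a sequence of ("enter" | "exit", procedure_name) tuples,
    a hypothetical encoding of reports like those in Table 3.
    """
    stack = []            # procedures currently active, outermost first
    callers = Counter()
    for kind, name in events:
        if kind == "enter":
            if name == target and stack:
                callers[stack[-1]] += 1   # procedure on top is the caller
            stack.append(name)
        elif kind == "exit":
            if not stack or stack[-1] != name:
                raise ValueError(f"unbalanced exit from {name}")
            stack.pop()
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return callers

# The trace from Table 3, encoded as events.
trace = [
    ("enter", "Main"),
    ("enter", "UpdateRecord"),
    ("enter", "EditName"),
    ("exit", "EditName"),
    ("exit", "UpdateRecord"),
    ("exit", "Main"),
]

print(dict(count_callers(trace, "EditName")))
```

The sketch only works when every procedure in the chain reports its entry and exit, which is the limitation discussed next.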
A program may contain hundreds or thousands of procedures, some of which may be obtained from third party sources. If the instrumentation is done with a compiler, the source code may not be available from these third party sources. If the instrumentation is done with a code injector, debug information (hereafter, “PDB information”) is needed about the procedures. Again, this information may not be available if procedures are obtained from third parties. For many reasons, the source code and/or this PDB information may not be available, or may be impractical to obtain. Thus, a call chain cannot be reliably obtained in many circumstances.
Another technique for obtaining a call chain involves sampling a program's state and then walking the call stack to obtain call chain information. However, this traditional method requires the stack to have a chain of base pointers, or requires using the PDB information to identify the return addresses within the call stack frames.
A stack is a region of memory where programs store status data such as procedure addresses, passed parameters, and local variables. FIG. 1 is an example stack 100. In these examples, the top of the stack is a low address, so the stack grows downward as data is pushed onto the stack. A stack includes return addresses 102, which tell the processor where to return upon completing execution in the present active frame. The stack may also contain input parameters 104 received by the procedure and possibly local variables 106. The stack may also contain a frame pointer 108, which points 110 to the previous stack frame. In some cases, each stack frame may contain a frame pointer 112 to the previous stack frame base. In one example, a traditional stack walker follows the frame pointers 110, 112 to walk the stack.
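The traditional frame pointer walk can be sketched as follows. This is a simplified model, assuming a layout that FIG. 1 does not specify in detail: each frame stores the previous frame pointer at the address the frame pointer refers to, with the return address one word above it, and a saved frame pointer of 0 marks the outermost frame. The `memory` dictionary, the addresses in it, and the function name `walk_frame_pointers` are hypothetical.

```python
def walk_frame_pointers(memory, frame_pointer, word=4):
    """Follow the chain of saved frame pointers and collect return addresses.

    Assumed layout (an illustration, not taken from FIG. 1):
      memory[fp]        -> frame pointer of the previous (calling) frame
      memory[fp + word] -> return address into the calling procedure
    A saved frame pointer of 0 terminates the chain.
    """
    chain = []
    visited = set()
    while frame_pointer != 0:
        if frame_pointer in visited:
            raise ValueError("frame pointer chain contains a cycle")
        visited.add(frame_pointer)
        chain.append(memory[frame_pointer + word])
        frame_pointer = memory[frame_pointer]
    return chain

# Hypothetical snapshot: EditName's frame is at the lowest address because
# the stack grows downward.
memory = {
    0x1000: 0x1020, 0x1004: 0x4010,  # EditName frame, returns into UpdateRecord
    0x1020: 0x1040, 0x1024: 0x4000,  # UpdateRecord frame, returns into Main
    0x1040: 0x0000, 0x1044: 0x3000,  # Main frame, returns into startup code
}

print([hex(a) for a in walk_frame_pointers(memory, 0x1000)])
```

If any procedure in the chain omits the saved frame pointer, `memory[frame_pointer]` no longer holds a valid link and the walk fails, which is the weakness described in the following paragraphs.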
However, in some cases, the stack frames created by a procedure are further optimized, so that they do not contain the frame pointers. In such a case, the stack can be walked using the PDB (debug) information to determine the stack frame contents. For example, FIG. 1 shows three stack frames 114, 116, 118. The top of the stack contains the activated frame 118, and the stack pointer 120 points to the top of the stack 100. Each stack frame 114, 116, 118 is specific to a procedure in the call chain, and thus may vary in size based on the content required by the procedure it represents (e.g., passed parameters, local variables, etc.). By using the PDB information obtained at debug or compile time, a stack walker can determine the depth, contents, or offsets for a stack frame for each compiled procedure. From this information, it may be possible to walk the stack. However, the PDB information is not available in many cases.
In practice, many of the procedures used by a program developer are obtained from third party sources. These third party sources are often unwilling to provide source code or PDB information. In other cases, these third party sources use proprietary systems to create PDB information, and are unwilling to disclose anything that may jeopardize a competitive advantage which may become apparent from viewing this PDB information. In other cases, a program may contain stack frames created by procedures obtained from multiple sources. These multiple sources may be unwilling to agree on whether to include frame pointers in stack frames. Thus, a program may contain some procedures with PDB information, other procedures with frame pointers, and likely, some procedures with neither. Finally, even if all the PDB information is available, many scenarios occur when prior art stack walkers fail. Thus, the prior art stack walkers are unable to reliably walk the stack.
What is needed is a way to walk the stack without relying on PDB information. More specifically, what is needed is a way to walk the stack using the image that is currently executing (i.e., the binary code of the presently executing procedure), the stack, and the instruction pointer (i.e., the pointer to the instruction executing when the program was frozen or interrupted).
A forward code walking technique described herein produces call chain information from the call stack. In one respect, the technique uses an instruction pointer, a stack pointer, a binary image, and a call stack to obtain a call chain for an interrupted program.
In a further respect, a technique walks forward through the binary code of a procedure (i.e., a binary image for the procedure) to identify a return instruction. While walking forward through the binary image, the technique identifies a set of instructions that alter the distance from the top of the stack to a return address on the stack. After calculating distance variables based on the set of instructions, the technique uses the distance variables to update the stack pointer and the instruction pointer. The updated instruction pointer points into the procedure that called this procedure. The technique then walks forward through the binary image of the procedure that called this procedure. This continues until the stack is empty. A list of instruction pointers is returned as a call chain.
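The forward walk can be illustrated on a toy machine. The sketch below is not the described implementation: it assumes a made-up instruction set in which every instruction occupies one address and the stack holds one value per address, it treats a call as having no net effect on the stack pointer, and it assumes straight-line code with no branches between the interrupted instruction and the return. The names `distance_to_return`, `forward_walk`, `image`, and `stack`, and all addresses, are hypothetical.

```python
# Net change each toy instruction makes to the stack pointer (stack grows
# downward, so a push lowers the stack pointer by one slot).
def stack_delta(op, arg):
    if op == "push":
        return -1
    if op == "pop":
        return 1
    if op == "sub_sp":
        return -arg
    if op == "add_sp":
        return arg
    return 0  # "nop" and "call": assumed to leave the stack pointer unchanged

def distance_to_return(image, ip):
    """Walk forward from ip to the next 'ret' and sum the stack deltas."""
    distance = 0
    while True:
        op, arg = image[ip]
        if op == "ret":
            return distance
        distance += stack_delta(op, arg)
        ip += 1

def forward_walk(image, stack, ip, sp, stack_base):
    """Return the interrupted ip followed by each located return address."""
    chain = [ip]
    while sp < stack_base:
        ret_slot = sp + distance_to_return(image, ip)
        if ret_slot >= stack_base:
            break                 # no return address left on the stack
        ip = stack[ret_slot]      # resume the walk inside the caller
        sp = ret_slot + 1         # stack pointer as it is after the return
        chain.append(ip)
    return chain

image = {
    50: ("ret", None),                                   # startup code stub
    100: ("sub_sp", 2), 101: ("call", 200),              # Main
    102: ("add_sp", 2), 103: ("ret", None),
    200: ("push", None), 201: ("call", 300),             # UpdateRecord
    202: ("pop", None), 203: ("ret", None),
    300: ("sub_sp", 3), 301: ("nop", None),              # EditName
    302: ("add_sp", 3), 303: ("ret", None),
}
# Stack contents when EditName is interrupted at address 301 with sp == 991.
stack = {991: 0, 992: 0, 993: 0,   # EditName locals
         994: 202,                 # return address into UpdateRecord
         995: 7,                   # register saved by UpdateRecord
         996: 102,                 # return address into Main
         997: 0, 998: 0,           # Main locals
         999: 50}                  # return address into startup code

print(forward_walk(image, stack, ip=301, sp=991, stack_base=1000))
```

Real machine code adds complications this sketch ignores, such as variable-length instructions and branches, so it should be read only as a picture of how the distance to the return address is accumulated and then applied.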
In yet another respect, a system for profiling call chains for an interrupted executing program is discussed. Upon each interrupt, the state information for the executing process is saved. A method walks forward through the executable instructions representing each procedure in the call chain to a return instruction in each procedure. The executable instructions encountered in the forward walk are analyzed and used to locate the return address to each procedure's corresponding calling procedure in the call chain. When the return addresses for all calling procedures in the chain have been located, a list of return addresses is returned.
In another respect, a distance data structure is computed for each stack frame in the call chain, and the distance data structure is used to update the stack pointer and instruction pointer to the context of the calling procedure's stack frame. In one implementation, an optimization includes caching a list of instruction pointer addresses with already computed distance structures. This optimization allows reusing a distance structure for an already computed instruction pointer address. This optimization is valuable in profiling when a return address may appear multiple times in a call chain, or multiple call chains may contain the same return address. A further optimization includes storing ranges of addresses that have the same distance structures.
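One way the caching of distance structures by address range might look is sketched below. It is an assumption-laden illustration: the distance structure is reduced to a single integer, the `analyze` callback (hypothetical) is assumed to return the whole non-overlapping address range that shares a distance, and the class name `DistanceCache` and the sample ranges are invented for the example.

```python
import bisect

class DistanceCache:
    """Cache distances by address range so repeated addresses skip analysis."""

    def __init__(self, analyze):
        # analyze(ip) -> (first_ip, last_ip, distance); hypothetical callback
        # that performs the forward walk and reports the range it applies to.
        self._analyze = analyze
        self._starts = []    # sorted first addresses of cached ranges
        self._entries = []   # parallel list of (first_ip, last_ip, distance)
        self.misses = 0      # number of times analysis actually ran

    def distance_for(self, ip):
        # Find the cached range with the greatest start address <= ip.
        i = bisect.bisect_right(self._starts, ip) - 1
        if i >= 0:
            first, last, distance = self._entries[i]
            if first <= ip <= last:
                return distance
        # Not cached: analyze once and remember the whole range.
        self.misses += 1
        first, last, distance = self._analyze(ip)
        j = bisect.bisect_left(self._starts, first)
        self._starts.insert(j, first)
        self._entries.insert(j, (first, last, distance))
        return distance

# Hypothetical analysis results for one procedure occupying 300..303.
KNOWN_RANGES = [(300, 300, 0), (301, 302, 3), (303, 303, 0)]

def analyze(ip):
    for first, last, distance in KNOWN_RANGES:
        if first <= ip <= last:
            return first, last, distance
    raise KeyError(ip)

cache = DistanceCache(analyze)
print(cache.distance_for(301), cache.distance_for(302), cache.misses)
```

Storing ranges rather than single addresses means that a sample taken at a neighboring address in the same range is served from the cache even though that exact address was never analyzed.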
Additional features and advantages will be made apparent from the following detailed description of the illustrated embodiment, which proceeds with reference to the accompanying drawings.