1. Field of Invention
The present invention relates generally to methods and apparatus for improving the performance of software applications. More particularly, the present invention relates to methods and apparatus for providing a debugging system with sufficient information to effectively debug optimized code.
2. Description of the Related Art
In an effort to increase the efficiency associated with the execution of computer programs, many computer programs are “optimized.” Optimizing a computer program generally serves to eliminate portions of computer code which are essentially unused. In addition, optimizing a computer program may restructure computational operations to allow overall computations to be performed more efficiently, thereby consuming fewer computer resources.
An optimizer is arranged to effectively transform a computer program, e.g., a computer program written in a programming language such as C++, FORTRAN, or Java bytecodes, into a faster program. The faster, or optimized, program generally includes substantially all the same observable behaviors as the original, or pre-converted, computer program. Specifically, the optimized program has the same mathematical behavior as its associated original program. However, the optimized program generally recreates the same mathematical behavior with fewer computations.
Typically, an optimizer includes a register allocator and a core optimizer. As will be appreciated by those skilled in the art, a register allocator moves computations from memory space into register space, while the core optimizer implements mathematical computations associated with the optimized program. In the course of creating an optimized program, an optimizer eliminates unused code. For example, code associated with variables in an original program that are not used outside of a loop is generally eliminated. Such variables may include, but are not limited to, counter variables used as indexes within loops.
When an optimizer transforms a computer program, the optimizer often creates an internal representation of the computer program. The internal representation may then be used to generate machine code that is a computational equivalent of the computer program. FIG. 1 is a diagrammatic representation of an optimizer which transforms a computer program into an optimized computer program. A computer program 104, which may be written in any suitable computer programming language, is provided to an optimizer 110. As shown, computer program 104 includes a “for” loop 106 that includes a variable “i.”
Optimizer 110, which is effectively a compiler, includes an internal representation generator 114 and a machine code generator 118. Internal representation generator 114 takes computer program 104 as input, and produces an internal representation 122 of computer program 104. Internal representation generator 114 typically removes unused code, e.g., index variables such as variable “i,” such that internal representation 122 has no references to the unused code.
Internal representation 122 is provided as input to machine code generator 118, which produces machine code 126, i.e., a transformed computational equivalent of computer program 104. As internal representation 122 does not include references to the unused code, it should be appreciated that machine code 126 also does not include references to the unused code. By eliminating the unused code, machine code 126 may execute more efficiently than it would if the unused code were included.
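The removal of unused code described above can be illustrated with a minimal sketch. It assumes a hypothetical three-address internal representation and a simple backward liveness pass over straight-line code; the tuple layout and the name `eliminate_dead_code` are illustrative assumptions and are not taken from optimizer 110.

```python
# Hypothetical internal representation: (destination, operator, operands).
# A destination of None marks an instruction with an observable side effect.

def eliminate_dead_code(program, live_out=()):
    """Drop assignments whose results are never read (straight-line code only)."""
    live = set(live_out)          # values needed after the program ends
    kept = []
    for dest, op, operands in reversed(program):
        if dest is None or dest in live:
            kept.append((dest, op, operands))
            live.discard(dest)    # this instruction defines dest ...
            live.update(operands) # ... and reads its operands
        # otherwise the result is unused, so the instruction is dropped
    kept.reverse()
    return kept

# One unrolled loop iteration: "i" is computed but never observed.
original = [
    ("i", "const0", ()),
    ("sum", "add", ("a", "b")),
    ("i", "increment", ("i",)),
    (None, "print", ("sum",)),
]

optimized = eliminate_dead_code(original)
```

In this sketch, every instruction that defines the variable `i` is absent from `optimized`, so machine code generated from it would carry no state for `i` that a debugger could later inspect.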
Machine code 126, which represents a transformed or optimized version of computer program 104, is typically accessed by a debugger when machine code is to be debugged. While optimized code may be debugged for a variety of different reasons, optimized code is often debugged in order to identify errors which are only manifested in optimized code. Debugging may also occur to identify internal states associated with the code, as will be appreciated by those skilled in the art. FIG. 2 is a process flow diagram which illustrates the steps associated with optimizing a program and debugging the optimized program. A process 200 of optimizing and debugging a program begins at step 202 in which program code that contains an unused value, or variable, is obtained by an optimizer. Once the program code is obtained, an internal representation of the program code is generated in step 204. Generating an internal representation of the program code typically entails removing references to the unused value, as previously mentioned.
After the internal representation of the program code is created, machine code is generated from the internal representation in step 206. A debugger then accesses the machine code in step 208, and obtains available debugging information from the machine code. In general, debugging information includes state information at different points in the machine code. Such debugging information is generated by “de-optimizing” the optimized code. When unused code, e.g., a dead variable, is removed from an optimized program, that unused code generally may not be re-obtained during a debugging process. As such, a precise relationship between debugged code and optimized code either may not be obtained, or may be incorrect, as will be understood by those skilled in the art. In other words, the debugging information obtained may be inaccurate. Once the debugging information is obtained, the process of optimizing code and debugging the optimized code is completed.
In an environment with a virtual machine, e.g., a Java™ virtual machine developed by Sun Microsystems, Inc. of Palo Alto, Calif., it may be desirable to convert optimized code to interpreted code. In order to accurately return optimized code to an interpreted equivalent, valid Java™ virtual machine states are typically needed for all variables. Not all states may be available in the event that code pertaining to some states has been removed during an optimization process. When such states are unavailable, the conversion to interpreted code generally may not occur at all, or may be inaccurate. Inaccuracy in a conversion may result in substantially incorrect results for the overall computing environment.
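The dependence of such a conversion on complete state can be sketched as follows. This is an illustrative model only: the names `rebuild_interpreter_frame` and `MissingStateError`, and the use of dictionaries for frames, are assumptions and do not describe how any particular virtual machine is implemented.

```python
class MissingStateError(Exception):
    """Raised when optimized code no longer holds a value the interpreter needs."""

def rebuild_interpreter_frame(local_names, optimized_state):
    """Build an interpreter frame; every declared local needs a valid value."""
    missing = [name for name in local_names if name not in optimized_state]
    if missing:
        # The optimizer removed these values, so the conversion cannot proceed.
        raise MissingStateError("no state for: " + ", ".join(missing))
    return {name: optimized_state[name] for name in local_names}

# The interpreted method declares "i" and "sum", but the optimized code
# kept only "sum" because "i" was treated as unused.
try:
    rebuild_interpreter_frame(["i", "sum"], {"sum": 15})
    conversion_succeeded = True
except MissingStateError:
    conversion_succeeded = False
```

With the state for `i` missing, the sketch refuses to build a frame rather than guess a value, which corresponds to the case above in which the conversion may not occur at all.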
Therefore, what is desired is an efficient method for obtaining debugging information from optimized code. That is, what is needed is a method and an apparatus for enabling states associated with unused values to be efficiently obtained during a debugging, or deoptimizing, process.
The present invention relates to providing a substantially full set of state information to a debugger, without significantly compromising system performance, in order to debug optimized computer program code. According to one aspect of the present invention, a method for obtaining information associated with program code includes adding a segment of code (“the debugging code”), which includes a representation that is effectively not used after it is computed, to the program code. A “break point” is chosen in proximity to the segment of code, and machine code is generated from the program code. Finally, the method includes replacing the instruction at the break point with a branch instruction that is arranged to cause the debugging code to execute. By executing the debugging code, states that would generally be eliminated in optimized machine code are available to a debugger or deoptimizer, thereby enabling the machine code to be accurately debugged or deoptimized.
In one embodiment, the segment of code is associated with a program loop. In such an embodiment, adding a break point in proximity to the segment of code may include integrating the break point into the program loop. The debugging code may further include code that calls a debugging function arranged to debug the program code.
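The replacement of the instruction at the break point with a branch can be sketched on a toy machine. The instruction set, the out-of-line placement of the debugging code, and the relocation of the displaced instruction into the debugging section are all assumptions made for illustration; the names `run` and `patch_break_point` are hypothetical.

```python
MACHINE_CODE = [
    ("set", "sum", 0),      # 0
    ("set", "i", 0),        # 1
    ("add", "sum", 5),      # 2  loop body
    ("add", "i", 1),        # 3  chosen break point
    ("jlt", "i", 3, 2),     # 4  repeat while i < 3
    ("halt",),              # 5
    ("debug", "i", "sum"),  # 6  debugging code, unreachable until patched
    ("add", "i", 1),        # 7  copy of the instruction displaced from 3
    ("jmp", 4),             # 8  return to the main line
]

def patch_break_point(code, break_point, debug_entry):
    """Return a copy of the code with a branch at the break point."""
    patched = list(code)
    patched[break_point] = ("jmp", debug_entry)
    return patched

def run(code):
    """Interpret the toy machine code; return registers and debug snapshots."""
    regs, snapshots, pc = {}, [], 0
    while code[pc][0] != "halt":
        ins = code[pc]
        op = ins[0]
        if op == "set":
            regs[ins[1]] = ins[2]
        elif op == "add":
            regs[ins[1]] += ins[2]
        elif op == "debug":
            snapshots.append({name: regs[name] for name in ins[1:]})
        elif op == "jmp":
            pc = ins[1]
            continue
        elif op == "jlt" and regs[ins[1]] < ins[2]:
            pc = ins[3]
            continue
        pc += 1
    return regs, snapshots

plain_regs, plain_snapshots = run(MACHINE_CODE)
debug_regs, debug_snapshots = run(patch_break_point(MACHINE_CODE, 3, 6))
```

In this sketch the unpatched code never reaches the debugging section, so it pays no run-time cost for it. Once the break point is patched, each loop iteration records the values of `i` and `sum` before the displaced instruction runs, and the final register contents are unchanged.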
According to another aspect of the present invention, a computer-implemented method for obtaining information associated with program code may include adding a call to a subroutine, i.e., the “debugging code,” that is associated with the program code. The call to the subroutine includes a plurality of arguments where at least one of the arguments is a reference to a representation associated with a computation. The representation is essentially unused with respect to the program code and the subroutine. The computer-implemented method also includes generating machine code associated with the program code by substantially transforming the call to the subroutine into debugging code.
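One way to see why passing a reference as an argument preserves an otherwise unused value is to model the call as a use during liveness analysis. The following sketch assumes a hypothetical tuple-based representation and a hypothetical `debug_info` call; it illustrates the idea only and is not the claimed code generation step.

```python
def surviving_assignments(program):
    """List the destinations whose assignments survive a backward liveness pass."""
    live, survivors = set(), []
    for dest, op, args in reversed(program):
        if dest is None or dest in live:  # side effect, or result is read later
            live.discard(dest)
            live.update(args)
            if dest is not None:
                survivors.append(dest)
    return list(reversed(survivors))

loop_body = [
    ("i", "increment", ("i",)),
    ("total", "add", ("total", "x")),
    (None, "store", ("total",)),
]

# Hypothetical debugging call whose arguments reference "i" and "total".
with_debug_call = loop_body + [(None, "debug_info", ("i", "total"))]

without_call = surviving_assignments(loop_body)
with_call = surviving_assignments(with_debug_call)
```

Without the call, only the assignment to `total` survives. With the call present, the assignment to `i` survives as well, because the call's argument list counts as a use even though the program itself never reads `i`.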
In yet another aspect of the present invention, a method for debugging optimized code includes generating a higher-level program representation that includes a loop section with an associated counter value and a segment of debugging code. The method also includes optimizing the higher-level program representation by converting the higher-level program representation into lower-level code that includes a section associated with the debugging code and a break point. The instruction at the break point is replaced with a branch instruction that causes the section associated with the break point to execute. Finally, the debugging code is executed, thereby providing information associated with the counter value.
These and other advantages of the present invention will become apparent upon reading the following detailed descriptions and studying the various figures of the drawings.