This invention relates to a method and apparatus for debugging of optimized machine code and more particularly, to a method and apparatus for debugging optimized machine code wherein optimization effects on the machine code are made as transparent to the user as possible.
To fully utilize many data processors, an increasing number of executable machine codes (binaries) are being generated by compilers which incorporate advanced optimization techniques. With this increase, it has become a necessity to provide a clear, correct and effective way for programmers to debug highly optimized code.
There are two primary aspects associated with code optimization that make the debugging of optimized machine code difficult. First, optimization complicates the mapping between the source code and the machine code. Due to code duplication, elimination, and re-ordering caused by optimization, it is hard for the debugger to decide where in the machine code to set a breakpoint when the user sets a source breakpoint, or which source line to report a fault in when an execution exception occurs. See: T. Zellweger, "Interactive Source-Level Debugging of Optimized Programs", PhD thesis, Electrical Engineering and Computer Sciences, University of California, Berkeley, Calif. 94720, 1984. Second, optimization makes reporting values of source variables inconsistent with what the user expects (or even impossible). More specifically, the optimizations that are performed destroy the simple source-to-object correlation present in unoptimized code. Hence, when inspecting a halted program being debugged, there is generally no straightforward answer to questions such as "Where am I?" and "What's happened so far?". Further, since variables may live in different locations at different points in the program (and indeed, at no location at some points), reporting variable values becomes complicated. Much research in this area has concentrated on the second of these problems.
John Hennessy's seminal paper (J. Hennessy, "Symbolic Debugging of Optimized Code", ACM Transactions on Programming Languages and Systems, Vol. 4, pp. 323-344, July 1982) presented algorithms to detect variables whose values do not reflect the source program and examined the problem of recovering the correct values. These algorithms have been corrected and refined by others.
Over the past decade, several research efforts have used different strategies to address the problem of debugging optimized code. Hennessy (cited above) first introduced the concept of non-current variables and provided an algorithm to detect these variables. He also provided an algorithm to recover non-current variables in locally optimized code. Zellweger ("An Interactive High-Level Debugger for Control-Flow Optimized Programs", SIGPLAN Notices, pp. 159-171, August 1983) proposed and implemented a method to recover the expected behavior of a program by inserting path determiners (hidden breakpoints) into the program to enable the debugger to decide which execution path had been taken. The Zellweger method can only deal with code optimized by "function inlining" and "cross jumping".
Coutant et al., "Doc: A Practical Approach to Source-Level Debugging of Globally Optimized Code", Proceedings of the ACM SIGPLAN '88 Conference on Programming Language Design and Implementation, pp. 125-134, June 1988, modified an existing C compiler and a source-level symbolic debugger to support optimized code debugging. The optimizations they addressed are global register allocation, induction variable elimination, copy propagation, and instruction scheduling. The most noticeable part of their work is their solution for data value problems. Their compiler builds "range" data structures during optimization, which provide the debugger with run-time locations of variables and recovery functions for eliminated variables.
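The idea of a range structure can be illustrated with a minimal sketch. It is not the format used by Coutant et al.; the names `Range`, `lookup`, and `read_variable`, the register names, and the example addresses are all hypothetical. Each range covers a span of machine-code addresses and records either where the variable resides there or a function that recovers its value from other machine state.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Range:
    start: int                      # first address covered (inclusive)
    end: int                        # end of the covered span (exclusive)
    location: Optional[str] = None  # register holding the variable, if any
    recover: Optional[Callable[[dict], int]] = None  # recovery function, if any

def lookup(ranges, pc):
    """Return the range covering address pc, or None if there is none."""
    for r in ranges:
        if r.start <= pc < r.end:
            return r
    return None

def read_variable(ranges, pc, state):
    """Report a variable's value at pc, or None if it cannot be reported."""
    r = lookup(ranges, pc)
    if r is None:
        return None                 # variable has no location at this point
    if r.location is not None:
        return state[r.location]    # variable resides in a register
    if r.recover is not None:
        return r.recover(state)     # variable was eliminated; recompute it
    return None

# Hypothetical loop index i: held in r3 for addresses 0..7, then eliminated
# in favor of a pointer p (in r5) with p = base + 4 * i.
ranges_for_i = [
    Range(0, 8, location="r3"),
    Range(8, 20, recover=lambda s: (s["r5"] - s["base"]) // 4),
]
state = {"r3": 7, "r5": 0x1000 + 12, "base": 0x1000}

in_register = read_variable(ranges_for_i, 4, state)
recovered = read_variable(ranges_for_i, 10, state)
unavailable = read_variable(ranges_for_i, 25, state)
```

In this sketch the debugger reads `i` directly from `r3` at address 4, recomputes it from the pointer at address 10, and reports it as unavailable at address 25, where no range covers the program counter.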
Gupta, in "Debugging Code Reorganized by a Trace Scheduling Compiler", Structured Programming, pp. 141-150, July 1990, proposed an approach to debug code reorganized by a trace scheduling compiler. In the Gupta approach, the user has to specify the commands for monitoring values before compilation, and these commands are added and compiled into the program. At run time, the debugger stops when a monitor command is executed and reports the monitored information to the user.
Work by Adl-Tabatabai and Gross focuses on data value problems. They have proposed algorithms using data flow analysis to detect non-resident and endangered variables. Their methods provide limited capability to recover the expected value of endangered variables caused by local and global optimization. See: A. Adl-Tabatabai and T. Gross, "Evicted Variables and the Interaction of Global Register Allocation and Symbolic Debugging", in Conference Record of the 20th Annual ACM Symposium on Principles of Programming Languages, pp. 371-383, January 1993; A. Adl-Tabatabai and T. Gross, "Detection and Recovery of Endangered Variables Caused by Instruction Scheduling", in Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design and Implementation, pp. 13-25, June 1993; and Adl-Tabatabai, "Source-Level Debugging of Globally Optimized Code", PhD thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pa. 15213, (1996).
In summary, the prior approaches to debugging optimized code have focused on making the user aware of the potentially surprising effects of optimization. While some attempt to recover the expected behavior of the original program, their capability has been limited.
Accordingly, it is an object of this invention to enable the debugging of optimized code without making the user aware of the effects of the optimization.
It is another object of this invention to provide a method and apparatus for the debugging of optimized code wherein actions specified in the source appear to take place in source order.
It is a further object of this invention to provide a method and apparatus for the debugging of optimized code wherein automatic bug detection is accomplished.
The invention is a method for debugging the machine code of a program that has been subjected to an optimizing action, wherein the machine code may have been reordered, duplicated, eliminated or transformed so as not to correspond with the program's source code order. The method derives a table which associates each machine code instruction with the source construct for which it was generated. The user sets a breakpoint at a point P in the source code where execution is to stop. The method then determines at least one corresponding location for the breakpoint in the machine code through use of the table, and executes, by native execution or emulation, only those machine code instructions which correspond to source constructs that precede the breakpoint in the source code order. The method further enables a comparison of the results of two passes of emulation (in different orders) to detect a class of bugs that are particularly hard to find: bugs caused by optimizer errors and user bugs that manifest themselves only in the optimized executable.
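The table lookup and instruction selection summarized above can be sketched as follows. This is an illustrative simplification, not the claimed implementation: it assumes straight-line code in which source order is given by line number, and the names `Instr`, `build_table`, `breakpoint_locations`, and `instructions_before`, as well as the textual instruction forms, are hypothetical. Control flow, duplicated code, and the emulation itself are not modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instr:
    addr: int         # position in the optimized machine code
    source_line: int  # source construct this instruction was generated for
    op: str           # hypothetical textual form of the instruction

def build_table(code):
    """Associate each machine-code address with its source line."""
    return {ins.addr: ins.source_line for ins in code}

def breakpoint_locations(table, line):
    """All machine-code addresses generated for the given source line."""
    return sorted(addr for addr, src in table.items() if src == line)

def instructions_before(code, table, bp_line):
    """Instructions whose source construct precedes the breakpoint line."""
    return [ins for ins in code if table[ins.addr] < bp_line]

# Optimized code in which the load for line 3 was moved above the
# store for line 2, so machine order no longer matches source order.
code = [
    Instr(0, 1, "r1 = a + b"),
    Instr(1, 3, "r2 = load c"),
    Instr(2, 2, "store x, r1"),
    Instr(3, 3, "r3 = r2 * 2"),
    Instr(4, 4, "store y, r3"),
]

table = build_table(code)
locations = breakpoint_locations(table, 3)    # candidate stop addresses
to_run = instructions_before(code, table, 3)  # work for source lines 1 and 2
```

For a breakpoint on source line 3, the sketch finds two machine-code locations (addresses 1 and 3) and selects only the instructions at addresses 0 and 2 for execution, skipping the hoisted load at address 1 even though it appears earlier in the machine code.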