To fully utilize many data processors, an increasing number of executable machine codes (binaries) are being generated by compilers that incorporate advanced optimization techniques. With this increase, it has become necessary to provide programmers with a clear, correct and effective way to debug highly optimized code.
Two primary aspects of code optimization make the debugging of optimized machine code difficult. First, optimization complicates the mapping between the source code and the machine code. Because optimization duplicates, eliminates, and re-orders code, it is hard for the debugger to decide where in the machine code to set a breakpoint when the user sets a source breakpoint, or which source line to report when an execution exception occurs. See: T. Zellweger, "Interactive Source-Level Debugging of Optimized Programs", PhD thesis, Electrical Engineering and Computer Sciences, University of California, Berkeley, Calif. 94720, 1984. Second, optimization makes the reported values of source variables inconsistent with what the user expects, or makes reporting them impossible. More specifically, the optimizations that are performed destroy the simple source-to-object correlation present in unoptimized code. Hence, when inspecting a halted program being debugged, there is generally no straightforward answer to questions such as "Where am I?" and "What has happened so far?". Further, since a variable may live in different locations at different points in the program (and indeed, in no location at some points), reporting variable values becomes complicated. Much research in this area has concentrated on the second of these problems.
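By way of illustration only, the first difficulty can be sketched with a small line table. The table contents, addresses, and function names below are hypothetical and are not taken from any of the cited works; the sketch merely assumes that the compiler emits (address, source line) pairs sorted by address. After optimization, one source line may map to several addresses (duplication or re-ordering) or to none (elimination).

```python
import bisect

# Hypothetical line table for an optimized routine: (code address, source line),
# sorted by address. All values are assumptions made up for this sketch.
LINE_TABLE = [
    (0x00, 10),
    (0x08, 12),  # source line 11 was eliminated, so it has no entry
    (0x10, 10),  # part of line 10 was moved later by re-ordering
    (0x18, 13),
    (0x20, 12),  # a duplicated copy of line 12
]


def breakpoint_addresses(table, line):
    """Return every address at which code for the given source line begins."""
    return [addr for addr, src_line in table if src_line == line]


def line_for_address(table, pc):
    """Return the source line of the last table entry at or before pc."""
    addresses = [addr for addr, _ in table]
    index = bisect.bisect_right(addresses, pc) - 1
    return table[index][1] if index >= 0 else None
```

In this sketch, a breakpoint on line 12 corresponds to two candidate addresses, a breakpoint on line 11 corresponds to none, and a fault at address 0x14 is attributed to line 10 even though code for line 12 has already executed. The debugger must choose among such candidates, which is the ambiguity described above.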
John Hennessy's seminal paper (J. Hennessy, "Symbolic Debugging of Optimized Code", ACM Transactions on Programming Languages and Systems, Vol. 4, pp. 323-344, July 1982) presented algorithms to detect variables whose values do not reflect the source program and examined the problem of recovering the correct values. These algorithms have been corrected and refined by others.
Over the past decade, several research efforts have used different strategies to solve the problem of debugging optimized code. Hennessy (cited above) first introduced the concept of non-current variables and provided an algorithm to detect these variables. He also provided an algorithm to recover non-current variables in locally optimized code. Zellweger ("An Interactive High-Level Debugger for Control-Flow Optimized Programs", SIGPLAN Notices, pp. 159-171, August 1983) proposed and implemented a method to recover the expected behavior of a program by inserting path determiners (hidden breakpoints) into the program to enable the debugger to decide which execution path had been taken. The Zellweger method can only deal with code optimized by "function inlining" and "cross jumping".
Coutant et al., "Doc: A Practical Approach to Source-Level Debugging of Globally Optimized Code", Proceedings of the ACM SIGPLAN '88 Conference on Programming Language Design and Implementation, pp. 125-134, June 1988, modified an existing C compiler and a source-level symbolic debugger to support optimized code debugging. The optimizations they addressed are global register allocation, induction variable elimination, copy propagation, and instruction scheduling. The most notable part of their work is their solution to data value problems. Their compiler builds "range" data structures during optimization that provide the debugger with the run-time locations of variables and with recovery functions for eliminated variables.
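The general idea of such range records can be sketched as follows. This is a minimal illustration under stated assumptions, not the data layout used by Coutant et al.: the `Range` class, its field names, the register names, and the recovery formula are all hypothetical. Each record covers an interval of code addresses and either names the location holding the variable or supplies a function that recomputes an eliminated variable from values that are still available.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Range:
    """Hypothetical range record for one variable over [start, end)."""
    start: int                      # first code address covered (inclusive)
    end: int                        # end of the covered interval (exclusive)
    location: Optional[str] = None  # register or stack slot holding the value
    recover: Optional[Callable[[Dict[str, int]], int]] = None  # recovery function


def read_variable(ranges: List[Range], pc: int,
                  machine_state: Dict[str, int]) -> Optional[int]:
    """Return the variable's value at address pc, or None if it is unavailable."""
    for r in ranges:
        if r.start <= pc < r.end:
            if r.location is not None:
                return machine_state[r.location]
            if r.recover is not None:
                return r.recover(machine_state)
    return None


# Assumed example: loop index i lives in register r3 at first, and is then
# eliminated in favor of a pointer in r5 that advances 4 bytes per iteration
# from the address held in "base".
I_RANGES = [
    Range(start=0x100, end=0x120, location="r3"),
    Range(start=0x120, end=0x180,
          recover=lambda s: (s["r5"] - s["base"]) // 4),
]
```

With a machine state of `{"r3": 7, "r5": 1040, "base": 1000}`, a query at address 0x110 reads the register directly, a query at address 0x130 recomputes the index from the pointer, and a query outside both intervals reports the variable as unavailable.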
Gupta, in "Debugging Code Reorganized by a Trace Scheduling Compiler", Structured Programming, pp. 141-150, July 1990, proposed an approach to debugging code reorganized by a trace scheduling compiler. In the Gupta approach, the user has to specify the commands for monitoring values before compilation, and these commands are added to and compiled into the program. At run time, the debugger stops when a monitor command is executed and reports the monitored information to the user.
The work of Adl-Tabatabai and Gross focuses on data value problems. They have proposed algorithms that use data flow analysis to detect non-resident and endangered variables. Their methods provide limited capability to recover the expected values of variables endangered by local and global optimization. See: A. Adl-Tabatabai and T. Gross, "Evicted Variables and the Interaction of Global Register Allocation and Symbolic Debugging", in Conference Record of the 20th Annual ACM Symposium on Principles of Programming Languages, pp. 371-383, January 1993; A. Adl-Tabatabai and T. Gross, "Detection and Recovery of Endangered Variables Caused by Instruction Scheduling", in Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design and Implementation, pp. 13-25, June 1993; and A. Adl-Tabatabai, "Source-Level Debugging of Globally Optimized Code", PhD thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pa. 15213, 1996.
In summary, the prior approaches to debugging optimized code have focused on making the user aware of the potentially surprising effects of optimization. While some attempt to recover the expected behavior of the original program, their capability has been limited.
Accordingly, it is an object of this invention to enable the debugging of optimized code without making the user aware of the effects of the optimization.
It is another object of this invention to provide a method and apparatus for the debugging of optimized code wherein actions specified in the source appear to take place in source order.
It is a further object of this invention to provide a method and apparatus for the debugging of optimized code wherein automatic bug detection is accomplished.