1. Field of the Invention
This invention relates generally to a method and apparatus for debugging and testing computer software and, more particularly, to a technique that allows a user to selectively replay portions of a previously executed computer program.
2. Discussion of the Related Art
During the development of a computer program, it often is desirable to test the computer program by providing test data, and then executing the computer program to generate test results. A user then may evaluate the test results and modify the computer program based upon the results. During this process, the user may desire to run at least a portion of the computer program several times to assist, for example, in isolating the suspected cause of an error. When, however, a computer program is very large or has a very long execution time, it may be impractical to replay the entire program for evaluation and debugging. Therefore, a technique called incremental replay was developed which allows a user to select and replay only a portion of the computer program's execution.
To provide effective support for debugging and testing computer programs, a replay apparatus must incur low setup and replay times, must interfere with and slow down the program's execution as little as possible, and must not require large amounts of data to be stored. Setup time is the time required to prepare for replay after an original program has been executed at least once. Replay time is the time required to actually re-execute the instructions associated with the desired portion of the computer program. Some approaches minimize setup time while other approaches minimize replay time. During the setup time, variables or memory locations typically are set to values that are accurate for the portion of the computer program that is to be replayed. Typically, this practice includes providing, during or prior to replay, the same values to the memory locations that were present during the initial execution of the program.
Interference with the original (first executed) program often is caused by inserting instrumentation instructions into the original program to facilitate replay. If there are too many of such instrumentation instructions or if such instructions are too intrusive, then the original program may be slowed down too much during the initial execution or during the replay.
Because replay generally involves some storage during an initial execution of a program, as mentioned above, another consideration is the amount of data to be stored. Although enough data should be stored to accurately replay the desired portions of the original program execution, too much stored data will require excess storage resources as well as unnecessary overhead functions to manage the stored data.
One approach to incremental replay is described by Stuart I. Feldman and Channing B. Brown in "IGOR: A System for Program Debugging via Reversible Execution," Proceedings of the SIGPLAN/SIGOPS Workshop on Parallel and Distributed Debugging, Madison, Wis. (May 1988). The IGOR system described uses a virtual memory system of a computer to periodically trace, at fixed time intervals, the pages of the virtual memory system that were modified by a computer program since a previous checkpoint of the computer program. The term "trace," as used herein, refers to storing data in a memory so that such data is available for later use such as resetting variables and memory locations. To restart the program's execution from an intermediate point requires that a replay tool search the stored trace to find the most recent trace of each page. Because checkpoints are taken at fixed time intervals, the IGOR system bounds replay time, i.e., the amount of time required to replay up to a desired portion of the original program execution. However, setting up the state for the replay may require searching through an entire trace file, which may involve significant time and resources. Although this approach is adaptive in that it traces only pages that have been recently written to, tracing the entire contents of pages that have been written to since the last checkpoint can require large amounts of storage.
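The setup cost of such page-level checkpointing can be illustrated with a minimal Python sketch. It is not the IGOR implementation; the names `restore_pages`, `trace`, and `target` are hypothetical, and the sketch assumes that the first checkpoint holds every page in use and that each later checkpoint holds only the pages written since the previous one.

```python
from typing import Dict, List

PageId = int
Checkpoint = Dict[PageId, bytes]  # assumed layout: page number -> full page contents


def restore_pages(trace: List[Checkpoint], target: int) -> Dict[PageId, bytes]:
    """Rebuild memory as of checkpoint `target`.

    Walks the trace backward from `target` and keeps, for each page,
    the first (that is, most recent) copy encountered.
    """
    state: Dict[PageId, bytes] = {}
    for checkpoint in reversed(trace[: target + 1]):
        for page, contents in checkpoint.items():
            # setdefault keeps the newer copy already found, if any
            state.setdefault(page, contents)
    return state
```

In the worst case the backward walk reaches the first checkpoint, for example when some page has not been written since the program started, which is why setup may require searching the entire trace file.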
Another approach to tracing is described by Barton P. Miller and Jong-Deok Choi in "A Mechanism for Efficient Debugging of Parallel Programs," SIGPLAN Conference on Programming Language Design and Implementation, Atlanta, Ga. (June 1988). With such a Parallel Program Debugger (PPD) as described in this and other publications, an analysis is performed at compile-time of a program to determine what and when to trace. In particular, the PPD writes a prelog on the entry of each procedure, with the prelog containing the values of the variables that the procedure might possibly read. The prelog allows a procedure to be replayed alone since the prelog contains all variables necessary for the instructions of the procedure to properly execute. A postlog is written upon exit from the procedure, with the postlog containing the values of the variables that the procedure might have modified. The postlog allows the procedure to be skipped during replay since the postlog includes changes that the procedure might make.
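The roles of the prelog and the postlog can be sketched in Python as follows. This is an illustration only, not the PPD implementation: program state is modeled as a dictionary, the helper names are hypothetical, and the `may_read` and `may_write` sets, which PPD derives by compile-time analysis, are simply supplied by hand.

```python
def run_traced(proc, state, may_read, may_write, log):
    """Execute proc(state), writing a prelog on entry and a postlog on exit."""
    prelog = {name: state[name] for name in may_read}
    proc(state)
    postlog = {name: state[name] for name in may_write}
    log.append((prelog, postlog))


def replay_alone(proc, prelog):
    """Re-execute one procedure in isolation, starting from its prelog."""
    local_state = dict(prelog)
    proc(local_state)
    return local_state


def skip_procedure(state, postlog):
    """Skip a procedure during replay by applying its recorded effects."""
    state.update(postlog)


def double_x(state):
    # Hypothetical example procedure: reads x, writes y.
    state["y"] = state["x"] * 2
```

Because the sets are conservative, a prelog or postlog may contain variables that a particular call never touches, which is one source of excess trace data.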
A system such as PPD may result in storing much more data than is actually required to facilitate replay, because the analysis performed at compile-time must be conservative to assure that any replay will be accurate. Additionally, tracing only at procedure entry and exit may incur a large amount of intrusion during the initial execution of the program, and does not guarantee that replay up to a desired portion of the execution will be attainable in a predetermined amount of time. For example, a loop that is iterated several times and includes a procedure call may result in many needless traces. Conversely, a very long procedure call may not be traced often enough to replay any part of the procedure within time constraints that are acceptable to a user.
Another system, referred to as Spyder, is described by Hiralal Agrawal, Richard A. DeMillo, and Eugene H. Spafford in "An Execution-Backtracking Approach to Debugging," IEEE Software, pp. 21-26 (May 1991). The Spyder system traces a "change set" before each statement or group of statements. The change set includes the values of the variables which might be modified by the statement or group of statements. A debugger then can back up execution over a statement by restoring the state from the associated change set. To bound the trace size, it is possible to store only the most recent change set for each statement. The Spyder system statically computes the change sets, and for programs that use pointers and arrays, the system must trace each such access. The Spyder system does not bound the time required to perform a replay, since it requires backing up to an instruction that is prior to the desired interval, and then progressing forward to perform the desired interval. A system such as Spyder also is limited by its static nature, in that it may have to trace, for example, every array or pointer reference.
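Backing up over a statement by means of a change set can be sketched in Python as shown below. This is a simplified illustration, not the Spyder implementation: the class and method names are hypothetical, statements are modeled as functions that mutate a dictionary, the `may_modify` list stands in for the statically computed change set, every listed variable is assumed to exist already, and every change set is kept rather than only the most recent one per statement.

```python
class BacktrackingInterpreter:
    """Toy interpreter that saves a change set before each statement."""

    def __init__(self, state):
        self.state = state
        self.history = []  # stack of change sets, newest last

    def step(self, statement, may_modify):
        # Save the old values of every variable the statement might modify.
        change_set = {name: self.state[name] for name in may_modify}
        self.history.append(change_set)
        statement(self.state)

    def back_up(self):
        # Undo the most recent statement by restoring its change set.
        self.state.update(self.history.pop())


def add_b_to_a(state):
    # Hypothetical example statement: a = a + b
    state["a"] = state["a"] + state["b"]
```

Reaching an arbitrary earlier point requires calling `back_up` once per intervening statement and then executing forward again, so the time to reach a desired interval is not bounded.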
A technique referred to as Demonic Memory is described by Paul Wilson and Thomas G. Moher in "Demonic Memory for Process Histories," Proceedings of the SIGPLAN '89 PLDI Conference, pp. 330-343 (June 1989). Such a technique maintains a hierarchy of checkpoints, each checkpoint taken at successively larger time granularities, so that recent states can be reproduced relatively quickly while older states incur more delay. The Demonic Memory technique also includes a virtual snapshot as an approach to checkpointing, in which only the elements of a checkpoint that differ from a previous checkpoint are saved.
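The idea of a virtual snapshot, in which only the elements that differ from the previous checkpoint are saved, can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the Demonic Memory implementation: state is modeled as a dictionary, the function names are hypothetical, deleted elements are not handled, and the hierarchy of checkpoint granularities is not modeled.

```python
_MISSING = object()  # sentinel for "element absent from the previous checkpoint"


def virtual_snapshot(previous, current):
    """Return only the elements of `current` that differ from `previous`."""
    return {
        key: value
        for key, value in current.items()
        if previous.get(key, _MISSING) != value
    }


def reconstruct(base, deltas):
    """Rebuild a later state by applying virtual snapshots to a full checkpoint."""
    state = dict(base)
    for delta in deltas:
        state.update(delta)
    return state
```

Each snapshot is small when few elements change, but reproducing a state requires a full checkpoint plus every intervening snapshot, which matches the trade-off between storage and delay described above.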
Previous techniques for checkpointing and replay are limited by their static nature, in that decisions are made at compile-time regarding what to trace, when to trace, or both. As a result, such techniques often trace more than is necessary, and may also incur significant delays in replaying a desired portion of the computer program. It would be desirable to provide an approach that adaptively determines what information to trace and when to trace such information, and which minimizes interference with the computer program. It also would be desirable to allow a user to determine the amount of delay time to be incurred prior to a replay.