The computing community has developed tools and methods to analyze the run-time behavior of a computer program. Many of the tools and methods use statistical sampling and binary instrumentation techniques. Statistical sampling is performed by recording periodic snapshots of the program's state, e.g., the program's instruction pointer. Sampling imposes a low overhead on a program's run-time performance and is relatively non-intrusive, but it is imprecise. For example, it may be difficult to associate a sampled instruction pointer with the particular instruction that caused the latest sampling event.
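As a rough illustration of statistical sampling, the following Python sketch periodically records where a running thread currently is. It is an analogy only: it samples Python stack frames rather than a hardware instruction pointer, it relies on the CPython-specific sys._current_frames() call, and the names sample_thread, busy, and profile are hypothetical helpers introduced here.

```python
import collections
import sys
import threading
import time


def sample_thread(target_ident, interval, samples, stop):
    """Periodically record (function name, line number) of the target thread."""
    while not stop.is_set():
        frame = sys._current_frames().get(target_ident)
        if frame is not None:
            samples.append((frame.f_code.co_name, frame.f_lineno))
        time.sleep(interval)


def busy(duration):
    """Workload to be profiled: spin for roughly `duration` seconds."""
    deadline = time.monotonic() + duration
    total = 0
    while time.monotonic() < deadline:
        total += 1
    return total


def profile(workload, *args, interval=0.005):
    """Run `workload` in the calling thread while a sampler thread observes it."""
    samples = []
    stop = threading.Event()
    sampler = threading.Thread(
        target=sample_thread,
        args=(threading.get_ident(), interval, samples, stop),
        daemon=True,
    )
    sampler.start()
    try:
        workload(*args)
    finally:
        stop.set()
        sampler.join()
    # Histogram of the functions in which the samples landed.
    return collections.Counter(name for name, _ in samples)


if __name__ == "__main__":
    print(profile(busy, 0.2).most_common(3))
```

The workload itself is unmodified, which is why the approach is cheap and non-intrusive. Each sample only shows where the thread happened to be when the sampler woke up, which is the source of the imprecision noted above.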
While binary instrumentation leads to more precise results, the accuracy comes at some cost to the run-time performance of the instrumented program. Traditional binary instrumentation is static. It involves rewriting the whole program before any run to insert data-gathering code. Because the binary code of a program is modified, all interactions with the processor and the operating system can change significantly. Consequently, binary instrumentation is considered intrusive.
Dynamic binary instrumentation allows program instructions to be changed on-the-fly and leads to a whole class of more precise run-time monitoring results. Unlike static binary instrumentation techniques that are applied over an entire program prior to execution of the program, dynamic binary instrumentation is performed at run-time of a program and only instruments those portions of an executable that are executed. Consequently, dynamic binary instrumentation techniques can significantly reduce the overhead imposed by the instrumentation process.
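The idea of instrumenting only what actually executes can be sketched, very loosely, in Python. This is not binary rewriting: the hypothetical LazyInstrumenter class below simply wraps a function with a call counter the first time the function is invoked, so functions that never run are never touched.

```python
import functools


class LazyInstrumenter:
    """Wraps a function with a call counter the first time it is executed."""

    def __init__(self, functions):
        self.original = dict(functions)  # name -> uninstrumented callable
        self.instrumented = {}           # name -> wrapped callable
        self.calls = {}                  # name -> number of calls observed

    def _instrument(self, name):
        func = self.original[name]
        self.calls[name] = 0

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            self.calls[name] += 1        # data-gathering code
            return func(*args, **kwargs)

        self.instrumented[name] = wrapper
        return wrapper

    def call(self, name, *args, **kwargs):
        """Execute `name`, instrumenting it on first use only."""
        wrapper = self.instrumented.get(name)
        if wrapper is None:
            wrapper = self._instrument(name)
        return wrapper(*args, **kwargs)


if __name__ == "__main__":
    tool = LazyInstrumenter({"square": lambda x: x * x, "cube": lambda x: x ** 3})
    tool.call("square", 3)
    tool.call("square", 4)
    print(sorted(tool.instrumented), tool.calls)
```

In this sketch the cube function is registered but never called, so no wrapper is ever built for it. A static scheme would have wrapped both functions before the first call.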
Software development tools can combine statistical sampling and dynamic binary instrumentation methods into a framework that enables performance analysis, profiling, coverage analysis, correctness checking, and testing of a program.
A basic reason for the difficulty in testing the correctness and performance of a program is that program behavior largely depends on the data on which the program operates and, in the case of interactive programs, on the information (data and commands) received from a user. Therefore, even if exhaustive testing is impossible, as is often the case, program testing and performance analysis are preferably conducted by causing the program to operate with some data. The act of executing a program entails the creation of one or multiple “processes.”
A “process” is commonly defined as an address space, one or multiple control threads operating within the address space, and the set of system resources needed for operating with the threads. Therefore, a “process” is a logical entity consisting of the program itself, the data on which it must operate, the memory resources, and input/output resources. Executing a given program may lead to multiple processes being created. Program verification and performance testing encompass execution of the program to test whether the process or processes develop in the correct way or whether undesired or unexpected events occur.
Generally, software development tools use two basic techniques to monitor the execution of a given process: in-process monitoring and out-of-process monitoring. In-process monitoring involves modifying a program to be tested so that select program instructions are preceded and followed by overhead instructions that extract variable information, control execution of the instruction, and monitor program execution. With out-of-process monitoring, a monitoring program executes in a different process and interacts with the monitored processes.
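The two techniques can be contrasted with a small Python sketch. Both helpers are hypothetical and operate at a much coarser granularity than a real tool: monitored surrounds a whole function call (rather than individual instructions) with bookkeeping code that runs inside the same process, while run_out_of_process observes a target that executes in a separate process.

```python
import functools
import subprocess
import sys


def monitored(log):
    """In-process monitoring: run bookkeeping before and after each call."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            log.append(("enter", func.__name__, args))   # overhead code before
            result = func(*args, **kwargs)
            log.append(("exit", func.__name__, result))  # overhead code after
            return result
        return wrapper
    return decorate


def run_out_of_process(source):
    """Out-of-process monitoring: observe a target running in another process."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        check=False,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    events = []
    traced_max = monitored(events)(max)
    traced_max(1, 2)
    print(events)
    print(run_out_of_process("import sys; print('hi'); sys.exit(3)"))
```

In the first case the monitoring code shares the address space and timing of the program under test. In the second case the monitor only sees what crosses the process boundary, here the exit status and the captured output.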
Symbolic debuggers constitute an example of out-of-process monitoring. Symbolic debuggers are interactive programs which allow a software engineer to monitor the execution of a compiled program. The user can follow the execution of a compiled program in a familiar high-level programming language while the program to be tested executes. Symbolic debuggers modify an executable copy of the source by selectively inserting conditional branches to other routines, instruction sequences, and break points. The compiled and instrumented program can then be run under the control of a managing program or a software engineer via a human-machine interface.
Symbolic debuggers also enable the insertion of instruction sequences for recording variables used in execution of the instruction and, on user request, can add and remove break points, modify variables, and permit modification of the hardware environment. These techniques are particularly effective in that they permit step-by-step control of the execution of a program, that is, they allow the evolution of the related process to be controlled by halting and restarting the process at will and by changing parameters during the course of execution of the process. The tools also can display the execution status of the process to the software engineer in detail by means of display windows or other output devices that enable the user to continuously monitor the program. Some tools automate the process of setting break points in the executable version of the source code.
Symbolic debuggers have several limitations. First, they operate on only a single process at a time. Second, the process to be tested must be activated by the parent process (the symbolic analysis process) and cannot be activated earlier. Consequently, debugging programs that are activated at system start-up, such as monitors and daemons, is problematic.
Furthermore, because the process to be tested is generated as a child of the symbolic analysis parent, and in a certain sense is the result of a combination of the symbolic analysis function/program with the program to be tested, the two processes must share or utilize the same resources. As a consequence, interactive programs that use masks and windows on a display device cannot be tested because they compete or interfere with the symbolic debugger in requiring access to the display device.
Moreover, software development tools that use symbolic debuggers can encounter deadlock conditions that result from the standard execution of operating system level instructions.
One operating system that has gained widespread acceptance is the UNIX® operating system. UNIX® is a trademark of the American Telephone and Telegraph Company of New York, N.Y., U.S.A. The UNIX® operating system is a multi-user, time-sharing operating system with a tree-structured file system. Other noteworthy functional features are its logical I/O capabilities, pipes, and forks. The logical I/O capabilities allow a user to specify the input and output files of a program at runtime rather than at compile time, thus providing greater flexibility. Piping is a feature that enables buffering of input and output data to and from other processes. Forking is a feature that enables the creation of a new process.
By themselves, these features offer no inherent benefits. However, the UNIX® operating system command environment (called the SHELL) provides easy access to these operating system capabilities and also allows them to be used in different combinations. With the proper selection and ordering of system commands, logical I/O, pipes, and forks, a user at the command level can accomplish tasks that on other operating systems would require writing and generating an entirely new program. This ability to easily create application program equivalents from command level is one of the unique and primary benefits of the UNIX® operating system.
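The pipe and fork features described above are also reachable directly through system calls. The following Python sketch, which assumes a UNIX-like system, forks a child process that sends a message to its parent through a pipe. The function name child_to_parent is a hypothetical helper; os.pipe, os.fork, os.read, os.write, and os.waitpid are standard library wrappers around the corresponding system calls.

```python
import os


def child_to_parent(message):
    """Fork a child that sends `message` (bytes) to the parent through a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: write to the pipe, then leave without running parent cleanup.
        status = 1
        try:
            os.close(read_fd)
            os.write(write_fd, message)
            os.close(write_fd)
            status = 0
        finally:
            os._exit(status)
    # Parent: close the unused write end so that read() can see end-of-file.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, wait_status = os.waitpid(pid, 0)
    return b"".join(chunks), os.WEXITSTATUS(wait_status)


if __name__ == "__main__":
    print(child_to_parent(b"hello from child"))
```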
The popularity of the UNIX® operating system has led to the creation of numerous open source and proprietary variations such as LINUX®, HP-UX®, etc. LINUX® is a trademark of William R. Della-Croce, Jr. (individual) of Boston, Mass., U.S.A. HP-UX® is a trademark of the Hewlett-Packard Company, of Palo Alto, Calif., U.S.A. Variants of the UNIX® operating system inherently use the UNIX® operating system's logical I/O capabilities, pipes, and forks.
Software development tools can encounter a deadlock condition when a process under test includes a “vfork” system call. The operation of a “vfork” system call in the UNIX® operating system involves spawning a new process (the child process) that initially runs with the process image of the parent (the process making the vfork call).
Monitoring facilities enable a process or thread to control the execution of threads running in another process. Generally, monitoring facilities control other threads by reading and modifying the state of the process. “Thread trace,” also known as “ttrace,” is a tracing facility for single and multithreaded processes. Ttrace is an evolution of “process trace,” also known as “ptrace.” A monitoring facility typically allows the monitoring program to declare an interest in the occurrence of particular events associated with any thread or with a specific subset of threads. For example, the monitoring process may want to be informed when a thread receives a signal, invokes a system call, or executes a breakpoint. While under the control of a monitoring facility, the monitored code behaves normally until one of those events occurs. At this point, the thread or process enters a stopped or suspended state and the tracing process is notified of the event. In the ttrace facility, the monitoring process receives such notifications by invoking the system call ttrace_wait.
Ttrace_wait can be called in blocking or non-blocking modes. When called in blocking mode, the system call will not return until an event is available. In non-blocking mode the system call will return promptly but may indicate that no event notification is available. When event notification is available, ttrace_wait will provide an indication of the event type encountered.
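The difference between the two calling modes can be illustrated by analogy with the standard waitpid system call, which Python exposes as os.waitpid. This sketch does not use ttrace_wait, which is specific to HP-UX; poll_then_block is a hypothetical helper and a UNIX-like system is assumed.

```python
import os


def poll_then_block(exit_code=7):
    """Show a non-blocking wait followed by a blocking wait on one child."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: stay alive until the parent writes one byte, then exit.
        try:
            os.close(write_fd)
            os.read(read_fd, 1)
        finally:
            os._exit(exit_code)
    os.close(read_fd)
    # Non-blocking mode: returns promptly; a pid of 0 means "no event yet".
    polled_pid, _ = os.waitpid(pid, os.WNOHANG)
    # Release the child so that an event (its exit) becomes available.
    os.write(write_fd, b"x")
    os.close(write_fd)
    # Blocking mode: does not return until the child has changed state.
    waited_pid, status = os.waitpid(pid, 0)
    return polled_pid, waited_pid == pid, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(poll_then_block())
```

The first call returns while the child is still running and reports that nothing is available. The second call returns only once the child has exited, and its status value identifies what happened.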
At a given point in time, a monitoring program may monitor multiple processes, each process including one or multiple threads. Monitoring facilities such as ttrace typically provide a way to separate event notifications from the various processes, or even from the various threads in a given process. In the case of ttrace_wait, the system call will return events from any monitored process, from any thread in a specified process, or from a specified thread in a specified process. Using the mode where a single ttrace_wait call will provide information about any process monitored can introduce a bottleneck or make run-time analysis too complex.
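The three selection modes can be modeled with a small Python sketch. This is a hypothetical model of event filtering, not the ttrace interface: TraceEvent and wait_for_event are names introduced here, and None is used as an assumed wildcard meaning "any process" or "any thread."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    pid: int    # process that generated the event
    tid: int    # thread within that process
    kind: str   # e.g. "signal", "syscall", "breakpoint"


def wait_for_event(pending, pid=None, tid=None):
    """Remove and return the first pending event matching the filter.

    pid=None matches any process; tid=None matches any thread.
    Returns None when no matching event is pending (non-blocking style).
    """
    for index, event in enumerate(pending):
        if pid is not None and event.pid != pid:
            continue
        if tid is not None and event.tid != tid:
            continue
        return pending.pop(index)
    return None


if __name__ == "__main__":
    queue = [
        TraceEvent(200, 1, "vfork"),
        TraceEvent(100, 1, "signal"),
        TraceEvent(100, 2, "syscall"),
    ]
    print(wait_for_event(queue, pid=100, tid=2))  # specific thread
    print(wait_for_event(queue, pid=100))         # any thread in process 100
    print(wait_for_event(queue, pid=100))         # nothing left for process 100
    print(queue)                                  # event from process 200 remains
```

In this model a monitor that only ever asks for events from process 100 never sees the event queued by process 200, which stays pending until some caller asks for it.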
A vfork system call differs from a fork system call in that the child process created via the vfork system call shares the same address space as the parent until exec or _exit is called, while a child process created via the fork system call gets a copy of the parent address space. In some operating systems, the vfork system call is implemented by having the parent process blocked until the child process calls exec or _exit. This allows for a simple implementation of vfork, as the child process can be created cheaply as it directly uses most of the structures associated with the parent process.
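The fork half of this comparison can be demonstrated in Python on a UNIX-like system. Python's os module provides fork but no vfork wrapper, so the shared-address-space behavior of vfork is not shown. The helper name fork_copies_memory is hypothetical.

```python
import os


def fork_copies_memory():
    """Return (value seen by parent, value seen by child) after a fork."""
    counter = [1]
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: modify its own copy of `counter` and report the new value.
        try:
            os.close(read_fd)
            counter[0] += 1
            os.write(write_fd, str(counter[0]).encode())
        finally:
            os._exit(0)
    # Parent: read the child's value, then inspect its own copy.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return counter[0], child_value


if __name__ == "__main__":
    print(fork_copies_memory())
```

The child's increment is not visible to the parent, because after fork each process works on its own copy of the address space. Under vfork, by contrast, the child runs in the parent's address space until it calls exec or _exit.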
As mentioned above, such an implementation of the vfork system call can present a deadlock condition for debuggers that use known tracing facilities. If the monitoring process is using monitoring facilities that in turn are using per-process notifications, the monitoring process will not be aware of the newly created child process until the monitoring process is notified that the child process exists. In some existing designs, however, notification is not delivered until the parent process is unblocked, that is, until the child process calls exec or _exit. Furthermore, the child process itself can be blocked as it generates monitoring events. This results in a deadlock, as both the traced parent and child process are blocked while the tracing process waits for an indication of an event that cannot be delivered until the child process is unblocked.
FIG. 1 illustrates the deadlock condition. The process descriptions or blocks in the flow chart presented in FIG. 1 should be understood to represent a simplified overview of an instrumentation process. Those reasonably skilled in the art will understand that an accurate depiction of instrumenting a parent process that executes a vfork to spawn a child process would require a detailed unified modeling language (UML) sequence diagram to reflect the actual interactions in the process.
Deadlock condition 10 occurs between development tool 20, parent process 30, and child process 40 as described below. Development tool 20 identifies a process (e.g., parent process 30) that the development tool 20 would like to monitor as indicated in block 22. Next, development tool 20 spawns parent process 30 under trace control as shown in block 24. A process identifier (PID) is assigned to the parent process 30 by the operating system when the parent process is created.
Thereafter, in block 26, development tool 20 monitors execution of the parent process by monitoring trace events from parent process 30. Under the UNIX® operating system and its open source and proprietary variants, development tool 20 waits for trace events that include the PID of the parent process. Development tool 20 cannot monitor child process 40, since child process 40 has not been created.
Once parent process 30 is created and started, parent process 30 is instrumented and runs nominally in accordance with its instructions as shown in block 32. The parent process 30 runs until it encounters a vfork system call as shown in block 34. Next, parent process 30 spawns child process 40 (child “A”) under trace control as shown in block 36. In accordance with the vfork system call, the operating system copies the current state of the parent process 30 to spawn child process 40 and generates a trace event which is received by development tool 20. Thereafter, as shown in block 38, parent process 30 is essentially suspended waiting for an indication that child process 40 has completed (e.g., indicia of an exec or _exit).
Once child process 40 is created by the vfork system call in parent process 30, and started, child process 40 is instrumented and runs nominally in accordance with its instructions as shown in block 42. Child process 40 continues to run until it encounters the vfork system call shown in block 44. Thereafter, as shown in block 46, child process 40 spawns a subsequent child process (child “B”). At this time, the operating system assigns a PID, different from the parent PID and the child “A” PID, to identify the subsequent child process. In accordance with the vfork system call, child process 40 copies itself in its instrumented state to spawn the subsequent child process (not shown) and generates a trace event which is ignored by development tool 20 because development tool 20 is only looking for trace events from parent process 30.
Once the vfork system call is encountered and processed in child process 40, the deadlock condition has occurred. Parent process 30 is suspended waiting for an indication that child process 40 has completed. Child process 40, which inherited trace control from parent process 30, waits for a process to handle the trace event generated at the time it executed the vfork system call. Concurrently, development tool 20 waits for a trace event from parent process 30.
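The circular wait just described can be written down as a small wait-for graph and checked mechanically. The Python sketch below is a hypothetical model: the node labels follow the reference numerals of FIG. 1, and find_wait_cycle is a helper name introduced here, under the simplifying assumption that each participant waits on at most one other participant.

```python
def find_wait_cycle(waits_for, start):
    """Follow "X waits for Y" links from `start`; return the cycle or None."""
    path = []
    seen = {}  # node -> position of that node in `path`
    node = start
    while node is not None and node not in seen:
        seen[node] = len(path)
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None            # the chain ended, so nobody waits forever
    return path[seen[node]:]   # the participants that wait on each other


# Who waits for whom once child process 40 has executed its vfork call.
FIG1_WAITS = {
    "development tool 20": "parent process 30",  # waits for a parent trace event
    "parent process 30": "child process 40",     # waits for exec or _exit
    "child process 40": "development tool 20",   # waits for its event to be handled
}


if __name__ == "__main__":
    print(find_wait_cycle(FIG1_WAITS, "development tool 20"))
```

Every participant in the returned cycle is waiting on another member of the same cycle, so none of them can make progress without outside intervention.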
Consequently, it is desirable to have an improved apparatus, program, and method to avoid deadlock induced by a vfork system call when using monitoring application programming interfaces (APIs) to track the execution of computer programs.