1. Field of the Invention
The invention concerns analysis and control of systems and more particularly concerns analysis and control of systems made up of processes executing in computers.
2. Description of the Prior Art
An important part of building large systems is debugging, that is, detecting, analyzing, and correcting errors in an implementation of the system. When a system is implemented by means of programs executing in computers, there are many tools available for debugging the programs. There are discovery programs such as cia, which help programmers understand how a program is organized, and programs called debuggers, which permit the programmer to see what happens when a program being debugged is executed. Modern debuggers permit the programmer to interactively control the execution of the program being debugged. An example of such a debugger is the GDB debugger, available from the Free Software Foundation. Debuggers have further begun to use graphical interfaces to show information such as the call history of a program, events generated by distributed-memory parallel programs, or a trace of a parallel execution of a program. One example of such graphical interfaces may be found in Adam Beguelin, et al., "Visualization and Debugging in a Heterogeneous Environment", in: IEEE Computer, June, 1993, pp. 88-95.
The discovery programs and debuggers just described are perfectly adequate for their task; however, modern systems are typically implemented not just as sets of cooperating subroutines, but rather as sets of cooperating processes. For purposes of the present discussion, a process may be defined as the entity in a computer system which actually executes a program for a user. In many systems, the cooperating processes execute on different computers. When a system is implemented as a set of cooperating processes, debugging the system involves not only understanding and debugging the individual programs executed by the processes, but also understanding and debugging the cooperation of the processes. The latter tasks cannot be performed by the program discovery tools and debuggers just described.
Present-day computer systems provide only meager resources for debugging systems made up of cooperating processes. In computer systems employing the UNIX operating system (UNIX is a registered trademark of UNIX Systems Laboratories), for example, there is a trace utility which outputs a list of the calls made by a process to the operating system. There are also an ofiles utility, which tells the user what files a given process has open, and a fuser utility, which identifies what processes are using a given file. A drawback of even these meager debugging tools is that they can only provide information about processes executing on a single processor and consequently have only limited usefulness in understanding and debugging systems where the cooperating processes execute on different processors.
It is an object of the present invention to overcome the above problems with debugging systems made up of cooperating processes by providing techniques which permit close analysis and control of such systems.