1. Field of the Invention
The invention relates to tools used in performance monitoring of computer programs and, in particular, to a controller of sequential tools used to monitor performance of computer programs that are executed in a parallel computing environment.
2. Description of the Prior Art
Computer programmers widely use software tools to monitor the performance of computer programs while a computer system executes the programs. Typically, as the computer system executes a program, such tools collect specific result data, such as specific variable values or calculation results, from specific locations in the program. One such tool is a symbolic debugger (hereinafter simply referred to as a "debugger").
Computer programmers use debuggers as tools to identify mistakes or malfunctions in a computer program. More specifically, debuggers are used to detect, locate, isolate and eliminate errors in computer programs. In essence, a debugger is a special computer program capable of controlling execution of a computer program under test (hereinafter referred to as a "user program"). While controlling execution of a particular user program, a debugger monitors the operation of the user program by collecting, as the program executes, data concerning program performance. From this data, a programmer can determine whether mistakes or malfunctions have occurred during execution of the program.
One specific type of symbolic debugger is a so-called debugger extension (dbx) (hereinafter referred to as a "dbx tool"). The dbx tool is widely used by programmers of computers using a UNIX operating system (UNIX is a registered trademark of UNIX System Laboratories, Inc. of Summit, N.J.). One such UNIX operating system is the Advanced Interactive Executive (AIX) operating system produced by International Business Machines Corporation of Armonk, N.Y. (AIX is a trademark of International Business Machines Corporation of Armonk, N.Y.). This specific operating system is illustrative of the many types of operating systems which operate in conjunction with the dbx tool.
Generally, the dbx tool is a symbolic, command-line-oriented debugging program capable of debugging programs written in the C, C++, Pascal, FORTRAN, or COBOL programming languages. The dbx tool allows a user to examine both object and core files while providing a controlled environment for executing a program. To provide this controlled environment, the dbx tool permits the user to set breakpoints at selected statements within the program. The dbx tool then executes the program up to a breakpoint and stops. Typically, at the breakpoint, the dbx tool displays to the programmer a particular result or variable value arising from execution of the program up to that point. Alternatively, the dbx tool can execute a program one line at a time and display results as each line executes. Additionally, because the dbx tool is symbolic, it collects data by variable name as the program executes and displays the variables in their correct logical format. As such, a programmer can quickly comprehend any errors within the program.
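The breakpoint behavior described above can be illustrated with a minimal sketch. It is written in Python rather than as a dbx session, and it relies on Python's standard `sys.settrace` hook, not on any dbx facility. The names `run_with_breakpoint` and `user_program` are hypothetical and chosen only for illustration. The sketch records the local variables when a chosen line is about to execute instead of pausing for interactive commands as a real debugger would.

```python
import sys


def run_with_breakpoint(func, line_offset, *args):
    """Run func(*args) and snapshot its local variables the first time
    execution reaches the line `line_offset` lines below its `def` line.

    Hypothetical helper for illustration only; this is not how dbx works
    internally.
    """
    code = func.__code__
    target_line = code.co_firstlineno + line_offset
    snapshot = None

    def tracer(frame, event, arg):
        nonlocal snapshot
        # Ignore every frame except the user program under test.
        if frame.f_code is not code:
            return None
        # A "line" event fires just before the line executes.
        if event == "line" and frame.f_lineno == target_line and snapshot is None:
            snapshot = dict(frame.f_locals)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return snapshot, result


def user_program(n):
    total = 0
    for i in range(n):
        total += i
    return total


if __name__ == "__main__":
    # "Breakpoint" on the `return total` line (4 lines below `def`).
    variables, value = run_with_breakpoint(user_program, 4, 3)
    print(variables, value)
```

The snapshot is keyed by variable name, which mirrors the symbolic display described above: the programmer sees `total` and `i` rather than raw memory addresses.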
However, the dbx tool becomes very cumbersome and, at times, impossible to utilize in a parallel environment. A parallel computing environment generally comprises a plurality of computing units connected to one another through a computer network. Each computing unit in the parallel computing environment is known as a node. When a program is executed in such an environment, the various program tasks which comprise the program are executed on various individual nodes. The tasks execute in parallel; data from those tasks are passed, along the network, from one task to another. The dbx tool is not designed to debug a program that is executed in such a parallel computing environment. In that regard, the dbx tool can only debug an individual task, i.e., a program serially executing on a single individual node. Moreover, in order for a user to be able to enter commands to the dbx tool, each individual task must be executed on a node with an attached terminal. Because each node must be debugged individually, debugging is a tedious and time-consuming process when applied to parallel programs that execute on many nodes. Additionally, some nodes in a parallel computing system, known as computing machines, may be embedded within the system, i.e., without direct terminal connections, and consequently do not allow a dbx tool to be used at all. Moreover, because the various tasks are executed in parallel and pass information among themselves, an error occurring in a task not presently being debugged may be indicated as an error in the task being debugged. As such, the source of the error may not be immediately known to the programmer. Accordingly, the programmer may waste a significant amount of time attempting to correct an error in the program task presently being debugged, when in fact the error has occurred in a task on another node.
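The misattribution problem described above can be shown with a small sketch, assuming two tasks that stand in for two nodes and a queue that stands in for the network. Threads, the queue, and the names `producer_task`, `consumer_task`, and `run_pipeline` are illustrative assumptions, not part of any system described here. The defect is in the first task, yet the failure is observed only in the second.

```python
import queue
import threading


def producer_task(out_channel):
    """Task on hypothetical node 0. It contains the real defect: the last
    value is sent as a string instead of a number."""
    for value in [1, 2, "3"]:
        out_channel.put(value)
    out_channel.put(None)  # sentinel: no more data


def consumer_task(in_channel, errors, totals):
    """Task on hypothetical node 1. Its code is correct, but it is where
    the failure becomes visible."""
    total = 0
    while True:
        item = in_channel.get()
        if item is None:
            break
        try:
            total += item
        except TypeError as exc:
            # The error is reported against node 1, not node 0.
            errors.append(("node 1", str(exc)))
    totals.append(total)


def run_pipeline():
    channel = queue.Queue()
    errors, totals = [], []
    node0 = threading.Thread(target=producer_task, args=(channel,))
    node1 = threading.Thread(target=consumer_task, args=(channel, errors, totals))
    node0.start()
    node1.start()
    node0.join()
    node1.join()
    return totals[0], errors


if __name__ == "__main__":
    total, errors = run_pipeline()
    print(total, errors)
```

A programmer attached only to the consumer would see a type error there and might spend time inspecting correct code, which is the situation the paragraph above describes.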
Thus, a need exists in the art for a controller that enables sequential tools, such as the dbx tool, to be utilized in a parallel processing environment. Additionally, the command structures used by the controller should be similar to those of the dbx tool to enable computer programmers familiar with the widely used dbx tool to quickly understand the new controller.
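One way to picture the stated need is a controller that accepts a single dbx-style command and forwards it to a sequential debugger on each node, labeling every reply with the node it came from. The sketch below is only an illustration of that general idea under stated assumptions; it does not represent the design of the invention. The classes `SequentialDebuggerStub` and `ParallelController` and the method `broadcast` are hypothetical, and each stub merely imitates a `print` command against a dictionary of variables rather than driving a real debugger.

```python
class SequentialDebuggerStub:
    """Stand-in for a sequential debugger attached to one task on one node.
    Hypothetical: it answers only a `print <name>` command."""

    def __init__(self, node_id, variables):
        self.node_id = node_id
        self.variables = dict(variables)

    def execute(self, command):
        parts = command.split()
        if len(parts) == 2 and parts[0] == "print":
            name = parts[1]
            if name in self.variables:
                return f"{name} = {self.variables[name]}"
            return f"{name} is not defined"
        return f"unrecognized command: {command}"


class ParallelController:
    """Hypothetical controller that fans one command out to many stubs."""

    def __init__(self, stubs):
        self.stubs = {stub.node_id: stub for stub in stubs}

    def broadcast(self, command, nodes=None):
        # Default to every known node, in a stable order.
        targets = sorted(self.stubs) if nodes is None else list(nodes)
        return {node: self.stubs[node].execute(command) for node in targets}


if __name__ == "__main__":
    controller = ParallelController([
        SequentialDebuggerStub(0, {"count": 10}),
        SequentialDebuggerStub(1, {"count": 12}),
    ])
    for node, reply in controller.broadcast("print count").items():
        print(f"[node {node}] {reply}")
```

Because the replies are grouped by node, a value that differs on one node stands out immediately, and the programmer types the same command that would be used in a single-node session.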