The present invention relates to a system for debugging computer programs, and more particularly, to a system that provides concurrent operation of several computer program debuggers within one computer.
The design and use of computer hardware and programs are well known in the art, and need not be described here in any great detail. The following overview is presented merely to provide a context within which the present invention may be understood.
A computer program, also referred to as software, is a set of instructions that directs the functioning of various computer hardware resources in order to accomplish a particular task. In order to run a computer program, that program is typically loaded into the computer's main memory, where each instruction within the program is stored at a unique location, specified by an address. The address locations occupied by the program are referred to as the instruction space of the program. During program execution, the computer's control unit fetches and executes instructions in sequence. Fetching begins at a predefined start location within the program, and continues in sequence unless some type of branch instruction is encountered, or some other event, such as an interrupt, occurs. Branch instructions and interrupts cause the control unit to begin fetching instructions from a new location that is not the next sequential instruction address within the instruction space. Program execution then proceeds in sequence beginning at this new memory location, until another branch or interrupt is encountered.
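The fetch sequence described above can be illustrated with a small sketch. This is only a toy model, not any real processor: the three instruction names (ADD, JUMP, HALT) and the function name run are hypothetical and chosen purely for illustration. It shows fetching that starts at a predefined location, proceeds in sequence, and is redirected by a branch.

```python
# Toy model of sequential instruction fetch with a branch.
# The instruction set (ADD, JUMP, HALT) is hypothetical, for illustration only.

def run(program, start=0, max_steps=100):
    """Execute a toy program; return (addresses fetched, accumulator value)."""
    pc = start      # address of the next instruction to fetch
    acc = 0         # a single accumulator register
    trace = []      # every address the control unit fetched, in order
    for _ in range(max_steps):
        op, arg = program[pc]
        trace.append(pc)
        if op == "HALT":
            break
        if op == "ADD":
            acc += arg
            pc += 1         # no branch: continue with the next sequential address
        elif op == "JUMP":
            pc = arg        # branch: fetching resumes at a new location
        else:
            raise ValueError(f"unknown instruction {op!r} at address {pc}")
    return trace, acc


program = [
    ("ADD", 1),      # address 0
    ("JUMP", 3),     # address 1: branch past address 2
    ("ADD", 100),    # address 2: never fetched
    ("ADD", 1),      # address 3
    ("HALT", None),  # address 4
]
trace, acc = run(program)
```

In this sketch the list index plays the role of the instruction address, so the instruction space of the toy program is addresses 0 through 4.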
Although each computer instruction is really a set of electrical signals, each of which typically assumes one of two values, those who create, or write, computer programs usually use symbols that represent the various possible combinations of electrical signals. At the lowest level, these symbols may simply be a string of ones and zeroes, representing on a one for one basis each of the electrical signals that make up an instruction. More often, however, these symbols comprise alphanumeric characters which are arranged to form mnemonics in a programming language, each mnemonic representing an instruction or part of an instruction.
Programming languages, themselves, come in a variety of styles ranging from low level to high level. The lowest level languages are characterized by instruction mnemonics which are, for the most part, in a one to one correspondence with the set of machine level instructions (i.e., the electrical signal combinations that the computer controller hardware recognizes as instructions). These languages, of which assembly languages are an example, are cumbersome to work with, and require an intimate knowledge of the physical architecture and operation of the computer hardware resources. They provide the advantage, however, of permitting low level control of the computer resources, which in turn can result in programs of minimal size that run very fast.
At the other extreme, high level programming languages allow the programmer to direct the computer operation by means of constructs that mimic English-like phrases, such as "if . . . then . . . else . . .". While such high level languages are much easier for a human to write and understand, they usually cannot be directly understood by the computer. Instead, programs written in such languages must be converted, or "compiled" into a low level form that may be loaded and executed by the computer hardware. A drawback of this is that the compiler may generate low level code that is not as efficient as the set of low level instructions that the programmer would have produced, given the same task.
In addition to varying levels of programming languages, another aspect of creating a computer program relates to the fact that certain tasks, such as writing data to a video display terminal, reading a character from a keyboard, and accessing disk storage, are found in most programs. Since these tasks can rarely be accomplished with the use of a single machine level instruction, but rather each require a small program, it is inefficient to force each programmer to create these sub-programs within his own program. Consequently, such commonly used routines are typically organized within a single operating system which is executed on the computer hardware. The operating system can serve as an intermediating layer between the actual computer hardware, and the user's application program. Whenever the programmer needs to include a system function within the application program, a program instruction requesting that service, such as by means of a subroutine call, is encoded within the program. By the time the program is ready to be executed, the system components have been linked to the application program, so that any one program may be made up of portions that a particular programmer provides, and portions that are provided by others including the manufacturer of the operating system. During execution, the system subroutine call performs the requested system function, and then returns control to the application program.
One thing that all programs have in common is the need to ensure that they actually perform the task that they are designed to perform. The act of making this determination is generally referred to as testing the software, and the act of identifying the cause of a known problem, or "bug", in a program is called "debugging" the software. To facilitate this process, computer programs, called "debuggers" have been created. A debugger supplies a program control interface to the programmer that allows one to do such things as executing only one program instruction at a time (referred to as "single stepping" the program), determining what the next instruction to be executed is, examining and/or modifying computer register and memory locations, and setting breakpoints at particular locations within the program, whereby computer program execution will continue unimpeded until the breakpoint is the next location in the program that is to be executed by the computer. These features, and others, greatly assist the programmer in determining whether the sequence of program instruction execution is as expected, and whether the correct data is being moved from one computer resource to another. This view into the actual operation of the program allows the programmer to identify where an error has been made in the program design.
Because, as explained above, computer programs may be written in any of a variety of programming languages, and may additionally rely in varying degrees on operating system features, no one debugger will be the ideal debugging tool for all programs. Instead, different debugging tools will be most useful at correspondingly different points in the development cycle of a program. For example, during the first few weeks of an operating system bring-up effort, the most helpful debugger may be a low level assembly language debugger that makes few assumptions about which operating system features are available. This may be seen by considering an example, such as debugging a memory management feature in an operating system. A memory manager manipulates certain hardware features so that an executing program sees its memory mapped in a way that makes programming easier by, for example, making a "virtual" or "effective" address space appear to be a monotonically increasing set of addresses, when the physical addresses are not. It is apparent, in this instance, that a debugger must not itself, in an attempt to access a particular memory location in response to a user command, be permitted to use the memory manager that it is trying to debug. Instead, such a low level debugger would essentially include only core features that are common to all debuggers, such as:
1) setting execution breakpoints;
2) examining and altering memory; and
3) examining and altering a saved processor state (i.e., the processor's general and special registers, any memory mappings, etc.).
Later, when the system is more mature, the most helpful debugger might be one that understands and uses operating system tables during a debugging session. Furthermore, in the lab, an engineer might find it most helpful to use a debugger having a windowed environment that permits the browsing of, and single stepping through, high level language source code. With such a debugger, the user need not be concerned with such things as the physical addresses of memory locations in which data variables are stored; instead, the debugger could allow the user to examine and modify data variables merely by specifying the symbolic name for the data variable that was used in the program.
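The symbolic access described above can be sketched as a lookup from a variable's name to its storage location. This is a minimal illustration under stated assumptions: the symbol table layout, the variable names count and flag, the little-endian byte order, and the helper names read_var and write_var are all hypothetical and are not taken from any particular debugger.

```python
# Toy model of examining and modifying a variable by its symbolic name.
# The symbol table contents and byte order are assumptions for illustration.

memory = bytearray(16)  # stand-in for the memory of the program being debugged

# Hypothetical symbol table: name -> (address, size in bytes)
symbols = {
    "count": (4, 2),
    "flag": (8, 1),
}


def read_var(name):
    """Return the current value of the named variable."""
    addr, size = symbols[name]
    return int.from_bytes(memory[addr:addr + size], "little")


def write_var(name, value):
    """Store a new value into the named variable."""
    addr, size = symbols[name]
    memory[addr:addr + size] = value.to_bytes(size, "little")


write_var("count", 513)       # the user never mentions address 4
value = read_var("count")
```

The user of such a debugger works only with the name; the translation to an address and width is done by the tool.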
Also, the operating system itself may provide another debugger that is aware of operating system tables, such as process control blocks, memory maps, loaded module areas, and the like.
Finally, after the program has completed the test phase and has been distributed, mysterious program failures (called "crashes") that occur in the field may be most easily debugged with the use of a simple low level debugger.
The above description is not to be construed as a hard and fast rule concerning what type of debugger to use at any particular time, since it is often the case that a programmer may find it desirable to be able to switch back and forth between any of a number of debuggers. Such may be the case, for example, if the programmer wants to use a first debugger to quickly single step through high level language program instructions up to a particular program location, and then switch to a low level debugger that will not only allow intimate access to the computer hardware, but also not rely on the use of operating system tables that may not be working. In addition, if one debugger selected as the preferred debugger is not available at the time of an exception condition, a second debugger may be invoked in its place. Later, when the preferred debugger becomes available, the programmer may wish to continue the debugging session by switching to the preferred debugger without losing the current machine state.
Thus, at any given moment one may wish to select for use a subset of debuggers from among a variety of debuggers, each having a particular set of capabilities. However, existing debugger technology has heretofore permitted only one debugger at a time to be operative on a single running program in the machine. As used in this discussion, the word "operative" is used to mean that any of the following is true:
1) that the debugger may be entered due to a programmed condition, such as a "hardcoded" break instruction, without any further human intervention and without any process, with which the debugger is associated, being the "active" process at the time that the programmed condition is encountered;
2) that a human operator previously utilized the debugger to establish a debugging state (e.g., active execution breaks) which, when matched by the machine state, causes the debugger to be entered and control returned to the human operator, regardless of whether or not the debugging state occurred at a time when a process associated with the debugger was active; or
3) that the debugger is currently interacting with a human operator and the program's execution.
For example, on computers that include special hardware for providing a memory protection feature, the absence of any means for multiplexing this hardware prevents more than one debugger from being active at a time. Furthermore, the inability to multiplex hardware control or register modification in a deterministic way prevents current multiple-debugger environments from debugging an operating system that would otherwise be responsible for maintaining the state of a running process. Consequently, problems arising in the field may be more difficult to debug because of the inability to switch, at the instant that a program "crash" is encountered, from a first debugger to a second debugger with different abilities without first exiting the first debugger, thereby changing the state of the machine.
Some prior art operating systems, such as the UNIX operating system, include certain debugging capabilities that allow multiple user level debuggers to appear to operate concurrently, although not for the purpose of debugging the same running program. As described, for example, in "A/UX Programmer's Reference, Sections 2 and 3(A-L)", published by Apple Computer, Inc. in 1990, a process trace ("ptrace") command is provided that gives a parent process a means for controlling the execution of a child process. This is most useful for implementing breakpoint debugging. In this instance, the parent process is a debugger program, and the child process is the program being debugged. The child process behaves normally until it encounters a predefined signal, at which time it enters a stopped state and its parent is notified. When the child is in the stopped state, its parent (i.e., the debugger program) can use ptrace to examine and modify the child's "core image". Thus, the ptrace command provides a way for a debugger program to gain access to the child's state information that is being stored in the UNIX operating system resources.
However, in this environment the user cannot use a first debugger (parent process) to debug the child process up to a certain point and then switch to a second debugger that provides a different set of debugging capabilities because there is no mechanism for the second debugger to access the first debugger's child process. Consequently, even though the UNIX operating system is a time-sharing system that allows multiple debugger programs to appear to run concurrently, the system lacks any centralized means for controlling the debugging of one process by multiple debuggers. As a result, each running process in the system is only capable of invoking a single debugger (i.e., its parent process).
A corollary of the decentralized nature of debugging control in UNIX is that portability of each of the individual debuggers is limited, because each must have intimate knowledge of the workings of the particular processor upon which the system is built in order to implement standard debugger features. For example, while the ptrace command is useful for the implementation of breakpoint debugging, the command does not, itself, provide any facility for setting or removing a breakpoint. Instead, each individual debugger running under UNIX must use multiple calls to the ptrace command in order to first retrieve a copy of a child process instruction (referred to here as an "original instruction") that presently occupies a desired breakpoint location, and then to store into that breakpoint location a signal instruction. The particular signal instruction must be selected by the person who writes the debugger program so as to cause, when the signal instruction is executed, the child process to be suspended and the parent (debugger) process invoked. When it is desired to resume execution of the child process, the debugger must further be responsible for ensuring that the original child instruction (that was replaced by the signal instruction) gets executed, and that the currently desired set of signal instructions is again in place before execution of the child process is resumed.
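The bookkeeping that each such debugger must perform can be sketched as follows. This is a toy simulation only: it does not call ptrace, the instruction space is modeled as a Python list, and the names TRAP and ToyDebugger are hypothetical. It illustrates the sequence described above: save the original instruction, store the signal instruction in its place, and on resumption execute the original instruction before putting the signal instruction back.

```python
# Toy simulation of breakpoint bookkeeping; no real ptrace calls are made.
# TRAP is a hypothetical stand-in for a processor-specific signal instruction.

TRAP = "TRAP"


class ToyDebugger:
    def __init__(self, memory):
        self.memory = memory   # the child's instruction space, modeled as a list
        self.saved = {}        # breakpoint address -> original instruction

    def set_breakpoint(self, addr):
        if addr not in self.saved:
            self.saved[addr] = self.memory[addr]  # retrieve the original instruction
            self.memory[addr] = TRAP              # store the signal instruction

    def clear_breakpoint(self, addr):
        self.memory[addr] = self.saved.pop(addr)  # put the original back for good

    def run(self, pc, executed):
        """Execute from pc until a TRAP is reached; return its address, else None."""
        while pc < len(self.memory):
            if self.memory[pc] == TRAP:
                return pc                         # child stops, debugger is entered
            executed.append(self.memory[pc])
            pc += 1
        return None

    def resume(self, pc, executed):
        """Resume from a breakpoint at pc without losing the original instruction."""
        self.memory[pc] = self.saved[pc]          # temporarily restore the original
        executed.append(self.memory[pc])          # single step the original instruction
        self.memory[pc] = TRAP                    # re-plant the signal instruction
        return self.run(pc + 1, executed)


memory = ["load", "add", "store", "halt"]
dbg = ToyDebugger(memory)
dbg.set_breakpoint(2)

executed = []
first_stop = dbg.run(0, executed)             # stops at the breakpoint
second_stop = dbg.resume(first_stop, executed)
```

Even in this simplified form, the debugger itself carries the burden of tracking every replaced instruction, which is the portability concern raised in the text.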
Any debugger that is implemented by means of the UNIX ptrace mechanism has the further drawback of being incapable of following a line of program execution into the UNIX kernel to debug problems there. This is because a process-based environment (i.e., one in which the actual running of a process is scheduled by an operating system kernel) is a necessary component of any debugging system that is built around the ptrace function. Since the UNIX kernel is not, itself, associated with a process, it follows that it cannot be a child process of a debugger.
In addition to the need and/or preference to have concurrently operative debuggers of varying levels, the need to have more than one debugger capable of responding to any given machine state condition can arise if a particular debugger cannot be accessed. This can happen in the case of a two machine debugger which requires that a second processor, which runs debugger software, be coupled to the computer that is running the software to be debugged. If the second processor is not attached, then the execution of any programmed breakpoints associated with the two machine debugger would cause unpredictable results. It would be preferable, in such a case, to have the computer default to a lower level debugger in response to the breakpoint condition, so that the state of the computer (which contains valuable information for determining why the breakpoint was executed) would not be lost.
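The desired default behavior can be sketched as a preference-ordered dispatch. This is a hypothetical illustration of the behavior described above, not a description of any existing system: the class name Debugger, the function dispatch_breakpoint, the availability flag, and the example machine state are all assumptions.

```python
# Hypothetical sketch: hand a breakpoint to the first available debugger,
# leaving the saved machine state untouched. All names are illustrative.

class Debugger:
    def __init__(self, name, available=True):
        self.name = name
        self.available = available   # e.g. False if the second processor is not attached

    def enter(self, machine_state):
        # A real debugger would return control to the operator here.
        return f"{self.name} entered at pc={machine_state['pc']:#x}"


def dispatch_breakpoint(debuggers, machine_state):
    """Try debuggers in preference order; fall back when one is unavailable."""
    for debugger in debuggers:
        if debugger.available:
            return debugger.enter(machine_state)
    raise RuntimeError("no debugger available for this breakpoint")


state = {"pc": 0x1F40, "registers": [0, 7, 42]}
snapshot = {"pc": state["pc"], "registers": list(state["registers"])}

preferred = Debugger("two-machine debugger", available=False)
fallback = Debugger("low-level debugger")
message = dispatch_breakpoint([preferred, fallback], state)
```

The point of the sketch is that the fallback receives the same machine state the preferred debugger would have seen, so the information needed to diagnose the breakpoint is preserved.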
Thus, it is desirable to provide a program debugging system that is capable of having more than one debugger operative at a time.