Certain computing situations arise where it would be desirable to monitor processor memory references to a particular range of addresses in memory. One example of such a situation is in the debugging of computer software. Software errors, or "bugs", in computer programs can sometimes cause a processor to incorrectly calculate a write destination address. If this occurs, the information that was located at the improperly calculated memory address is overwritten and irretrievably lost. As a partial solution to this problem, operating systems are typically set up to assign each user program to its own specified range of memory locations, referred to as the program's addressable memory space. The operating system determines the amount of memory a particular user program will need and assigns that program a certain range of memory addresses defined by the base address (the address where the program's memory starts) and the limit address (the address where the program's memory ends).
The program sees only the memory located between the base and limit addresses and does not have access to any other areas of memory. To the program, its memory is numbered from address 0 (corresponding to the base) up to the limit. An address in this numbering is referred to as a "logical" address. To get from a program's logical address to the actual location in physical memory, the base address is added to the logical address. When a processor makes a memory reference request, the address is checked to ensure that the targeted address falls within that program's addressable memory space. If it does not, the processor is denied access to that memory location and an error message is generated.
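By way of illustration only, the base and limit check described above can be sketched as follows. The names `translate` and `AccessViolation` are hypothetical, and the sketch assumes that the limit is the last physical address belonging to the program (inclusive); an actual system may define the limit differently, for example as a size.

```python
class AccessViolation(Exception):
    """Raised when a reference falls outside the addressable memory space."""


def translate(logical_addr: int, base: int, limit: int) -> int:
    """Map a logical address to a physical address using base and limit.

    Assumption: `limit` is the physical address where the program's
    memory ends, inclusive.
    """
    # The physical address is the base address plus the logical address.
    physical = base + logical_addr
    # Deny the reference if it falls outside [base, limit].
    if logical_addr < 0 or physical > limit:
        raise AccessViolation(
            f"logical address {logical_addr} is outside the addressable memory space"
        )
    return physical
```

With a hypothetical base of 1000 and limit of 1999, logical address 0 maps to physical address 1000, while logical address 1000 is rejected.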
This scheme maintains security on multiprocessing systems, in that users do not have access to the data or programs of other users. It also ensures that software bugs in one user program do not inadvertently destroy the information located in the addressable memory space of other programs.
Although limiting a program's addressable memory space prevents overwriting of data when a processor attempts to access a location outside of its addressable memory space, it does not prevent a second type of memory destruction resulting from an improperly calculated destination address. This second type of situation can arise where a processor writes over the wrong memory location which is within its addressable memory space. For example, if a particular processor is running a user task which utilizes two data arrays, A and B, both within its addressable memory space, a bug in the program could cause the processor to miscalculate the destination address and put into A the data which should go into B. As a result, both A and B would contain incorrect data.
This type of memory error would never be caught by circuitry checking for attempted access outside of the addressable memory space such as the base and limit scheme described above. As a result, by the time the error at the memory location is discovered (which could be many instructions later) it is very difficult to pinpoint where, when, and in what portion of the software the error occurred.
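The following is a minimal sketch of this second type of error, offered under stated assumptions rather than as a description of any particular system. The memory size, the base and limit values, the array offsets, and the function names `checked_write` and `store_in_b` are all hypothetical. The deliberate bug writes into array A instead of array B, and the base and limit check accepts the write because the target is still inside the addressable memory space.

```python
BASE, LIMIT = 1000, 1015          # hypothetical addressable space of 16 words
memory = [0] * 2048               # hypothetical physical memory

A_START, B_START, LENGTH = 0, 8, 8   # logical offsets of arrays A and B


def checked_write(logical_addr, value):
    """Write one word after applying the base and limit check."""
    physical = BASE + logical_addr
    if logical_addr < 0 or physical > LIMIT:
        raise MemoryError("reference outside addressable memory space")
    memory[physical] = value


def store_in_b(index, value):
    """Intended to store into B[index]."""
    # Deliberate bug: the destination is computed from A_START, not B_START.
    checked_write(A_START + index, value)


# Fill A with the values 1 through 8; B is left as zeros.
for i in range(LENGTH):
    checked_write(A_START + i, i + 1)

# The faulty store passes the bounds check and silently corrupts A.
store_in_b(3, 99)

a = memory[BASE + A_START: BASE + A_START + LENGTH]
b = memory[BASE + B_START: BASE + B_START + LENGTH]
```

After the faulty store, `a[3]` holds 99 in place of its original value and `b[3]` never receives the data, yet no error is raised.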
To isolate a software bug in single processor computers, programmers often insert "breakpoints" into the software code. For example, the program is broken into several parts by breakpoints, and the portion of code between two breakpoints in which the error occurred is identified. Several more breakpoints are then set within that identified portion, and the process is repeated until the specific faulty instruction is determined.
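The narrowing procedure described above can be sketched as a repeated halving of the suspect region. This is an illustrative sketch only: the function `first_faulty_step` and the toy program below are hypothetical, and the sketch assumes a single processor whose program behaves identically on every run, that the error is present once all steps have run, and that the corruption persists once it has occurred.

```python
def first_faulty_step(steps, make_state, is_corrupted):
    """Return the index of the first step after which the state is corrupted.

    Assumes deterministic execution and that corruption, once present,
    remains visible for the rest of the run.
    """
    lo, hi = 0, len(steps) - 1        # the fault lies somewhere in steps[lo..hi]
    while lo < hi:
        mid = (lo + hi) // 2
        state = make_state()          # restart the program from a clean state
        for step in steps[: mid + 1]: # run up to a breakpoint placed after step mid
            step(state)
        if is_corrupted(state):
            hi = mid                  # fault occurred at or before the breakpoint
        else:
            lo = mid + 1              # fault occurred after the breakpoint
    return lo


def good(state):
    state["scratch"] += 1             # harmless work


def bad(state):
    state["watched"] = -1             # deliberate bug: clobbers the watched value


steps = [good] * 5 + [bad] + [good] * 2
fault_index = first_faulty_step(
    steps,
    make_state=lambda: {"watched": 42, "scratch": 0},
    is_corrupted=lambda s: s["watched"] != 42,
)
```

In this toy program the search settles on index 5, the position of the step that overwrites the watched value.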
However, on multiprocessing computer systems where more than one processor may cooperate on a particular user task ("multitasking"), it is difficult to tell which processor executed the software at fault and therefore nearly impossible to identify the faulty portion of code. This problem is due in part to the fact that several processors are cooperating on a single task and the user has no idea which processor is executing which portion of the code or even which processors were assigned by the operating system to work on the task.
The problem is further complicated by the fact that, on successive iterations through the same portion of code, the error may not surface at the same memory location, thus resulting in a "wandering" memory error. This can occur in a multitasking situation because the differing overall system environment on successive iterations may mean that not all of the same processors will come into the program at the same time and run on the same data as on a previous iteration. One time the program may run on five processors, the next time six, the next time four, etc. Also, a particular processor may well get different pieces of data and/or program to execute on successive iterations. Moreover, the timing of the code may be different every time through due to variations in processor memory reference priorities. All of these factors, which are constantly changing in a multiprocessing environment, may mean that a different processor gets a different iteration and fails in a different way every time through the erroneous piece of code. Thus, the problem of isolating which processor issued the faulty instruction causing the erroneous memory write to occur is a monumental task in a multiprocessing computer system.
Several debugging software packages exist which are designed to aid a programmer in pinpointing the problem code in a computer program. Debugging software automates the manual breakpoint setting procedures described above. Typically the debugging software is loaded onto a system along with a user program and essentially becomes a part of the program. The debugging software allows a user to set breakpoints in the user program. The breakpoints act as calls back to the debugging software, which in turn instructs the processor to check one or more memory locations (programmed by the user) to see if they have been modified. One way this can be done is to set breakpoints at the beginning and end of a user program subroutine. Thus every time a processor goes in or out of the subroutine, the debugging software is invoked and instructs the processor to check whether an address in memory has been modified. If the address location has been modified, the subroutine containing the faulty code has been identified and more breakpoints can be set within that subroutine to home in on the particular instruction at fault.
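One way to picture the entry and exit breakpoints described above is the following sketch, which is not taken from any particular debugging package. The decorator `watch`, the dictionary `memory` standing in for the monitored location, and the two subroutines are hypothetical names introduced for illustration. The wrapper records the watched value on entry to a subroutine and compares it on exit.

```python
import functools

memory = {"watched": 7}   # hypothetical memory location chosen by the user
suspects = []             # subroutines during which the location changed


def watch(key):
    """Check memory[key] at the entry and exit of the decorated subroutine."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            before = memory[key]              # entry breakpoint: record the value
            result = func(*args, **kwargs)
            if memory[key] != before:         # exit breakpoint: compare the value
                suspects.append(func.__name__)
            return result
        return wrapper
    return decorator


@watch("watched")
def compute_totals():
    return sum(range(10))                     # does not touch the watched location


@watch("watched")
def update_table():
    memory["watched"] = 0                     # deliberate bug: overwrites the location


compute_totals()
update_table()
```

On a single processor, only `update_table` is recorded in `suspects`, so further breakpoints would be set inside that subroutine.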
The debugging software described above works most effectively in a single processor system where only one processor is executing the program in an instruction by instruction fashion.
The debugging software does not work as effectively in a multiprocessing system. This is due to the fact that different processors are running in different subroutines and may be accessing the same range of memory addresses at the same time. Theoretically, every processor in the system could spot a modification, but since several different subroutines were running on different processors at the same time, the code at fault is not pinpointed to a particular subroutine. Thus, although debugging software can help a user to effectively pinpoint faulty code in a single processor system, it may be largely ineffective and inefficient for pinpointing faulty code which is running on multiprocessing computer systems.
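The ambiguity described above can be illustrated with a small deterministic simulation. This is a sketch under stated assumptions: the interleaving of events is hypothetical and hand-written rather than produced by real processors, and the names `enter`, `leave`, `sub_a`, and `sub_b` are introduced only for illustration. Two subroutines are active on different processors when the faulty write occurs, so the exit check of each one observes the modification.

```python
watched = {"value": 7}    # hypothetical monitored memory location
snapshots = {}            # value seen at each subroutine's entry breakpoint
suspects = []             # subroutines whose exit check saw a modification


def enter(name):
    """Entry breakpoint: record the watched value."""
    snapshots[name] = watched["value"]


def leave(name):
    """Exit breakpoint: flag the subroutine if the watched value changed."""
    if watched["value"] != snapshots.pop(name):
        suspects.append(name)


# Hypothetical interleaving of two processors working on one task.
enter("sub_a")            # processor 0 enters sub_a
enter("sub_b")            # processor 1 enters sub_b
watched["value"] = 0      # faulty write, actually issued from sub_b
leave("sub_a")            # processor 0 sees the change and flags sub_a
leave("sub_b")            # processor 1 sees the change and flags sub_b
```

Both subroutines end up in `suspects`, even though only one of them issued the write, which is why the entry and exit checks alone do not identify the code at fault.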
Therefore, there is a need in the art for a debugging aid which can be used to pinpoint which processor in a multiprocessing computer system executed a faulty software instruction which caused a memory address to be calculated incorrectly by the processor and which further resulted in a write to the wrong memory location, destroying the data therein.