Certain faults will cause otherwise healthy software products to fail. These typically include hardware faults and fundamental software faults, such as attempting to access memory that is invalid or accessing memory in an invalid manner. Examples of fundamental software faults include attempts to access memory using an unaligned memory address or a null pointer. A fundamental fault is a fault so serious that the software product cannot continue to operate and typically needs to halt immediately. For example, the Windows™ operating system may indicate a General Protection Fault (GPF) where a software product attempts to access a memory address that does not exist. In the Unix™ operating system, fundamental faults may generate a “signal”, which is a notification from the operating system to the active process that causes the active process to stop what it is doing in order to deal with the signal. For example, a SIGSEGV signal is triggered when a process attempts to access an illegal memory address. These types of serious faults may be referred to as traps.
When a software product encounters a trap, it receives a notification from the operating system and the execution of the process is halted. The operating system may then call a registered function or module for handling the trap. In a typical system, a function or module may be registered with the operating system upon start up as the function to call when a trap is encountered. Such a function or module may be referred to as a “trap handler”. By way of example, in the Unix™ operating system, there is a “signal handler” that the operating system will call when a signal is generated. There may be default signal handlers for dealing with particular signals. Other platforms, for example the Windows™ operating system, allow a separate trap handler process or program to be notified when a trap occurs. A trap handler function or module may be custom developed by a software product developer and registered as the handler for a specific signal. For example, with the Unix™ operating system, if the default handler for the SIGSEGV signal is considered inappropriate or inadequate, a developer may design a customized signal handler and register it using the sigaction() system call. The sigaction() system call accepts a signal number, the new behaviour for the signal (potentially including the signal handler), and a location in which the old behaviour for the signal is returned.
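By way of illustration only, the registration described above may be sketched in C as follows. This is a minimal sketch assuming a POSIX system; the names segv_handler and install_segv_handler are hypothetical and are not taken from any particular software product.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical custom signal handler: reports the trap and terminates.
 * Only async-signal-safe calls (write, _exit) are made here. */
void segv_handler(int signo)
{
    static const char msg[] = "trap: SIGSEGV received, exiting\n";
    (void)signo;
    (void)write(STDERR_FILENO, msg, sizeof msg - 1);
    _exit(1);
}

/* Registers segv_handler as the new behaviour for SIGSEGV.
 * The old behaviour is returned in *old_action (which may be NULL).
 * Returns 0 on success and -1 on failure, as sigaction() does. */
int install_segv_handler(struct sigaction *old_action)
{
    struct sigaction new_action;

    memset(&new_action, 0, sizeof new_action);
    new_action.sa_handler = segv_handler;   /* the signal handler */
    sigemptyset(&new_action.sa_mask);       /* block no extra signals */
    new_action.sa_flags = 0;

    /* signal number, new behaviour, old behaviour */
    return sigaction(SIGSEGV, &new_action, old_action);
}
```

A process would typically call install_segv_handler() once at start up. The old behaviour returned through the third argument may later be passed back to sigaction() to restore the default handling.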
Typically, a trap indicates a significant problem that prevents the healthy operation of the software product, so the trap handler will initiate the termination of the process. If the process is unable to recover from the trap, it will typically exit from within the trap handler.
In known systems, the trap handler may perform some basic operations to preserve information for ex post facto or post mortem review by a software product developer attempting to identify the source of the trap. The operating system may provide the trap handler with the register context, a pointer to the register context, or the ability to obtain the register context. The register context is a snapshot of the processor registers used by the process or thread. The register context may include the program counter, which indicates the address of the trapped instruction, as well as other information. To preserve this information so as to assist the developer in analyzing the trap, the trap handler may open a file, write the register context to the file and then close the file. Other system information that may be written to a file by a trap handler includes the function call stack or stack trace, or a portion of memory corresponding to the list of instructions containing the trapped instruction. The information stored in the file can then be used by the developer to identify the trapped instruction and the contents of the registers at the time the trap was encountered.
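The open, write and close sequence described above may be sketched in C as follows. This is an illustrative sketch only, assuming a POSIX system on which the handler is registered with the SA_SIGINFO flag and on which the uc_mcontext member of ucontext_t holds the saved registers inline (as on Linux with glibc); the layout of that member is platform-specific. The file name trap_context.bin and the function names are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define TRAP_FILE "trap_context.bin"  /* hypothetical output file */

/* Writes the raw, platform-specific register context to fd.
 * Returns 0 on success and -1 if a write fails. */
int dump_register_context(int fd, const ucontext_t *uc)
{
    const unsigned char *p = (const unsigned char *)&uc->uc_mcontext;
    size_t left = sizeof uc->uc_mcontext;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n <= 0)
            return -1;
        p += n;
        left -= (size_t)n;
    }
    return 0;
}

/* Trap handler: open a file, write the register context, close the
 * file, then exit. Only async-signal-safe calls are used. */
void trap_handler(int signo, siginfo_t *info, void *context)
{
    int fd;

    (void)signo;
    (void)info;
    fd = open(TRAP_FILE, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd >= 0) {
        (void)dump_register_context(fd, (const ucontext_t *)context);
        (void)close(fd);
    }
    _exit(1);
}

/* Registers trap_handler for SIGSEGV; SA_SIGINFO requests that the
 * operating system pass a pointer to the register context. */
int install_trap_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = trap_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    return sigaction(SIGSEGV, &sa, NULL);
}
```

The resulting file holds only the raw register snapshot. Interpreting it, for example locating the program counter, requires knowledge of the platform's register layout, and the file contains none of the memory contents that the registers refer to.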
A problem encountered with known trap handlers is that the preserved information in the file only provides a small snapshot of some basic system information. For example, if the file contains the instruction list and the register context, then during later review the software developer may determine that the trap was encountered upon a load instruction. The developer may deduce that the trap likely relates to an invalid memory location referenced in the load instruction. As the invalid memory location is likely to be an address contained in a register, the developer may be able to trace the source of the invalid address in the register to a previous instruction, which loaded the register with the contents of a particular memory location. In these circumstances, the developer would be unable to trace the problem any further without access to the contents of the memory.
To address this lack of information, some trap handlers may attempt to preserve a much larger quantity of information, including the contents of any allocated memory locations. In more complex systems, this can result in the dumping of gigabytes, or in the future terabytes, of information, which may be expensive and time-consuming to produce and difficult to send from a software product user to the software developer.