1. Field of the Invention
The present invention is generally related to software development tools and environments and, in particular, to an event trace and visualization tool supporting the dynamic instrumentation of target program code and collection of event data with minimal impact on the system behavior of the target program and system.
2. Description of the Related Art
Within software development processes, the specific detection and causal analysis of failure sources in software programs, particularly while executing, is a complex art. Although static analysis of program source code can identify potential problems, the failure sources that are most difficult to analyze are those that occur only when a target program is being executed in its intended execution environment, and even then only intermittently and unpredictably. Such failure sources most typically occur where the program under analysis must be responsive to real-time events, is subject to resource constraints, or is involved in complicated interactions with co-executing programs. Therefore, many failure sources may only become apparent when the program is executed under actual operating conditions. Known types of failure sources include unhandled events; unexpected contention for, consumption of, and exhaustion of program resources; latencies and improper code operations in varied circumstances; and the like.
Software-based trace tools are conventionally used for the detection and analysis of failure sources in executing programs. Such tools typically involve the instrumentation of the program under analysis with break points that are used to trigger the collection of information on the executing state of the program. Progressive analysis of the log files containing the collected information then provides a basis for detecting and understanding the cause of failure sources.
There are, however, a number of problems with the effective use of conventional trace tools. One is the effective requirement that the program under analysis be executed on its target hardware and within its normal operating environment. In many cases, the target hardware or operating environment is not suitable for direct software development use. Indeed, the target hardware can range from a proprietary platform suitable for an embedded application to a general purpose computer system. Similarly, the target program may be an application program, operating system, device driver, or an embedded control program. Conventionally, then, a separate development host computer system is employed for the visualization and analysis of data collected by a trace tool.
A complicating factor, however, is that the program under analysis may be any program, ranging from a dedicated program executing on embedded target hardware to the operating system kernel, device driver, or user-level application program executing on a general purpose computer. Where the program under analysis is highly customized or proprietary, or the target hardware is highly specialized, conventionally an equally customized trace tool is used to accommodate the software and hardware constraints of the target hardware and operating environment. The resulting trace tools are therefore unavailing in any generic or alternate environment and inapplicable to the development of generic or alternate programs.
Another problem encountered by conventional trace tools is that their use directly and substantially affects the system behavior of the program under analysis. The incorporation of the trace tool instrumentation and supporting information collection routines will intrude, both in terms of performance and space, on the program under analysis. Performance intrusion refers to the added execution overhead incurred whenever a trace point in the instrumented program code is encountered. Conventionally, performance intrusion is substantial, varying with the total number of potential data collection trace points that are instrumented in the program under analysis. In addition to the added execution time needed to actually perform data collection at a trace point, conventional trace point instrumentation typically also imposes a processing overhead of two unconditional interrupts and execution of the associated interrupt handling to identify the interrupt sources. The performance penalty due to these interrupts is incurred regardless of whether the trace points are functionally active to enable the collection of trace point data.
The first interrupt occurs in response to the execution of a break instruction inserted at the trace point. Where the break instruction is inserted into the binary image of the program under analysis, thus overwriting as many bytes of the original code as the binary break instruction occupies, the overwritten code must be restored and the instruction at the trace point address re-executed to maintain the proper execution of the program. A second interrupt, typically a single step instruction mode trap, is then required to re-insert the binary break instruction at the trace point.
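By way of illustration only, the two-interrupt sequence described above can be sketched as a toy model. The class name `TracedImage`, its methods, and the one-byte `BREAK` opcode value are assumptions introduced for this sketch and do not describe any particular conventional trace tool. The model is intended only to show that both interrupts are taken on every trace point hit, whether or not the trace point is active for data collection.

```python
BREAK = 0xCC  # assumed one-byte break opcode, chosen for illustration only


class TracedImage:
    """Toy model of a binary image instrumented with break-instruction trace points."""

    def __init__(self, image):
        self.image = bytearray(image)
        self.saved = {}      # trace point address -> original byte that was overwritten
        self.active = {}     # trace point address -> True if data collection is enabled
        self.interrupts = 0  # total interrupts taken so far
        self.log = []        # addresses at which trace data was collected

    def insert_trace_point(self, addr, active=True):
        # Overwrite the original code byte with the break instruction.
        self.saved[addr] = self.image[addr]
        self.image[addr] = BREAK
        self.active[addr] = active

    def execute(self, addr):
        """Model execution reaching addr; return the number of interrupts taken."""
        before = self.interrupts
        if addr in self.saved:
            # First interrupt: the break instruction traps.
            self.interrupts += 1
            if self.active[addr]:
                self.log.append(addr)            # data is collected only if active
            self.image[addr] = self.saved[addr]  # restore the overwritten code
            # The original instruction is now re-executed in single-step mode
            # (not modeled here), which raises the second interrupt.
            self.interrupts += 1
            self.image[addr] = BREAK             # re-insert the break instruction
        return self.interrupts - before


img = TracedImage(bytes([0x90, 0x90, 0x90, 0x90]))
img.insert_trace_point(1, active=True)
img.insert_trace_point(2, active=False)
hits = [img.execute(a) for a in range(4)]
print(hits, img.log)
```

In this sketch, the inactive trace point at address 2 costs the same two interrupts as the active trace point at address 1, although only the active one contributes an entry to the log.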
Additional performance intrusions occur as side-effects of using break instructions to establish trace points. Since the binary image of the program under analysis is modified twice in response to execution reaching a trace point, the processor cache typically must be flushed with each modification to ensure that the processor correctly executes the modified image. In turn, these repeated cache flushes may create new or mask existing failure sources in the program under analysis. A related side-effect arises from the need to hold off maskable processor interrupts whenever the program image is being modified. Typically, these interrupts must remain disabled for the duration of the trace point handling to ensure that the integrity of the modified image, including the trace point modification, is maintained.
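The side-effects described above can likewise be sketched under simplifying assumptions. The function names `cache_flushes` and `delivery_times`, and the use of abstract time units, are hypothetical and introduced only for this illustration. The sketch assumes one cache flush per image modification and assumes that a maskable interrupt arriving while trace point handling is in progress is held off until that handling completes.

```python
def cache_flushes(hits, modifications_per_hit=2):
    """Cache flushes caused by `hits` trace point hits.

    Assumes each hit modifies the image twice (restore the original code,
    then re-insert the break instruction) and each modification forces a flush.
    """
    return hits * modifications_per_hit


def delivery_times(arrivals, masked_windows):
    """Return the time at which each maskable interrupt is actually delivered.

    arrivals: arrival times of maskable interrupts.
    masked_windows: non-overlapping (start, end) intervals during which
        interrupts are disabled for trace point handling.
    """
    delivered = []
    for t in arrivals:
        for start, end in masked_windows:
            if start <= t < end:
                t = end  # held off until trace point handling completes
                break
        delivered.append(t)
    return delivered


print(cache_flushes(1000))
print(delivery_times([5, 12, 30], [(10, 20)]))
```

In this model, the interrupt arriving at time 12 is not serviced until time 20, which is the kind of added latency that may itself create or mask a timing-dependent failure source.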
The addition of the trace break instruction handling and data collection routines, and the allocation of a typically large data collection buffer, can create a substantial space intrusion on the program under analysis. These increased memory requirements typically reduce the system resources available to the program under analysis. This, in turn, may cause other performance-related side-effects, such as a more frequent need to re-allocate available memory resources. Space intrusions may also produce relocations in different parts of the program under analysis, which may then mask or alter the occurrence of certain failure sources, such as pointer overruns.
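The masking of a pointer overrun by relocation can be illustrated with a minimal sketch. The function `run_with_layout`, the region names, and the region sizes are assumptions made for this example only. The sketch lays out named regions contiguously in a flat memory, performs an off-by-one write past the end of a region named `buf`, and reports which other regions were corrupted.

```python
def run_with_layout(layout):
    """Simulate an off-by-one overrun of 'buf' under a given memory layout.

    layout: ordered list of (name, size) pairs placed contiguously in memory.
    Returns the names of regions, other than 'buf', that were corrupted.
    """
    offsets = {}
    pos = 0
    for name, size in layout:
        offsets[name] = (pos, pos + size)
        pos += size
    memory = bytearray(pos)

    start, end = offsets["buf"]
    # The defect: the loop writes one byte past the end of buf.
    for i in range(start, end + 1):
        if i < len(memory):
            memory[i] = 0xFF

    return [name for name, (s, e) in offsets.items()
            if name != "buf" and any(memory[s:e])]


# Without instrumentation, the overrun corrupts the adjacent program state.
print(run_with_layout([("buf", 8), ("state", 4)]))
# With a trace buffer relocated between them, the overrun lands in the trace buffer.
print(run_with_layout([("buf", 8), ("trace_buffer", 64), ("state", 4)]))
```

In the second layout the defect is still present, but the program state is no longer corrupted, so the failure that would occur in the uninstrumented program is not observed while tracing.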
Performance and space intrusions both operate to directly and unpredictably alter the system behavior of the program under analysis relative to the handling of ordinary event and task processing. Such changes in system behavior, even if they appear superficially minor in nature, are recognized in the art as potentially, if not likely, creating or greatly distorting the occurrence of failure sources in the program under analysis. Consequently, trace analysis of the program under analysis will produce an inaccurate picture of the performance of the program in its nominal operating environment.