1. The Field of the Invention
The present invention relates to trace events for describing the runtime behavior of executing software. More specifically, the present invention relates to correlating one or more trace events to facilitate analysis of the one or more trace events based on how the one or more trace events are related to any other trace event.
2. Background and Related Art
In order to monitor the runtime behavior of executing software, developers often insert trace events into their source code. Trace events provide some information with respect to which instructions within the software have been or are being executed. For example, with respect to FIG. 1 (described in greater detail below), the executing software 120 includes three logical operations, labeled Logical Operation 1, Logical Operation 2, and Logical Operation 3. Each of the logical operations includes a trace event to indicate when the operation begins and a trace event to indicate when the operation ends.
This type of information regarding the runtime behavior of executing software is helpful in a variety of situations. For example, trace events can be used to help developers identify coding errors (i.e., as a debugging tool) and/or other types of software or system failures, as a support or administrator tool to help identify system problems, for generalized performance monitoring, to meter charges for software use, etc. Trace events are useful because there is often a significant amount of code that executes between display or other output, and therefore in many situations it is difficult to locate the cause of certain software failures. Furthermore, some software, such as services, may not include a significant amount (or any) display output, which, without trace events, can make narrowing the amount of code to search for errors all but impossible. Even worse, some software failures cannot be reproduced in a debugging environment because, for any number of reasons, the debugging environment masks the failure, which leaves little more than trace events as a debugging option.
For relatively simple software, rudimentary trace events, perhaps consisting of little more than a text string written to a display or a log file, may be sufficient. Increasing software complexity, however, has led to a need for increasingly sophisticated tracing. For example, note in FIG. 1 that Logical Operation 1 calls or accesses Logical Operation 2 and Logical Operation 3, which are nested within Logical Operation 1. Consider further that in many circumstances, logical operations represent reusable software components or objects. Accordingly, to the extent that developers are able to reuse software, each of the logical operations shown in FIG. 1 may be called or accessed in connection with a wide range of other logical operations, in addition to the particular arrangement illustrated in FIG. 1. In other words, analysis of Logical Operation 1, for example, should not include all trace events associated with Logical Operation 2, but rather only the trace events generated when Logical Operation 2 was called or accessed from Logical Operation 1.
Based on the foregoing, one level of complexity that tracing should account for is the nesting of logical operations (i.e., whether a logical operation is invoked within another logical operation). Ignoring transition events for now, the list 130 of trace events shows the sequence as: Start 1, Start 2, End 2, Start 3, End 3, and End 1. Relationships between the various trace events are based on correlation identifiers for each of the logical operations. As described in further detail below, the correlation identifiers (1, 2, and 3) for the list 130 of trace events show that Logical Operation 2 and Logical Operation 3 are nested within Logical Operation 1, and therefore should be considered during analysis of Logical Operation 1.
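By way of illustration only, the nested start and end trace events described above can be sketched in Python. The names `logical_operation`, `trace_log`, and `run_logical_operation_1` are hypothetical and do not come from FIG. 1; the sketch assumes that each trace event is recorded as a pair of a correlation identifier and a text string.

```python
from contextlib import contextmanager

# Hypothetical in-memory trace log: (correlation_id, message) pairs.
trace_log = []


@contextmanager
def logical_operation(correlation_id):
    """Emit a start trace event on entry and an end trace event on exit."""
    trace_log.append((correlation_id, f"Start {correlation_id}"))
    try:
        yield
    finally:
        trace_log.append((correlation_id, f"End {correlation_id}"))


def run_logical_operation_1():
    # Logical Operation 1 invokes Logical Operations 2 and 3 in turn,
    # so their trace events are nested inside those of Operation 1.
    with logical_operation(1):
        with logical_operation(2):
            pass
        with logical_operation(3):
            pass


run_logical_operation_1()
messages = [message for _, message in trace_log]
```

In this sketch, `messages` holds the same sequence as the list 130 with transition events ignored. The correlation identifiers alone (1, 2, and 3) do not record that operations 2 and 3 ran inside operation 1, which is the gap that transition events are meant to fill.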
In practice, the number of logical operations performed during software execution is much larger than what is shown in FIG. 1 and the arrangement is significantly more complex. As a result, automated trace analysis tools have been developed to help filter trace events for those of particular interest. Determining which trace events are associated with a specified correlation identifier, however, is no trivial matter. Among other things, trace events are arbitrary, developer-defined events, and therefore cannot always be expected to follow the relatively straightforward start and end arrangement illustrated in FIG. 1.
To help correctly associate trace events with a particular correlation identifier, transition events traditionally have been used to show the relationship between logical operations. Logical Operation 2 starts with Transition→2, indicating a transition from the previous correlation identifier to a correlation identifier of 2, and ends with Transition→1, indicating a transition back to the previous correlation identifier. Similarly, Logical Operation 3 starts with Transition→3, indicating a transition from the previous correlation identifier to a correlation identifier of 3, and ends with Transition→1, indicating a transition back to the previous correlation identifier. In this way, a filter applied to the trace events for Logical Operation 1 can include the nested operations, Logical Operation 2 and Logical Operation 3, even though their correlation identifiers may not otherwise explicitly match the filter criteria.
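The transition-based filtering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis performed by any particular tool: the `TraceEvent` type, the `filter_with_transitions` function, and the correlation identifier assigned to each transition event (taken here to be the identifier in effect when the transition is logged) are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class TraceEvent(NamedTuple):
    correlation_id: int                  # identifier in effect when logged
    message: str
    transition_to: Optional[int] = None  # set only for transition events


# Assumed log for the arrangement described for FIG. 1.
LOG = [
    TraceEvent(1, "Start 1"),
    TraceEvent(1, "Transition", 2),
    TraceEvent(2, "Start 2"),
    TraceEvent(2, "End 2"),
    TraceEvent(2, "Transition", 1),
    TraceEvent(1, "Transition", 3),
    TraceEvent(3, "Start 3"),
    TraceEvent(3, "End 3"),
    TraceEvent(3, "Transition", 1),
    TraceEvent(1, "End 1"),
]


def filter_with_transitions(log: List[TraceEvent], root_id: int) -> List[TraceEvent]:
    """Select events logged while root_id is active, following transitions."""
    stack: List[int] = []  # chain of active correlation identifiers
    selected = []
    for event in log:
        if not stack:
            stack.append(event.correlation_id)
        if root_id in stack:
            selected.append(event)
        if event.transition_to is not None:
            if len(stack) >= 2 and stack[-2] == event.transition_to:
                stack.pop()                         # transition back to parent
            else:
                stack.append(event.transition_to)   # transition into a child
    return selected


for_operation_1 = filter_with_transitions(LOG, 1)
for_operation_2 = filter_with_transitions(LOG, 2)
```

Filtering for correlation identifier 1 selects every event in this log, including those of the nested operations, while filtering for correlation identifier 2 selects only the events logged while Logical Operation 2 is active. The whole log must be walked again whenever the filter criteria change, because the relationships exist only implicitly in the transition events.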
Adding transition events and maintaining accurate correlation identifiers, however, impose a fairly high degree of overhead. For example, developers are responsible for correctly adding transition events to their code. From the discussion of FIG. 1, it should be clear that inadvertently omitting a transition event can significantly change the perceived relationships between logical operations, and thereby undermine the overall effectiveness of trace events. Furthermore, trace events generally require access to correlation identifiers for parent and/or child logical operations so that transitions accurately reflect the changes in correlation identifiers when logical operations begin and end, as shown in the list 140 of correlation identifiers. In other words, downstream logical operations need to know something about upstream logical operations in order to perform transitions correctly.
Another level of complexity with respect to trace events relates to multiple threads executing the same software. For software that uses multiple threads, unrelated trace events become intermixed, making it virtually impossible for a human to explore a trace log in any meaningful way without automated filtering and analysis tools. (When multiple threads are present, at least one high-level logical operation generates distinct correlation identifiers so that trace events for the different threads can be analyzed separately and in the proper context. Distinct correlation identifiers also may be generated each time the high-level logical operation is invoked, so that trace events from different invocations that use the same thread at different times can be analyzed separately and in the proper context.) In addition to intermixing, multiple threads also tend to signal an increase in the number of trace events that are generated.
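As a rough sketch of the multithreaded case, the following Python example has each invocation of a high-level logical operation generate its own correlation identifier, so that intermixed trace events can later be separated. The names `handle_request`, `trace`, and `events_for`, and the use of a simple counter as the source of identifiers, are assumptions made for illustration.

```python
import itertools
import threading

_next_id = itertools.count(1)   # hypothetical source of distinct identifiers
_id_lock = threading.Lock()
_log_lock = threading.Lock()
trace_log = []                  # shared log: (correlation_id, message) pairs


def trace(correlation_id, message):
    """Append one trace event to the shared log."""
    with _log_lock:
        trace_log.append((correlation_id, message))


def handle_request():
    """High-level logical operation; each invocation gets a new identifier."""
    with _id_lock:
        correlation_id = next(_next_id)
    trace(correlation_id, "Start")
    trace(correlation_id, "Work")
    trace(correlation_id, "End")


def events_for(correlation_id):
    """Separate one invocation's events from the intermixed log."""
    return [message for cid, message in trace_log if cid == correlation_id]


threads = [threading.Thread(target=handle_request) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

The order of events in `trace_log` may differ from run to run because the threads interleave, but `events_for` recovers each invocation's events in their original order.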
As a result, trace analysis tools have become increasingly complex in order to process the relatively sophisticated correlation identifiers being generated, thereby precluding developers from using more generalized and familiar analytical tools, such as spreadsheets or browsers. Sophisticated trace analysis tools also tend to be relatively slow because logical operations and the trace events they generate are related to each other through a post-processing analysis, based on transition events, that must be repeated each time the analysis criteria for filtering trace events change. Accordingly, correlation identifiers that allow for analysis using more generalized and familiar analytical tools are desired.