1. Field of Invention
This invention generally relates to methods for analyzing the execution behavior of software programs. More specifically, the present invention relates to methods embodied in software tools by which a user may interactively explore information gathered during a program's execution.
2. Prior Art
Given only a program's source code (its static description), the behavior of the program as it runs (its dynamic behavior) can often be very difficult to predict. Programs often run incorrectly, more slowly, or use more system resources than a programmer expected. Diagnosing these problems, which requires understanding the dynamic behavior of the program, can be a difficult task even for the most experienced programmer.
Various categories of software tools, such as profilers, debuggers, and program visualizers, allow the user to study a program's behavior in different ways. These tools are helpful at times, but they are often ill-suited both to the complexity of the program being analyzed and to the nature of the analysis process. As a result, programmers work slowly with existing tools, construct their own ad hoc tracing schemes in order to analyze a specific problem, or just work “in the dark” much of the time.
Dynamic behavior may be studied by collecting a program trace for analysis, containing information about events that occurred and resources used during the program run. The analysis may be done after the run has completed, or it may be integrated with the collection process.
Even relatively simple programs will often involve a lot of internal activity, and more typical programs can generate extremely large and complex traces that can be overwhelming for the analyst to study. As in other fields where large amounts of information are studied, analysis tools must provide ways to organize the information: to filter out irrelevant information, and to group related information so that fewer, higher-level units may be studied. These organizing abstractions may then be used to compute summary measurements, or for structuring or rendering elements in visualizations.
Because of the complexity of programs, a central problem when analyzing their behavior is that information of interest is often intertwined with unrelated information. The problem may manifest itself as visual clutter in detailed graphical views, or as numerical summaries skewed by the inclusion of irrelevant information. In either case, key information needed for analysis can remain hidden without the right organization. The choice of organizing abstraction at each step in an analysis is crucial for enabling the user to work with the information productively.
A number of organizing abstractions are currently provided by various tools, and have proven useful for many situations. These include organization based on static units, for example threads, classes, methods, and instances when analyzing a Java program, and certain types of organization based on dynamic information, such as call trees and patterns. There are many important cases, however, where the information of interest does not line up with these organizational schemes, and the most appropriate organization will be based on more complex combinations of static and dynamic criteria. Here are a few examples:
1. The user would like to study a functional aspect of the program, not represented formally in the static structure of the program, involving a few methods from a number of different classes.
2. The user would like to contrast the behavior of the slowest 10% of invocations of a given method with the fastest 10%, to understand the difference.
3. The user would like to study only the instances of a given class that were used in a particular way, such as all the Java Vectors that were created in the course of a specific sequence of method calls.
In general, it is necessary to provide the user the flexibility to filter and group the information as needed for a particular problem being studied. This must be in addition to other, more fixed, organizational schemes, which are also useful. Note that flexibility in organizing the information, while enabling the analysis, potentially introduces additional usability difficulties in specifying complex filtering and grouping criteria. It is important to address these difficulties as well.
In addition to the complexity of the data being studied, the analysis itself is often long and unpredictable. The user may or may not know in advance what questions he or she needs answered. The user is just as likely to be searching for clues that will give insight into a problem as trying to obtain answers to precise questions to verify a hypothesis. It is therefore essential for an analysis tool to support many styles of analysis. The user may, for example, use graphical views to gain an overall understanding of the flow of events, discover patterns, or spot anomalies. At other times the user may need to make quantitative comparisons of resource usage. User-defined filtering and grouping units must be usable for all of these types of study.
Many different types of views will usually be needed within a single study, in order to study a problem from different angles. The user must be able to maintain a consistent context across these various views. This context would include not just different presentations of the same information, but different types of information as well. For example, the user may start by identifying a certain group of instances as an aspect of the execution to study. In one view, the user may then want to see how much time each thread spends on activity related to these instances, and in another view see the detailed sequence of method calls against the instances to understand when and how they are used. It is important that the user be able to maintain a consistent context while working with different types of information, and to move from one type of information to another without having to restate complex filtering and grouping criteria.
Analysis is often a lengthy, experimental process. At a given stage, the user may have a particular focus of study and an organization of the information based on a hypothesis about where the problem may lie. The user may then study the problem within this framework, using various views over various types of information, as discussed above. In the process, the user may make discoveries that will change the course of the investigation. The user may decide, for example, to refine the focus of study, or organize the information differently based on new information. The user may also want to try out a hypothesis, with the ability to backtrack to an earlier stage of analysis if the hypothesis is incorrect. The user may also want to try out multiple alternative analysis paths simultaneously, focusing on different aspects or using different organizational schemes, and compare the results. In general, it would be helpful for an analysis tool to maintain multiple working contexts for the user, to help structure the larger analysis process.
An object of the present invention is to enable the analysis of dynamic program behavior for problem diagnosis.
Another object of this invention is to provide a method that enables the analysis of dynamic program behavior for problem diagnosis, and that is also applicable to other applications of dynamic program behavior analysis, such as program maintenance, testing, and program characterization.
These and other objectives are attained with a method and system for analyzing dynamic behavior of a computer program using user-defined classifications of an execution trace. The method comprises the step of forming a database describing the executions of the program. The database includes static information obtained from the program source, and dynamic information describing particular executions of the program. The database is structured into entities, and each of the entities is comprised of a single type of information about the program execution. Each entity is comprised of elements representing individual program elements of said single type, and each element has attributes with values describing the element. The database is augmented by classifying every element of the database as a member of zero or more user-defined execution slices, and dynamic behavior of the program is analyzed using the execution slices.
The preferred embodiment of the invention is for use with a database describing a program's execution, containing both static information (obtainable from the program source) and dynamic information (describing a particular execution of the program). The database is structured into entities, each containing a single type of information about the program execution (for example threads, classes, method invocations). Each entity consists of elements representing the individual program elements of that type (for example, an element may represent an individual thread, class, or method invocation). Each element has attributes with values that describe that element of the program execution. Some attributes are relationship attributes, whose values relate an element to one or many other elements. Some attributes are summary attributes, whose values are aggregations of attribute values from a number of other elements.
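The database structure described above may be illustrated with a minimal sketch. The class names (Element, Entity), the sample entities, and the attribute names (thread, time_ms) are hypothetical and chosen only for illustration; the specification does not prescribe any particular representation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Element:
    """One program element, e.g. a single thread or method invocation."""
    id: str
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Entity:
    """All elements holding a single type of information."""
    name: str
    elements: Dict[str, Element] = field(default_factory=dict)

    def add(self, element_id, **attributes):
        element = Element(element_id, dict(attributes))
        self.elements[element_id] = element
        return element


# Hypothetical database: a mapping from entity name to entity.
database = {name: Entity(name) for name in ("threads", "invocations")}
database["threads"].add("t1", name="main")
database["invocations"].add("i1", method="Vector.add", thread="t1", time_ms=4)
database["invocations"].add("i2", method="Vector.get", thread="t1", time_ms=6)


def invocations_of(thread_id):
    """Many-valued relationship attribute: a thread's invocations."""
    return [e.id for e in database["invocations"].elements.values()
            if e.attributes["thread"] == thread_id]


def total_time(thread_id):
    """Summary attribute: time aggregated over the related invocations."""
    invocations = database["invocations"].elements
    return sum(invocations[i].attributes["time_ms"]
               for i in invocations_of(thread_id))
```

In this sketch the relationship and summary attributes are computed on demand; an actual tool might equally store them.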
In accordance with the present invention, the user may augment the database by classifying each element of the database as a member of zero or more user-defined execution slices. Each execution slice is based on an arbitrary combination of static and dynamic criteria, and represents an aspect of interest to the user for analysis.
The set of execution slices serves as an additional dimension with which to access information in the database, independent of the predefined entity structure. Each execution slice may thus be used as a lens through which to access the entire database. This mediated view of the database will have the following properties:
1. the structure (of entities and attributes, including relationship attributes) will be identical to the original database structure
2. only elements that are members of the given execution slice will be present
3. the values of many-valued relationship attributes will vary depending upon the given execution slice
4. the values of summary attributes will vary depending upon the given execution slice.
In effect, summary attributes become multidimensional summary attributes, summarizing information by both an element of the database and an execution slice.
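The four properties of the mediated view may be sketched as follows. This is an illustration under assumed data: the trace contents, the slice named vector_work, and the function names are all hypothetical. A slice is represented here simply as a set of (entity, element) pairs.

```python
# Hypothetical trace data: two entities with a few elements each.
threads = {"t1": {}, "t2": {}}
invocations = {
    "i1": {"thread": "t1", "time_ms": 4},
    "i2": {"thread": "t1", "time_ms": 6},
    "i3": {"thread": "t2", "time_ms": 5},
}

# The root slice contains every element; vector_work is a user-defined slice.
root = ({("threads", t) for t in threads}
        | {("invocations", i) for i in invocations})
vector_work = {("threads", "t1"), ("invocations", "i2")}


def visible_threads(slice_):
    """Property 2: only members of the slice are present."""
    return sorted(t for t in threads if ("threads", t) in slice_)


def thread_invocations(thread_id, slice_):
    """Property 3: a many-valued relationship varies with the slice."""
    return sorted(i for i, attrs in invocations.items()
                  if attrs["thread"] == thread_id
                  and ("invocations", i) in slice_)


def thread_time(thread_id, slice_):
    """Property 4: a summary attribute indexed by element and by slice."""
    return sum(invocations[i]["time_ms"]
               for i in thread_invocations(thread_id, slice_))
```

The same functions serve every slice, which reflects property 1: the structure of entities and attributes does not change, only the values seen through a given slice.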
Execution slices may be arranged as a hierarchy, where the root execution slice contains every element in the database describing the program execution. The elements of each execution slice are a subset of the elements of its parent execution slice.
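The subset relationship between a slice and its parent may be enforced when the hierarchy is built. The following is a minimal sketch; the class name ExecutionSlice and the element identifiers are hypothetical.

```python
class ExecutionSlice:
    """A node in a slice hierarchy; members must be a subset of the parent's."""

    def __init__(self, name, members, parent=None):
        members = frozenset(members)
        if parent is not None and not members <= parent.members:
            raise ValueError(
                f"slice {name!r} contains elements outside its parent")
        self.name = name
        self.members = members
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


# The root slice holds every element of the (hypothetical) database.
root = ExecutionSlice("root", {"i1", "i2", "i3"})
slow = ExecutionSlice("slow", {"i3"}, parent=root)
```

Attempting to create a child containing an element absent from its parent raises ValueError in this sketch.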
Execution slices may be used to filter the elements presented in views. Execution slices may also be used to visually classify the elements presented in a view. Execution slices may also be used to filter the information used as the basis for summary computations. In addition, execution slices may be used to perform comparisons of multidimensional summary information.
An execution slice (the base) and its immediate child slices in the hierarchy (the subsets) may be used together in a single view in two ways: to provide a combination of filtering and visual classification, or to provide comparison of multidimensional summary information of the subsets against the base.
Recasting rules are introduced to simplify the specification of execution slices. Execution slices are defined by a combination of filtering queries and recasting rules. For a given execution slice, a filtering query on each entity specifies which elements of that entity will be a member of the execution slice. Recasting rules interpret these queries to cause additional related elements to be included in or excluded from the execution slice.
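The interplay of filtering queries and recasting rules may be sketched as follows. The particular rule shown (include every thread that ran an included invocation) is an assumed example and is not taken from the specification; the data and function names are likewise hypothetical.

```python
# Hypothetical database: entity name -> {element id -> attributes}.
database = {
    "threads": {"t1": {"name": "main"}, "t2": {"name": "worker"}},
    "invocations": {
        "i1": {"method": "Vector.add", "thread": "t1"},
        "i2": {"method": "String.length", "thread": "t1"},
        "i3": {"method": "Vector.get", "thread": "t1"},
        "i4": {"method": "String.concat", "thread": "t2"},
    },
}


def define_slice(database, queries, recasting_rules):
    """Apply one filtering query per entity, then let rules adjust the result."""
    members = {}
    for entity, table in database.items():
        query = queries.get(entity)  # no query: nothing selected directly
        members[entity] = {eid for eid, attrs in table.items()
                           if query is not None and query(attrs)}
    for rule in recasting_rules:
        rule(database, members)
    return members


def include_calling_threads(database, members):
    """Assumed recasting rule: add the thread of each included invocation."""
    invocations = database["invocations"]
    members["threads"] |= {invocations[i]["thread"]
                           for i in members["invocations"]}


vector_slice = define_slice(
    database,
    queries={"invocations": lambda a: a["method"].startswith("Vector.")},
    recasting_rules=[include_calling_threads],
)
```

Here the user states a query on one entity only, and the rule brings in the related thread, so the query need not be restated for every entity.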
In addition to allowing the user to specify the query criteria directly, we allow the user to define a set of execution slices by interactively selecting elements of the database. Alternatively, the user may define a set of execution slices by selecting an attribute. The values of the attribute will be used as the basis for classification into a set of execution slices.
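Defining a set of execution slices from the values of a selected attribute amounts to grouping elements by that attribute. A minimal sketch, with hypothetical data and function name:

```python
from collections import defaultdict


def slices_by_attribute(elements, attribute):
    """Create one slice per distinct value of the chosen attribute."""
    slices = defaultdict(set)
    for element_id, attrs in elements.items():
        slices[attrs[attribute]].add(element_id)
    return dict(slices)


# Hypothetical method invocations, classified by the thread that ran them.
invocations = {
    "i1": {"thread": "t1"},
    "i2": {"thread": "t1"},
    "i3": {"thread": "t2"},
}
by_thread = slices_by_attribute(invocations, "thread")
```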
Also disclosed herein is a method for the user to structure and manage a lengthy, experimental analysis process using execution slices. A workspace may be defined consisting of an execution slice (the base) and its immediate children (subsets). Using a workspace, the user can perform an analysis using any number of views, and they will share a consistent information context of these base and subset slices. The base slice is used to set the overall scope by filtering out irrelevant information in each of the workspace's views, while the subset slices are groupings of information within the limited scope of the base slice.
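The way several views may share one workspace context can be sketched as follows. The Workspace class, the two view functions, and the sample slices are hypothetical and serve only to illustrate the base and subset roles.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Workspace:
    base: Set[str]                 # overall scope of the analysis
    subsets: Dict[str, Set[str]]   # groupings within the base

    def in_scope(self, element_ids):
        """Filter out elements that fall outside the base slice."""
        return [e for e in element_ids if e in self.base]

    def classify(self, element_id):
        """Names of the subset slices that contain the element."""
        return sorted(name for name, members in self.subsets.items()
                      if element_id in members)


def table_view(workspace, element_ids):
    """One view: list in-scope elements with their subset labels."""
    return [(e, workspace.classify(e)) for e in workspace.in_scope(element_ids)]


def summary_view(workspace):
    """Another view: count the elements of each subset."""
    return {name: len(members) for name, members in workspace.subsets.items()}


workspace = Workspace(
    base={"i1", "i2", "i3"},
    subsets={"fast": {"i1"}, "slow": {"i3"}},
)
```

Both views read the same workspace object, so a change to the base or to a subset is reflected in every view without restating any criteria.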
The user may work with multiple workspaces simultaneously, allowing multiple experimental contexts to be maintained.
The user may select a number of elements in one view, and navigate to other views to study related elements. A temporary execution slice is used to automatically recast the information for presentation in views over related information.