Computer systems execute programs that solve complex computational problems.
Preferably, the programs achieve high levels of performance, reduce wasted computer resources, and execute at peak speed. “Performance analysis” is the process of analyzing and understanding the execution characteristics of programs to identify impediments that prevent programs from running at peak speed, that is, at their highest level of performance.
The amount of information required to completely characterize the execution of a program is massive, however, and it is therefore difficult or impossible to analyze all the data manually. Current automatic “performance analyzers” present performance data textually or graphically and direct the user's attention to patterns that may indicate a performance problem. These tools, however, lack an understanding of the meaning, or “semantic knowledge,” of the analyzed program, which limits their effectiveness in solving performance problems.
For example, performance analyzers generally attempt to identify algorithms that ineffectively use computer resources. To do this, conventional performance analyzers may identify parts of a program that take a long time to execute. This heuristic, however, may be deceptive. For instance, such an analyzer would identify a well-written algorithm as a poorly performing algorithm simply because it unavoidably requires a lot of time to execute. Such an analyzer would also fail to identify poorly performing algorithms that do not take a long time to execute or that are not central to the program. Without knowledge of the semantics of the programs, or how program components are supposed to run, an automatic performance analyzer cannot adequately determine whether a particular component of a program exhibits poor performance.
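The weakness of the time-only heuristic can be illustrated with a minimal sketch. The component names, the timings, the thresholds, and the `expected_seconds` field (a stand-in for semantic knowledge that a conventional analyzer does not have) are all hypothetical and chosen only for illustration; this is not a description of any particular analyzer.

```python
from dataclasses import dataclass


@dataclass
class ComponentProfile:
    name: str
    measured_seconds: float  # what a conventional profiler observes
    expected_seconds: float  # hypothetical: cost of a well-written version


def flag_by_time(profiles, share_threshold=0.25):
    """Time-only heuristic: flag components using a large share of run time."""
    total = sum(p.measured_seconds for p in profiles)
    return [p.name for p in profiles
            if p.measured_seconds / total >= share_threshold]


def flag_by_expectation(profiles, slowdown_threshold=2.0):
    """Hypothetical contrast: flag components far slower than expected."""
    return [p.name for p in profiles
            if p.measured_seconds / p.expected_seconds >= slowdown_threshold]


# Made-up profile data (assumption, not measured results).
profiles = [
    ComponentProfile("matrix_factorization", 90.0, 85.0),  # slow but well written
    ComponentProfile("config_lookup", 4.0, 0.1),           # fast but wasteful
    ComponentProfile("report_output", 6.0, 5.0),
]

time_flags = flag_by_time(profiles)
semantic_flags = flag_by_expectation(profiles)
print("time-only heuristic flags:", time_flags)
print("expectation-based check flags:", semantic_flags)
```

Under these assumed numbers, the time-only heuristic flags only the component that is unavoidably expensive and misses the component that runs far slower than it should, which is the failure mode described above.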
Performance analysis is also important in multiprocessing computer systems. A multiprocessing computer system either comprises multiple processors, with different portions of a program executing in parallel on the different processors, or comprises multiple computers, each with its own processor, over which a program executes in parallel. In such a computer system, resources may be wasted if processors are idle (i.e., not executing a program instruction) for any length of time. Thus, an automatic performance analyzer should identify algorithms that do not effectively divide tasks over the available processors, i.e., algorithms that have low “parallelism.” Conventional performance analyzers generally attempt to identify algorithms with low parallelism by indicating instances during program execution when one or more of the processors are idle. Such instances may indicate that the program is not using the available processor resources as well as it could. Such a heuristic, however, may also identify instances when processors are expected to be idle, such as during the traversal of a linked list by a single processor. Further, even during the execution of an extremely efficient program, the number of instances in which one or more processors are idle could be one billion or more. Conventional automated performance analyzers are incapable of distinguishing instances when the processors are expected to be idle from instances when they are not. Therefore, without knowledge of the semantics of the program, or how program components are supposed to run, automatic performance analyzers cannot adequately identify low-parallelism portions of programs.
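The idle-processor heuristic can be sketched in the same way. The trace format, the phase names, the processor count, and the `inherently_serial` annotation (a stand-in for semantic knowledge about how a phase is supposed to run) are hypothetical assumptions made for this illustration only.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    phase: str
    busy_processors: int
    inherently_serial: bool  # hypothetical semantic annotation


def flag_idle_intervals(trace, num_processors):
    """Conventional heuristic: flag any interval with an idle processor."""
    return [iv.phase for iv in trace if iv.busy_processors < num_processors]


def flag_unexpected_idle(trace, num_processors):
    """Hypothetical contrast: ignore idleness that the phase makes unavoidable."""
    return [iv.phase for iv in trace
            if iv.busy_processors < num_processors and not iv.inherently_serial]


# Made-up trace for a four-processor system (assumption, not measured data).
NUM_PROCESSORS = 4
trace = [
    Interval("traverse_linked_list", 1, True),   # idleness is expected here
    Interval("parallel_loop", 4, False),         # all processors busy
    Interval("unbalanced_reduction", 2, False),  # idleness is a real problem
]

idle_flags = flag_idle_intervals(trace, NUM_PROCESSORS)
unexpected_flags = flag_unexpected_idle(trace, NUM_PROCESSORS)
print("idle heuristic flags:", idle_flags)
print("unexpected idleness:", unexpected_flags)
```

With this assumed trace, the conventional heuristic reports the linked-list traversal alongside the genuinely unbalanced phase, because it has no way to tell expected idleness from unexpected idleness.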
Thus, there is a need for performance analysis that identifies performance impediments based on an understanding of the meaning, or semantic knowledge, of the portions of the program being analyzed.