A performance profile is a set of data gathered by a process (the profiler) that runs concurrently with a set of other processes/applications in order to monitor the performance of those processes/applications.
It is desirable to identify associated blocks of statistical information (units) in a hierarchy, so as to allow combination and comparison of information relating to those units. An example of such a hierarchy is the output of a simple system profiler showing CPU time split by process, where each process has child threads that are also split by time, and each thread in turn has different code modules. Previously known solutions to this problem involve matching the names of units.
However, while this approach is ideal when units are named and common units are always commonly identified, it falls apart when units are unidentified.
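The limitation of name matching can be illustrated with a minimal sketch. The `Unit` class, the `match_by_name` helper, and the sample values are hypothetical names and data introduced only for this illustration; they are not taken from any particular profiler.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Unit:
    """One node of a profile hierarchy (process, thread, or code module)."""
    name: Optional[str]   # None models an unidentified unit (assumption)
    cpu_time: float       # time attributed to this unit, in seconds
    children: list["Unit"] = field(default_factory=list)


def match_by_name(a: list[Unit], b: list[Unit]):
    """Pair sibling units of two profiles by name.

    Returns (pairs, unmatched), where unmatched holds the units of `a`
    for which no partner could be found in `b`.
    """
    by_name = {u.name: u for u in b if u.name is not None}
    pairs, unmatched = [], []
    for u in a:
        partner = by_name.get(u.name) if u.name is not None else None
        if partner is not None:
            pairs.append((u, partner))
        else:
            unmatched.append(u)
    return pairs, unmatched


# Two hypothetical profiles of the same application.
run1 = [Unit("server", 4.0, [Unit("worker", 3.0), Unit(None, 1.0)])]
run2 = [Unit("server", 4.2, [Unit("worker", 3.1), Unit(None, 1.1)])]

top_pairs, top_unmatched = match_by_name(run1, run2)
child_pairs, child_unmatched = match_by_name(run1[0].children, run2[0].children)
```

In this sketch the named process and the named thread are paired, while the unnamed thread is left unmatched even though its CPU time is nearly the same in both profiles.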
A need therefore exists for matching of processing units based on statistical analysis of the units and their child elements (automation of high-quality performance profiling by statistical means) wherein the abovementioned disadvantage may be alleviated.
Furthermore, applications in general tend to show differences in such profiles due to nondeterminism introduced by distinct random factors such as processor affinity and process scheduling. During profiling, those differences show up as different performance values for equivalent processing units. Since processing units are generally not labeled, those differences make it difficult to relate equivalent units over multiple profiling periods. These differences, which are random in the stochastic sense (that is, not arbitrary) over a set of profiles, may be called the internal noise (IN) of the profiles. It is desirable to reduce this noise and to estimate its dimension.
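As a rough illustration of estimating the size of the internal noise, the following sketch computes simple summary statistics for one processing unit over several profiles. It assumes that the unit has already been matched across the profiles and that its values scatter around a stable mean; the sample values and the function name `noise_estimate` are hypothetical, and this is only one of several possible estimators.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical CPU times (seconds) of one matched unit over five profiles.
samples = [2.00, 2.10, 1.90, 2.05, 1.95]


def noise_estimate(values):
    """Return (mean, sample standard deviation, coefficient of variation,
    standard error of the mean) for one unit over several profiles."""
    m = mean(values)
    s = stdev(values)                 # spread caused by internal noise
    cv = s / m                        # noise relative to the unit's size
    sem = s / sqrt(len(values))       # remaining noise after averaging
    return m, s, cv, sem


m, s, cv, sem = noise_estimate(samples)
```

Under these assumptions, the standard deviation serves as an estimate of the noise, and averaging over more profiles reduces the remaining uncertainty of the mean.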
In addition to such internal noise, there sometimes exists what may be called external noise. Such noise is characterized by a very high impact on the profile data in comparison to the impact of internal noise. External noise is unexpected and is caused by processes/applications other than the profiler process and the applications being profiled. For example, in runtime environments based on virtual machines, a garbage collector might cause significant external noise. In contrast to internal noise, external noise has no stochastic distribution (it is arbitrary, not random). A performance profile that contains significant external noise is considered not clean, and may simply be called a bad profile or a bad run. It is desirable to identify bad runs.
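One possible way to flag bad runs is sketched below, purely as an illustration. It assumes that external noise shows up as a total value lying far outside the spread produced by internal noise, and it uses a common robust outlier heuristic (the modified z-score based on the median absolute deviation). The function name `flag_bad_runs`, the threshold of 3.5, and the sample totals are assumptions made for this sketch.

```python
from statistics import median


def flag_bad_runs(totals, threshold=3.5):
    """Return the indices of runs whose modified z-score exceeds threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation of the totals.
    """
    med = median(totals)
    mad = median(abs(t - med) for t in totals)
    if mad == 0:
        # At least half of the runs equal the median; flag every run that differs.
        return [i for i, t in enumerate(totals) if t != med]
    return [
        i for i, t in enumerate(totals)
        if 0.6745 * abs(t - med) / mad > threshold
    ]


# Hypothetical total CPU times (seconds) of six profiling runs.
totals = [10.1, 9.9, 10.0, 10.2, 14.8, 9.8]
bad = flag_bad_runs(totals)
```

With these sample values only the fifth run (index 4) is flagged; the remaining runs differ by amounts consistent with internal noise.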