1. Field
The present embodiments relate to determining similarities in computer software codes for use in execution performance analysis.
2. Description of the Related Art
Characterising the performance of computer software codes (applications/executables), and finding similarities between their execution profiles, relies on interpreting the output of profilers, i.e. software tools that carry out a profiling process. Profiling is a form of program analysis that measures a variety of metrics characterising a software application, with a focus on run-time behaviour for a particular input. Non-exhaustive examples of such metrics are the elapsed time from the beginning to the end of program execution, and the communication time of a distributed parallel program as a percentage of its total elapsed time. However, profiler output is typically not straightforward to follow, and often requires deep knowledge of the profiler's functionality. Moreover, a user needs to be fluent in the metrics such profiling tools produce, which are typically presented in a text-based format, in order to read, analyse and compare more than one profile. As a result, users face a steep learning curve in acquiring the knowledge needed to understand the working details of the profiling technologies. In addition, since analysing these profiling results manually can be laborious and time-consuming, it can adversely affect users' productivity.
Further, it is often the case that different versions of the same code (application/executable) are to be characterised. These execution versions (or benchmarks) may differ in how they are run, possibly on different computing hardware and with different software settings, with different settings being applied for each benchmark before the associated profile (code performance) is obtained. As a result, different profilers may need to be used to capture the relevant metrics for these various execution environments/settings. The more exhaustive the profiling process, the greater the number of different benchmarks required, and the greater the volume of text-based information produced. Consequently, comprehending and processing the resulting wide-ranging metrics, produced in a text-based format, is similarly demanding.
Moreover, it is typically the case that each profiler has its own term for a given computing metric. For example, the three terms ‘elapsed time’, ‘CPU time’ and ‘run time’ may be used by three different profilers to represent the same quantity, namely ‘execution time’. Therefore, obtaining a standard format for representing a particular metric, and comparing its values against those produced by other benchmarks, which may themselves have been produced by different profiling tools, can be tedious and very inefficient. While some of these profiling technologies additionally provide visual interpretations (images) of codes' performance, so-called ‘trace files’, such images are not of a standard format across profilers. Each profiler has its own style for presenting the images that result from its trace analysis of codes' performance.
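By way of illustration only, the metric-naming problem described above can be sketched as a lookup from profiler-specific terms to a single canonical term. The following minimal Python sketch is not part of any profiler; the alias table, the function name normalise_metrics and the sample values are all hypothetical, and the mapping simply encodes the ‘elapsed time’/‘CPU time’/‘run time’ example given above.

```python
# Hypothetical alias table: each profiler-specific term is mapped to one
# canonical metric name. Only the example from the text is encoded here.
CANONICAL_NAMES = {
    "elapsed time": "execution time",
    "cpu time": "execution time",
    "run time": "execution time",
}


def normalise_metrics(raw):
    """Return a copy of `raw` with known metric aliases renamed.

    Names are lower-cased and runs of whitespace are collapsed before the
    lookup; names with no known alias are kept in that cleaned-up form.
    """
    normalised = {}
    for name, value in raw.items():
        key = " ".join(name.lower().split())
        normalised[CANONICAL_NAMES.get(key, key)] = value
    return normalised


# Illustrative (made-up) readings from three different profilers.
profiles = [
    {"Elapsed Time": 12.4},
    {"CPU time": 11.9},
    {"run  time": 12.1},
]

for profile in profiles:
    print(normalise_metrics(profile))
```

Even with such a table, each new profiler or metric requires a manually maintained entry, which is one reason the manual approach scales poorly as the number of benchmarks and profiling tools grows.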
Accordingly, it is desirable to provide an effective, automated and easy-to-use mechanism for finding similarities, and/or differences, in profiling metrics for software codes (different applications/executables and/or different instances of the same application/executable), for use in analysing execution performance.