The present disclosure relates to computer software, and more specifically, to a performance regression manager for large scale systems.
Any system (software, hardware, or both) must be tested thoroughly prior to release to ensure the highest quality and customer satisfaction. Large scale (also referred to as parallel or high performance) computing systems are no exception. Their unique scale and features add dimensions and complexity to the benchmarking space, as well as to the associated management and analysis of the generated data, so testing them requires substantial effort and resources. For example, various customers may require the execution of different benchmarks and the use of specific compilers and specific libraries. These machines can run with different rack configurations, numbers of compute nodes, processes per node, optimizations and communication protocols, and thread levels. While automated testing frameworks have simplified the benchmarking space, they have not provided the ability to quickly and accurately manage and analyze the generated data. Furthermore, existing performance testing frameworks are specific to a given domain and can only handle metrics defined in that domain.