High-performance computing (HPC) applications are typically run on a dedicated cluster. Significant delays can occur while waiting for the cluster to become available for exclusive use, and other applications can experience long wait times while an HPC application finishes. Also, HPC applications often require periodic synchronization and exhibit performance imbalance among their threads running on different nodes. Causes of this imbalance include non-uniform hardware, inherent workload and/or computation characteristics, and changes in the resources available to the threads because of other competing applications.
Running HPC applications on a non-dedicated cluster commonly impacts the non-HPC workload. Hence, the compute resources allocated to HPC applications across nodes should be optimized so that none are wasted. However, it is observed that nodes running slower HPC threads hold up the synchronization step even if other (faster) nodes have finished computation. This performance imbalance wastes compute resources, because the faster nodes sit idle until the slowest thread reaches the synchronization point. Another challenge with running HPC applications on non-dedicated clusters is that the compute resources available to HPC threads vary over time as the non-HPC workload executes.
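The wastage described above can be illustrated with a small calculation. The following is a minimal sketch, not part of any described system: the function name `barrier_waste` and the per-node compute times are hypothetical, and it assumes every node holds its allocated resources until the slowest node reaches the synchronization step.

```python
def barrier_waste(compute_times):
    """Estimate resource wastage for one synchronized step.

    compute_times: per-node compute time (seconds) before the
    synchronization step. Hypothetical illustrative input.
    Returns (step_time, idle_per_node, wasted_fraction).
    """
    # The step cannot complete until the slowest node finishes.
    step_time = max(compute_times)
    # Each faster node idles for the difference.
    idle_per_node = [step_time - t for t in compute_times]
    # Fraction of the allocated node-seconds spent waiting.
    wasted_fraction = sum(idle_per_node) / (step_time * len(compute_times))
    return step_time, idle_per_node, wasted_fraction


if __name__ == "__main__":
    # Three nodes finish in 8 s; one slower node takes 12 s.
    step_time, idle, wasted = barrier_waste([8.0, 8.0, 8.0, 12.0])
    print(step_time, idle, wasted)
```

In this illustrative case, a single slow node leaves a quarter of the allocated node-seconds idle, which is capacity that the non-HPC workload could otherwise have used.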
Existing approaches to running HPC applications on non-dedicated clusters include balancing the performance of HPC threads that suffer from performance imbalance. Such imbalance is caused either by competing threads from non-HPC applications, which change the resources available to the HPC threads, or by inherent workload imbalance among the different HPC threads. However, such approaches ignore the impact on the competing non-HPC workload.