Technical Field
Embodiments described herein relate to processing devices and, more particularly, to managing variations among nodes in parallel system frameworks.
Description of the Related Art
Parallel computing is the simultaneous execution of the same application or workload using multiple processing elements (e.g., nodes in a multi-node cluster) in order to obtain results faster. A parallel workload can be split up to be executed a piece at a time on many different nodes, and then put back together again at the end to get a data processing result.
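By way of illustration only, the split, execute, and recombine pattern described above can be sketched as follows. This is a minimal sketch and not part of any described embodiment: worker threads stand in for nodes of a cluster, and the names `split`, `process_piece`, and `run_parallel`, as well as the sum-of-squares workload, are hypothetical and chosen only for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, num_pieces):
    """Split data into num_pieces contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), num_pieces)
    chunks, start = [], 0
    for i in range(num_pieces):
        # The first `extra` chunks receive one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def process_piece(chunk):
    """Hypothetical per-piece work: sum of squares of the chunk."""
    return sum(x * x for x in chunk)


def run_parallel(data, num_workers=4):
    """Run the same function on every piece, then combine the partial results."""
    chunks = split(data, num_workers)
    # Worker threads stand in for cluster nodes (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(process_piece, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(run_parallel(list(range(10)), num_workers=3))
```

Each worker runs the same function on a different piece of the data, and the partial results are combined only after every piece has finished.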
Executing multi-node applications can lend itself to node variability, even when the nodes are relatively homogeneous. A task-based programming model aims to map tasks to different nodes, and many modern programming frameworks (e.g., Legion, HADOOP®) utilize the concept of a mapper to help load balance the system. These frameworks are based on the single program multiple data (SPMD) paradigm, in which a single program (i.e., task) runs on multiple nodes operating on different data. Existing job schedulers and task mappers that map or schedule tasks onto nodes do not take into account variations among the nodes. Instead, they assume during task scheduling that all nodes of the same product (server or processor) are homogeneous, which leads to poor choices of nodes and to sub-optimal performance and power consumption.
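The effect of the homogeneity assumption can be illustrated with a short sketch. It is an illustrative model only, not the behavior of any particular scheduler or framework: the round-robin mapper, the `makespan` helper, and the node speed values are all hypothetical.

```python
def round_robin_map(num_tasks, num_nodes):
    """Hypothetical mapper that treats every node as identical."""
    return [task % num_nodes for task in range(num_tasks)]


def makespan(assignment, task_work, node_speed):
    """Time at which the last node finishes its assigned tasks.

    Each task takes task_work / node_speed[node] time units on its node.
    """
    finish = [0.0] * len(node_speed)
    for node in assignment:
        finish[node] += task_work / node_speed[node]
    return max(finish)


if __name__ == "__main__":
    assignment = round_robin_map(num_tasks=8, num_nodes=4)
    # Assumed speeds: identical nodes versus one node running at half speed.
    print(makespan(assignment, 1.0, [1.0, 1.0, 1.0, 1.0]))
    print(makespan(assignment, 1.0, [1.0, 1.0, 1.0, 0.5]))
```

Under these assumed values, the even assignment finishes in 2.0 time units when all nodes run at the same speed, but takes 4.0 time units when one node runs at half speed, because the whole workload waits on the slowest node.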