The disclosed embodiments of the present invention relate to a task scheduler design, and more particularly, to a dynamic task scheduling method for dispatching sub-tasks to computing devices of a heterogeneous computing system and a related computer readable medium.
Multi-processor systems have become popular due to advances in semiconductor process technology. A heterogeneous computing system has processors that are not identical. For example, the heterogeneous computing system may include at least one first processor (e.g., one or more central processing units (CPUs)) and at least one second processor (e.g., one or more graphics processing units (GPUs)), where each first processor may have a first processor architecture (e.g., a first instruction set architecture), and each second processor may have a second processor architecture (e.g., a second instruction set architecture) that is different from the first processor architecture. Hence, if the same task runs on the first processor and the second processor, the instructions executed by the first processor are different from those executed by the second processor.
Several frameworks, such as OpenCL (Open Computing Language) and Heterogeneous System Architecture (HSA), have been developed to enable programs, each including one or more tasks, to run in a heterogeneous computing environment. OpenCL, for example, is a framework for writing programs that can be executed across heterogeneous platforms consisting of CPUs, GPUs and other processors (e.g., digital signal processors (DSPs)). Specifically, OpenCL is an open standard for parallel programming of heterogeneous computing systems. Typically, the computing device(s) of a heterogeneous computing system selected to run the tasks of an OpenCL program is (are) statically determined by the programmer. Furthermore, when a task of the OpenCL program is executed on multiple devices in parallel, the programmer needs to statically partition the task into sub-tasks according to the number of devices and assign one of the sub-tasks to each device.
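By way of illustration only, the static partitioning described above may be sketched as follows. The sketch is not OpenCL code and does not call any OpenCL API. It assumes a task that can be expressed as a contiguous range of independent work items, and the names static_partition, num_work_items and num_devices are hypothetical names introduced solely for this example.

```python
def static_partition(num_work_items, num_devices):
    """Split the range [0, num_work_items) into one contiguous sub-task per device.

    The split depends only on the number of devices and is fixed before
    execution; it does not consider how fast each device actually is.
    """
    base, extra = divmod(num_work_items, num_devices)
    sub_tasks = []
    start = 0
    for device_index in range(num_devices):
        # The first `extra` devices receive one additional work item.
        size = base + (1 if device_index < extra else 0)
        sub_tasks.append((device_index, start, start + size))
        start += size
    return sub_tasks


# Hypothetical example: 1000 work items divided among 3 devices.
assignment = static_partition(1000, 3)
for device_index, begin, end in assignment:
    print(f"device {device_index}: work items [{begin}, {end})")
```

In this sketch, each device receives exactly one sub-task, and the boundaries of the sub-tasks are determined solely by the number of devices.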
However, such a static task scheduler design with static task partitioning may lower the throughput of the heterogeneous computing system, and cannot guarantee load balance among the different processors in the heterogeneous computing system.
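The effect on throughput and load balance may be illustrated with a simple calculation. The following sketch assumes, purely for illustration, that each device processes work items at a constant rate and that a task is complete only when the slowest device finishes its sub-task. The rates below are hypothetical values chosen for the example, not measurements.

```python
def completion_time(sub_task_sizes, device_rates):
    """Return the time at which the last device finishes its sub-task.

    sub_task_sizes: work items assigned to each device.
    device_rates: work items per second each device can process (assumed constant).
    """
    return max(size / rate for size, rate in zip(sub_task_sizes, device_rates))


# Hypothetical rates in work items per second: one slower device, one faster device.
rates = [100.0, 400.0]

# Equal split by device count: the slower device becomes the bottleneck.
equal_split = completion_time([500, 500], rates)

# A split proportional to the assumed rates: both devices finish together.
proportional_split = completion_time([200, 800], rates)

print(equal_split, proportional_split)  # prints "5.0 2.0"
```

Under these assumed rates, the equal split leaves the faster device idle for most of the run, whereas a split matched to device capability completes the same 1000 work items in less than half the time.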