Heterogeneous computing can be used to split work among multiple heterogeneous processing devices, such as a central processing unit (CPU) and various kinds of accelerators, to reduce the processing time and power consumption for the work. Due to variability among the heterogeneous processing devices, balancing the work across them to achieve a desired efficiency for executing the work is difficult. Each heterogeneous processing device can be assigned a segment of a larger set of work. Some heterogeneous processing devices can complete the same amount of work, or more work, in less time than other heterogeneous processing devices. A heterogeneous processing device that finishes its assigned segment before the larger set of work is completed can end up waiting until the remaining segments being executed by the other heterogeneous processing devices are completed.
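The waiting described above can be illustrated with a small numerical sketch. It assumes the larger set of work is split into equal segments and that each device processes items at a fixed rate. The device names, the rates, and the `idle_times` helper are hypothetical and chosen only for illustration; they are not taken from any particular system.

```python
def idle_times(total_items, throughputs):
    """Split total_items equally among devices.

    throughputs maps a device name to an assumed fixed rate in items/second.
    Returns (makespan, idle), where makespan is the time at which the slowest
    device finishes and idle maps each device to the time it spends waiting.
    """
    share = total_items / len(throughputs)  # equal segment per device
    finish = {name: share / rate for name, rate in throughputs.items()}
    makespan = max(finish.values())  # the whole set is done when the slowest device is
    idle = {name: makespan - t for name, t in finish.items()}
    return makespan, idle


# Hypothetical devices and rates (items per second).
devices = {"cpu": 100.0, "gpu": 400.0, "dsp": 200.0}
makespan, idle = idle_times(1200, devices)

print(makespan)  # 4.0
print(idle)      # {'cpu': 0.0, 'gpu': 3.0, 'dsp': 2.0}
```

Under these assumed rates, the fastest device sits idle for three of the four seconds the larger set of work takes, which is the kind of imbalance the paragraph above describes.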