The capacity and utility of coprocessors, such as hardware accelerators, are increasing. Graphics, cryptography, mathematics, and streaming are important applications of coprocessors. Fabrication innovations such as a system on a chip (SoC) and a three-dimensional integrated circuit (3D IC) make coprocessors even more attractive, because coprocessors can be bundled with a central processing unit (CPU).
However, coprocessors may introduce coordination problems. Traditional ways of orchestrating coprocessors include interrupts, polling, and centralized control. For example, in the case of multicore symmetric multiprocessing (SMP), coordination mechanisms may include a core-to-core mailbox, a semaphore that is implemented in memory or a register, or an interrupt that is initiated by one core and directed to another core.
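As a rough illustration of the mailbox and semaphore mechanisms mentioned above, the following sketch models a one-slot mailbox guarded by two semaphores. It is only a software analogy under stated assumptions: Python threads stand in for cores, and the names `Mailbox`, `post`, `fetch`, and `run_demo` are hypothetical and do not come from any particular hardware interface.

```python
import threading


class Mailbox:
    """One-slot mailbox guarded by two semaphores (hypothetical illustration)."""

    def __init__(self):
        self._slot = None
        self._empty = threading.Semaphore(1)  # 1 while the slot is free to write
        self._full = threading.Semaphore(0)   # 1 while the slot holds an unread message

    def post(self, message):
        """Block until the slot is free, then store the message."""
        self._empty.acquire()
        self._slot = message
        self._full.release()

    def fetch(self):
        """Block until a message is present, then remove and return it."""
        self._full.acquire()
        message = self._slot
        self._slot = None
        self._empty.release()
        return message


def run_demo(items):
    """Send each item to a worker thread, which squares it.

    None is reserved as the shutdown sentinel (an assumption of this sketch),
    so items must not contain None.
    """
    mailbox = Mailbox()
    results = []

    def worker():
        while True:
            item = mailbox.fetch()
            if item is None:  # sentinel: no more work
                break
            results.append(item * item)

    thread = threading.Thread(target=worker)
    thread.start()
    for item in items:
        mailbox.post(item)
    mailbox.post(None)
    thread.join()
    return results
```

In this sketch the sender blocks whenever the receiver has not yet emptied the slot, which is one way that a coordination mechanism can idle one party while the other works.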
Centralized control of coprocessors by a CPU may impact system throughput by suboptimally idling either the CPU or the coprocessor. An interrupt may complicate the enforcement of a critical section of an application, may compromise the atomicity of a unit of work, and may cause priority inversion. Polling wastes processor cycles when the polled condition changes infrequently.
Centralized control also prevents multiple coprocessors from coordinating directly with each other as peers. As such, the interaction patterns that are supported by traditional coprocessing models do not offer the flexibility that is available to general-purpose software.
Suboptimal coordination of coprocessors can cause idling (dark silicon). Dark silicon may increase the unit cost of fabrication as amortized over a given amount of computation. For example, a computation that uses only half of the available coprocessors at a given moment might still need to budget for the cost of fabricating all of the available coprocessors. Underutilization may also increase latency and decrease system throughput.
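The amortization argument above can be made concrete with a small calculation. The sketch below assumes a simplified model in which the fabrication cost is fixed, every coprocessor contributes equally, and useful work scales linearly with utilization. The function name and parameters are hypothetical and chosen only for illustration.

```python
def cost_per_unit_of_work(fabrication_cost, coprocessors, utilization,
                          work_per_coprocessor=1.0):
    """Fabrication cost amortized over the useful work actually performed.

    Simplifying assumption: useful work equals
    coprocessors * utilization * work_per_coprocessor.
    """
    if coprocessors <= 0 or work_per_coprocessor <= 0:
        raise ValueError("coprocessors and work_per_coprocessor must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the interval (0, 1]")
    useful_work = coprocessors * utilization * work_per_coprocessor
    return fabrication_cost / useful_work


# Eight coprocessors with a total fabrication cost of 100 (arbitrary units).
fully_used = cost_per_unit_of_work(100.0, 8, 1.0)  # 12.5
half_used = cost_per_unit_of_work(100.0, 8, 0.5)   # 25.0
```

Under these assumptions, halving utilization doubles the amortized cost per unit of work, because all eight coprocessors are paid for regardless of how many are busy.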