Field of the Invention
This invention relates generally to the field of computer processors. More particularly, the invention relates to a generic, extensible instruction for low-latency invocation of accelerators.
Description of the Related Art
Invoking accelerators today requires going through a driver interface. In a system in which a hierarchical protection domain is used, this means switching to ring 0 and copying data to a different address space, which consumes significant time and processing resources. Due to the high latency, such accelerator interfaces are also inherently asynchronous. Programmable accelerators require the accelerated code to be implemented in their own instruction set architecture (ISA).
Some current processor architectures attempt to address some of these concerns but provide only a coarse-grained asynchronous mechanism with a high latency between the accelerated task request and its execution. In addition, current architectures use a non-x86 ISA, which requires a separate toolchain to generate the accelerated task and integrate it with the main x86 program.
In addition, current asynchronous hardware accelerators (e.g., GPUs) allow the accelerated task to execute independently of the application thread that triggered it. This allows the application thread to handle exceptions and/or interrupts without affecting the accelerated task, and even allows the application thread to migrate between cores without affecting where the accelerated task is located in the system.
Current synchronous hardware accelerators need to ensure that interrupts, exceptions, context switches and core migrations are still handled in a functionally correct manner and that forward progress is guaranteed. This is done by (1) ensuring the accelerated operation is short enough and does not cause any exceptions, so that any interrupts are deferred until the accelerator is done; (2) maintaining the accelerator's forward progress in existing architectural registers (e.g., REP MOVS); or (3) defining new architectural registers to hold the accelerator status, and adding them to the state handled by XSAVE/XRSTOR.
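By way of illustration only, option (2) may be modeled in software as sketched below. The sketch is not taken from any actual processor implementation; the names Regs, rep_movs, and run_with_interrupts, and the notion of a per-slice "budget" standing in for the arrival of an interrupt, are hypothetical and introduced solely for this example. The register names are borrowed from the x86 string-move convention (source index, destination index, remaining count). The point shown is that, because all progress is kept in registers that are already saved and restored on a context switch, an interrupted operation can simply be re-executed and will resume where it left off.

```python
from dataclasses import dataclass


@dataclass
class Regs:
    """Hypothetical model of the architectural registers holding copy progress."""
    rsi: int  # index of the next source byte
    rdi: int  # index of the next destination byte
    rcx: int  # number of bytes still to be copied


def rep_movs(mem: bytearray, regs: Regs, budget: int) -> bool:
    """Copy at most `budget` bytes, updating the registers after every byte.

    `budget` is an assumption of this sketch: it simulates an interrupt
    arriving after a bounded amount of work. Returns True once rcx is zero.
    """
    while regs.rcx > 0 and budget > 0:
        mem[regs.rdi] = mem[regs.rsi]
        regs.rsi += 1
        regs.rdi += 1
        regs.rcx -= 1
        budget -= 1
    return regs.rcx == 0


def run_with_interrupts(mem: bytearray, regs: Regs, budget_per_slice: int) -> int:
    """Re-issue the copy after each simulated interrupt; return the slice count."""
    slices = 0
    while True:
        slices += 1
        if rep_movs(mem, regs, budget_per_slice):
            return slices
        # Simulated context switch: only the register values are saved and
        # later restored. No accelerator-private state needs to survive.
        saved = (regs.rsi, regs.rdi, regs.rcx)
        regs = Regs(*saved)
```

In this model, a copy of eight bytes with a budget of three bytes per slice is interrupted twice and completes on the third invocation, with the destination holding the same bytes as the source. Option (3) differs only in that the progress state would reside in newly defined registers rather than existing ones.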