General purpose processing devices are able to perform almost any type of operation. However, there are many operations that a general purpose processing device performs inefficiently. Accordingly, such operations may be offloaded by the general purpose processing device to a hardware accelerator, which is a type of special purpose processing device that is configured to perform one or more operations quickly and efficiently. By offloading particular operations to one or more hardware accelerators, energy can be conserved and processing time can be reduced.
Conventional solutions for offloading operations from a general purpose processing device to a hardware accelerator have certain inefficiencies relating to notifications, data exchange, and data sharing between the general purpose processing device and the hardware accelerator. For example, some cache replacement policies may cause cache lines read by an accelerator to be marked as most-recently used, even though the data will not be used further. Such lines then remain in the cache longer than lines that are still in use, which may be evicted in their place. Numerous other inefficiencies also exist.
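
The cache replacement example above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not a model of any particular processor or accelerator: it assumes a single fully associative cache with least-recently-used (LRU) replacement, and the names `LRUCache`, `access`, `promote`, and `hot_hits` are hypothetical, introduced only for illustration. The `promote` flag selects whether an access marks the line as most-recently used.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with LRU replacement (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Keys are line addresses; order runs from LRU (first) to MRU (last).
        self.lines = OrderedDict()

    def access(self, addr, promote=True):
        """Access one line and return True on a hit.

        With promote=False, a hit leaves the recency order unchanged and a
        miss inserts the new line at the LRU end instead of the MRU end.
        """
        if addr in self.lines:
            if promote:
                self.lines.move_to_end(addr)  # mark as most-recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the LRU line
        self.lines[addr] = None  # new line lands at the MRU end
        if not promote:
            self.lines.move_to_end(addr, last=False)  # move it to the LRU end
        return False


def hot_hits(promote_accelerator_reads):
    """Count hits on the CPU's hot lines after an accelerator reads a buffer."""
    cache = LRUCache(capacity=8)
    buffer_lines = [100, 101, 102, 103]  # data handed to the accelerator
    hot_lines = [0, 1, 2, 3]             # data the CPU keeps reusing
    new_lines = [200, 201, 202, 203]     # later CPU traffic that forces evictions

    for addr in buffer_lines:            # CPU produces the buffer
        cache.access(addr)
    for addr in hot_lines:               # CPU touches its hot lines
        cache.access(addr)
    for addr in buffer_lines:            # accelerator reads the buffer once
        cache.access(addr, promote=promote_accelerator_reads)
    for addr in new_lines:               # CPU brings in new lines
        cache.access(addr)
    return sum(cache.access(addr) for addr in hot_lines)


if __name__ == "__main__":
    print("hot-line hits, accelerator reads promoted:    ", hot_hits(True))
    print("hot-line hits, accelerator reads not promoted:", hot_hits(False))
```

In this toy scenario, promoting the accelerator's reads leaves the already-consumed buffer lines at the most-recently-used end, so the later CPU traffic evicts the hot lines instead. When the reads are not promoted, the buffer lines are evicted first and the hot lines survive.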