Hardware acceleration involves the use of hardware to perform some functions more efficiently than software executing on a general-purpose CPU. A hardware accelerator is special-purpose hardware designed to implement hardware acceleration for some application. Example applications include neural networks, video processing (encoding, decoding, transcoding, etc.), network data processing, and the like. Software executing on the computing system interacts with the hardware accelerator through various drivers and libraries. One type of hardware accelerator includes a programmable device and associated circuitry. For example, the programmable device can be a field programmable gate array (FPGA) or a system-on-chip (SOC) that includes FPGA programmable logic among other subsystems, such as a processing system, data processing engine (DPE) array, network-on-chip (NOC), and the like.
In multiprocessing systems, thread synchronization can be achieved using mutex locks to avoid race conditions. Use of mutexes is common in software environments, where mutually exclusive access to shared data is achieved via atomic operations. Protocols such as Peripheral Component Interconnect Express (PCIe) and Cache Coherent Interconnect for Accelerators (CCIX) also provide support for atomic operations, which enables hardware acceleration kernels to obtain locks and compete with software threads. For systems that have multiple acceleration kernels operating in parallel, lock requests issued by the acceleration kernels to the host computer system can lead to unnecessary peripheral bus utilization and increased contention handling by the host computer. There is a need for a more efficient technique for handling access to shared data by multiple acceleration kernels in a hardware acceleration system.
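By way of illustration only, the software-side mutual exclusion described above can be sketched as a spinlock built on a single atomic test-and-set operation. The sketch below is a minimal example under stated assumptions, not an implementation of any particular system: the names `SpinLock` and `run_counter_demo` are hypothetical, and the code models only software threads on a host CPU. It does not model PCIe or CCIX atomic transactions issued by acceleration kernels.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spinlock: mutual exclusion built on one atomic flag.
class SpinLock {
public:
    void lock() {
        // test_and_set atomically sets the flag and returns its previous
        // value. A previous value of false means this thread took the lock.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // lock is held elsewhere; retry
        }
    }

    void unlock() {
        // Clearing the flag releases the lock to the next contender.
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical demo: several threads increment one shared counter,
// each increment guarded by the spinlock. Returns the final count.
long run_counter_demo(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);

    SpinLock lock;
    long counter = 0;  // shared data protected by the lock
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                lock.lock();
                ++counter;  // critical section
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

With the lock in place, a call such as `run_counter_demo(4, 10000)` returns exactly 40000, because no two threads ever update the counter at the same time. Every failed `test_and_set` is a retry against the same memory location, which suggests why lock requests from many contenders add traffic: in the hardware setting described above, each such attempt by an acceleration kernel would be a transaction over the peripheral bus to the host.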