A spinlock is a way to protect a shared resource from being simultaneously modified by two or more processes (e.g., application threads). The first process that tries to modify the resource “acquires” the spinlock and continues executing, utilizing the resource as needed. Any other process that subsequently tries to acquire the spinlock cannot proceed; it is said to “spin in place,” repeatedly checking the spinlock until the first process releases it.
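The acquire, spin, and release behavior described above can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the implementation used by any particular kernel: it relies on C11 atomics, and the names `demo_spinlock`, `demo_spin_init`, `demo_spin_trylock`, `demo_spin_lock`, and `demo_spin_unlock` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal spinlock: a single atomic flag. */
typedef struct {
    atomic_bool locked;   /* true while some thread holds the lock */
} demo_spinlock;

static void demo_spin_init(demo_spinlock *l) {
    atomic_init(&l->locked, false);
}

/* One acquisition attempt. The atomic exchange returns the previous
 * value, so a previous value of false means the caller now owns the lock. */
static bool demo_spin_trylock(demo_spinlock *l) {
    return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

/* "Spin in place": retry until the current holder releases the lock. */
static void demo_spin_lock(demo_spinlock *l) {
    while (!demo_spin_trylock(l)) {
        /* busy-wait */
    }
}

static void demo_spin_unlock(demo_spinlock *l) {
    /* Debug check: releasing a lock that is not held is a caller bug. */
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

A caller would bracket each access to the shared resource with `demo_spin_lock` and `demo_spin_unlock`. Note that even an uncontended acquisition performs one atomic read-modify-write operation.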
Operating system (e.g., Linux) kernels can use spinlocks when sending data to a particular peripheral. Most hardware peripherals cannot handle multiple simultaneous state updates. Therefore, if two different modifications to a given state have to happen, one has to strictly follow the other, as they cannot overlap. A spinlock provides the necessary protection, ensuring that the modifications happen one at a time.
Peripheral Component Interconnect Express (PCIe) hardware acceleration devices, such as InfiniBand peripherals, Fibre Channel peripherals, Non-Volatile Memory Express (NVMe) peripherals, and encryption or compression peripherals, typically rely on application threads and drivers to define completion event handles that a given application thread can wait on and that are signaled by hardware interrupt events. In some configurations, a completion queue is maintained in the operating system kernel and is protected by a spinlock.
In operation, an interrupt handler takes the spinlock, adds an entry to the completion queue, and releases the spinlock. The woken application thread takes the spinlock with hardware interrupts disabled, removes the entry from the queue, and then releases the spinlock and re-enables interrupts. Taking the spinlock is needed to protect the completion queue from corruption when an application thread is reading data from the queue at the same moment that a new completion interrupt occurs. Taking the spinlock, even when uncontended, invokes an atomic operation and can severely impact performance on systems with many (e.g., tens of) processors, since atomic operations typically prevent all other processors from accessing the memory bus and a particular cache line. Atomic operations can be particularly expensive (i.e., resource intensive) on non-uniform memory access (NUMA) systems.
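The take, modify, and release sequence on the completion queue described above can be sketched as follows. This is a user-space illustration only, under several assumptions: the queue is a fixed-size ring of integer event identifiers, the lock is a simple C11 atomic flag, and the names `completion_queue`, `CQ_CAPACITY`, `cq_init`, `cq_post`, and `cq_take` are hypothetical. Disabling and re-enabling hardware interrupts cannot be expressed in user space, so those steps appear only as comments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define CQ_CAPACITY 8   /* hypothetical ring size */

typedef struct {
    atomic_bool lock;            /* spinlock protecting the fields below */
    int entries[CQ_CAPACITY];    /* ring buffer of completion event ids */
    size_t head;                 /* index of the oldest entry */
    size_t count;                /* number of queued entries */
} completion_queue;

static void cq_init(completion_queue *q) {
    atomic_init(&q->lock, false);
    q->head = 0;
    q->count = 0;
}

/* Each acquisition is an atomic read-modify-write, even when uncontended. */
static void cq_lock(completion_queue *q) {
    while (atomic_exchange_explicit(&q->lock, true, memory_order_acquire)) {
        /* spin until the holder releases the lock */
    }
}

static void cq_unlock(completion_queue *q) {
    atomic_store_explicit(&q->lock, false, memory_order_release);
}

/* Stands in for the interrupt handler: take the lock, add an entry,
 * release the lock. Returns false if the ring is full. */
static bool cq_post(completion_queue *q, int event_id) {
    bool ok = false;
    cq_lock(q);
    if (q->count < CQ_CAPACITY) {
        q->entries[(q->head + q->count) % CQ_CAPACITY] = event_id;
        q->count++;
        ok = true;
    }
    cq_unlock(q);
    return ok;
}

/* Stands in for the woken application thread: take the lock, remove the
 * oldest entry, release the lock. A kernel would disable interrupts
 * before locking and re-enable them after unlocking. Returns false if
 * the queue is empty. */
static bool cq_take(completion_queue *q, int *event_id) {
    bool ok = false;
    assert(event_id != NULL);
    cq_lock(q);
    if (q->count > 0) {
        *event_id = q->entries[q->head];
        q->head = (q->head + 1) % CQ_CAPACITY;
        q->count--;
        ok = true;
    }
    cq_unlock(q);
    return ok;
}
```

In this sketch, every call to `cq_post` or `cq_take` performs at least one atomic exchange on the word holding the lock, which corresponds to the per-operation cost discussed above.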
The description above is presented as a general overview of related art in this field and should not be construed as an admission that any of the information it contains constitutes prior art against the present patent application.