Traditional computer systems use an operating system (OS) kernel-level storage stack for mediating application access to physical storage devices such as magnetic disks, flash-based disks, and so on. For example, FIG. 1 depicts a computer system 100 comprising a number of application instances (i.e., processes) 102(1)-(N) running in user space 104 and a storage stack 106 running in kernel space 108. As used herein, “user space” refers to the portion of system memory that is dedicated to user processes, whereas “kernel space” refers to the portion of system memory that is dedicated to the OS kernel and kernel extensions/drivers. Storage stack 106, which includes a file system layer 110 and a storage device driver 112, is communicatively coupled with a physical storage device 114.
When a given application instance 102 wishes to issue an Input/Output (I/O) request to storage device 114, the application instance invokes an OS system call that is exposed by storage stack 106. This invocation causes the system CPU handling the system call to execute a context switch from user mode to kernel mode. While the system CPU is in kernel mode, storage stack 106 processes the I/O request by communicating with storage device 114 and generates an appropriate response for the calling application instance. The system CPU then executes another context switch from kernel mode back to user mode so that the calling application instance can receive the response and continue its runtime operation.
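By way of illustration only, the following minimal sketch shows an application instance issuing a read request through the kernel-level path described above. It assumes a POSIX-like OS in which Python's `os.open`, `os.pread`, and `os.close` are thin wrappers around the corresponding system calls. The helper names `read_block` and `demo` and the temporary file are hypothetical and are not part of computer system 100.

```python
import os
import tempfile


def read_block(path, offset, length):
    """Read `length` bytes starting at `offset` via the kernel storage stack.

    Each os.* call below wraps a system call, so each one involves a
    transition from user mode to kernel mode and back.
    """
    fd = os.open(path, os.O_RDONLY)          # open(2)
    try:
        return os.pread(fd, length, offset)  # pread(2): the I/O request itself
    finally:
        os.close(fd)                         # close(2)


def demo():
    # Hypothetical scratch file standing in for data on storage device 114.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"0123456789abcdef")
        path = f.name
    try:
        return read_block(path, offset=4, length=6)
    finally:
        os.unlink(path)
```

In this sketch, the single call to `read_block` crosses the user/kernel boundary three times in each direction, which is the per-request overhead discussed in the following paragraph.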
One benefit of kernel-level storage stack 106 is that, due to its function as a centralized mediator of I/O requests issued by application instances 102(1)-(N), it can easily implement caching of the data accessed by these multiple application instances in a shared data cache. Such a shared data cache allows for improved I/O performance in scenarios where application instances 102(1)-(N) access overlapping sets of data and enables more efficient cache space usage in comparison to individual, application-specific caches. However, a significant disadvantage of kernel-level storage stack 106 is that it incurs a context switching overhead for each I/O operation as described above, which can degrade the I/O performance of certain application workloads and can potentially bottleneck the I/O performance of future, high-speed storage devices.
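The cache-efficiency point above can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a description of how storage stack 106 implements caching: the least-recently-used (LRU) eviction policy, the round-robin interleaving of requests, the equal split of capacity among private caches, and the names `LRUCache` and `count_device_reads` are all chosen here for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity block cache with least-recently-used eviction (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, block):
        """Return True on a hit; on a miss, cache the block and return False."""
        if block in self.blocks:
            self.blocks.move_to_end(block)   # mark as most recently used
            return True
        self.blocks[block] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return False


def count_device_reads(traces, total_capacity, shared):
    """Count cache misses (i.e., reads that must go to the storage device).

    traces: one list of block numbers per application instance (equal lengths).
    shared: True for one cache of total_capacity used by all instances,
            False for per-instance caches that split total_capacity equally.
    """
    n = len(traces)
    if shared:
        caches = [LRUCache(total_capacity)] * n  # every instance uses the same cache
    else:
        caches = [LRUCache(total_capacity // n) for _ in range(n)]
    misses = 0
    for step in zip(*traces):                    # round-robin interleaving
        for app, block in enumerate(step):
            if not caches[app].access(block):
                misses += 1
    return misses


# Two instances that each read blocks 0-3 twice (fully overlapping data).
traces = [[0, 1, 2, 3, 0, 1, 2, 3], [0, 1, 2, 3, 0, 1, 2, 3]]
shared_reads = count_device_reads(traces, total_capacity=4, shared=True)
private_reads = count_device_reads(traces, total_capacity=4, shared=False)
```

With these inputs, the shared cache needs 4 device reads, whereas the two private caches together need 16, because each private cache holds only two blocks and stores its own copy of the overlapping data.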
To avoid this context switching overhead, there are a number of emerging technologies that enable a feature known as “kernel bypass” (sometimes referred to as “user-level data plane” or “user-level I/O” processing). With kernel bypass, applications can make use of I/O stacks that reside in user space (i.e., within the virtual address spaces of the applications) rather than in kernel space. Thus, kernel bypass effectively offloads I/O handling from the kernel level to the application (i.e., user) level. This allows applications to interact directly with physical I/O devices, such as storage devices and network adapters, without kernel involvement, which in turn eliminates the need to perform context switching on a per-I/O basis. However, because kernel bypass decentralizes I/O processing, computer systems that implement this feature no longer have a central I/O mediator (like kernel-level storage stack 106 of FIG. 1) that can perform shared data caching across multiple concurrent application instances.