In certain computing scenarios where multiple user processes access a shared storage volume, it may be desirable to implement an “arbitrator” user process that receives input/output (I/O) requests issued by the other user processes and executes the requests against the shared storage volume on behalf of those processes. For example, consider a scenario where the shared storage volume functions as a cache for data that is read and written by the user processes. In this case, the arbitrator process can track and prioritize the I/O requests received from the various user processes in order to, e.g., create/update cache mappings, implement cache eviction policies, and so on.
In a conventional I/O workflow involving a user process A and an arbitrator process B as described above, process A typically sends an I/O request (e.g., a read request) to process B, which forwards the read request to a kernel I/O stack that interfaces with the target storage volume. Upon execution of the read request at the storage tier, process B receives the read data from the kernel I/O stack and stores the data in a memory buffer that is local to (i.e., is in the memory space of) process B. Finally, process B notifies process A that the read request has been completed, and the read data is copied from the memory buffer in process B to another memory buffer that is local to process A for consumption by process A.
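The two buffers involved in the conventional read path described above can be illustrated with a minimal sketch. It is a single-process model only, under stated assumptions: the functions `arbitrator_read` and `client_read` are hypothetical stand-ins for process B and process A, an ordinary file stands in for the shared storage volume, and the inter-process request and notification messages are reduced to a plain function call.

```python
def arbitrator_read(path: str, offset: int, length: int) -> bytearray:
    """Hypothetical stand-in for process B: read from the volume into B's buffer."""
    b_buffer = bytearray(length)              # buffer local to process B
    # buffering=0 opens the file unbuffered, so the data is placed
    # directly into b_buffer rather than into an intermediate reader buffer.
    with open(path, "rb", buffering=0) as volume:
        volume.seek(offset)
        n = volume.readinto(b_buffer)         # copy 1: kernel I/O stack -> B's buffer
    del b_buffer[n:]                          # trim in place on a short read
    return b_buffer


def client_read(path: str, offset: int, length: int) -> bytearray:
    """Hypothetical stand-in for process A: issue a read through the arbitrator."""
    b_buffer = arbitrator_read(path, offset, length)   # request + completion notice
    a_buffer = bytearray(len(b_buffer))       # buffer local to process A
    a_buffer[:] = b_buffer                    # copy 2: B's buffer -> A's buffer
    return a_buffer
```

In this model the read data exists in `b_buffer` before it reaches `a_buffer`, which corresponds to the intermediate copy held in the memory space of process B.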
While the foregoing workflow is functional, it is also inefficient because it requires a copy of the read data intended for process A to be held, at least temporarily, in the memory space of process B. It is possible to work around this inefficiency by mapping a portion of process A's memory into process B via conventional memory mapping techniques. However, this can become impractical if process B needs to perform or arbitrate I/O on behalf of a large number of other processes. Because each mapped segment consumes part of process B's own address space, process B may run out of memory address space before it is able to map shared memory segments for all such processes.
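The memory mapping workaround mentioned above can be sketched as follows, using Python's standard `multiprocessing.shared_memory` module as one example of a conventional shared memory facility. This is an illustrative single-process model, not a definitive implementation: the function names are hypothetical, an ordinary file stands in for the shared storage volume, and both "processes" run as plain functions that attach to the same named segment.

```python
from multiprocessing import shared_memory


def arbitrator_read_into_shared(path: str, offset: int, length: int,
                                segment_name: str) -> int:
    """Hypothetical stand-in for process B: map A's segment and read into it."""
    # Attaching maps the segment into B's address space. One such mapping
    # is needed per client, which is the scaling limit noted above.
    seg = shared_memory.SharedMemory(name=segment_name)
    try:
        view = seg.buf[:length]
        try:
            with open(path, "rb", buffering=0) as volume:
                volume.seek(offset)
                return volume.readinto(view)  # data lands in shared memory
        finally:
            view.release()                    # required before seg.close()
    finally:
        seg.close()                           # unmap from B


def client_read_shared(path: str, offset: int, length: int) -> bytes:
    """Hypothetical stand-in for process A: own the segment and consume the data."""
    seg = shared_memory.SharedMemory(create=True, size=length)
    try:
        n = arbitrator_read_into_shared(path, offset, length, seg.name)
        # The bytes() copy exists only so this demo can free the segment;
        # a real consumer could use seg.buf in place.
        with seg.buf[:n] as view:
            return bytes(view)
    finally:
        seg.close()
        seg.unlink()
```

Here the arbitrator no longer holds a private copy of the read data, but it must keep a mapping for every client segment it serves, so its address space usage grows with the number of client processes.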