The architecture of most current personal computer (PC) systems, from desktop to server, may be conceptually and schematically illustrated in FIG. 1, to which reference is now made.
PC system 10 typically includes memory 20, which may be embedded within one or more processing units 12, or may be separate therefrom. Processing units 12 are typically coupled with IO devices 14[1]-14[i] via one or more IO buses 16, e.g., peripheral component interconnect (PCI) buses. Optionally, in order to speed up communication between processing units 12 and IO devices 14[1]-14[i], PC system 10 may also include one or more components, e.g., a north bridge unit 18, that communicate with processing units 12 and control the interaction with memory 20 and IO buses 16.
Processing unit 12 typically includes a Central Processing Unit (CPU) 26 that typically references virtual memory addresses in a virtual memory address space; these addresses are translated by the memory management unit (MMU) 24 into physical addresses.
Typically, when an IO device uses direct memory access (DMA) operations to write data to, or read data from, memory 20, that data is located in physical page frames, and a consumer, e.g., a hypervisor, an OS, or a process using IO and having its own virtual memory space, will make use of this data. The consumer typically accesses the data through a virtual address, which is translated by the MMU 24 to a physical address.
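By way of illustration only, the virtual-to-physical translation described above may be sketched as a lookup in a single-level page table. The page size, the dictionary-based table, and the name `translate` are assumptions made for this sketch and do not describe any particular MMU.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (illustrative only)


def translate(page_table, vaddr):
    """Translate a virtual address to a physical address.

    page_table is a hypothetical mapping from virtual page number
    to physical page frame number.
    """
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        raise LookupError(f"page fault: virtual page {vpn} is not mapped")
    # The offset within the page is unchanged by translation.
    return page_table[vpn] * PAGE_SIZE + offset


# Hypothetical mapping: virtual page 2 resides in physical frame 7.
page_table = {2: 7}
paddr = translate(page_table, 2 * PAGE_SIZE + 0x10)
```

In this sketch the consumer only ever holds the virtual address, whereas an IO device performing DMA would hold the physical address returned by the lookup.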
The translation from virtual address space to physical address space is required to remain stable for as long as the DMA operation is in progress, e.g., it cannot refer to another physical page for as long as the DMA is in progress. If this requirement is violated, data corruption may occur. This requirement is referred to hereinafter as memory pinning.
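The pinning requirement may be sketched, under simplifying assumptions, as a pin count kept per virtual page, where a remapping attempt is refused while any pin is outstanding. The class `AddressSpace` and its methods are hypothetical names introduced for this sketch only; a real OS or hypervisor tracks this state differently.

```python
class AddressSpace:
    """Toy address space that refuses to remap a pinned virtual page."""

    def __init__(self):
        self.page_table = {}  # virtual page number -> physical frame number
        self.pin_count = {}   # virtual page number -> outstanding pins

    def map(self, vpn, frame):
        self.page_table[vpn] = frame

    def pin(self, vpn):
        """Pin a page for DMA and return the frame the device may use."""
        frame = self.page_table[vpn]  # raises KeyError if unmapped
        self.pin_count[vpn] = self.pin_count.get(vpn, 0) + 1
        return frame

    def unpin(self, vpn):
        """Release one pin once the DMA operation has completed."""
        self.pin_count[vpn] -= 1
        if self.pin_count[vpn] == 0:
            del self.pin_count[vpn]

    def remap(self, vpn, new_frame):
        """Change the translation; not allowed while a DMA is in progress."""
        if self.pin_count.get(vpn, 0) > 0:
            raise RuntimeError(f"virtual page {vpn} is pinned")
        self.page_table[vpn] = new_frame


space = AddressSpace()
space.map(vpn=3, frame=11)

frame = space.pin(3)       # DMA starts; the device is given frame 11
try:
    space.remap(3, 12)     # would redirect the consumer away from frame 11
    remap_blocked = False
except RuntimeError:
    remap_blocked = True

space.unpin(3)             # DMA completes
space.remap(3, 12)         # the translation may now change safely
```

Without the check in `remap`, the device would continue to access frame 11 while the consumer's virtual address resolved to frame 12, which is the data corruption scenario noted above.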
Known methods of memory pinning typically communicate physical addresses to the IO device. These methods typically include translation from virtual address to physical address at the upper layers of system 10, e.g., at the OS or the hypervisor (or another component that may manage the operating systems of system 10 and allocate the IO memory space per consumer, not shown in FIG. 1), and pinning the translation. As a result, the IO path is longer and slower, since the involvement of the upper layers is necessary for each DMA operation. Alternatively, the translation is done in advance, at the memory space of the consumer, by pre-registering all of the required physical memory. The drawback of this approach is the “waste” of physical memory, which is a precious resource: as long as a page frame is pinned for a given process, no other process can make use of it, even if no IO operation is executed to or from that page frame.
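The trade-off between the two known approaches may be illustrated with a simple cost count. The sketch below assumes that DMA operations run one at a time and that each pin or registration request costs one call into the upper layers; the function names and the workload are hypothetical and carry no measured data.

```python
def per_operation_pinning(dma_ops):
    """Translate and pin at the upper layers on every DMA operation.

    dma_ops is a list of DMA operations, each given as the list of
    virtual pages it touches. Returns (upper_layer_calls, peak_pinned).
    """
    upper_layer_calls = 0
    peak_pinned = 0
    for pages in dma_ops:
        upper_layer_calls += 1  # OS or hypervisor involved in this DMA
        peak_pinned = max(peak_pinned, len(set(pages)))
        # Pages are unpinned again when the operation completes.
    return upper_layer_calls, peak_pinned


def pre_registration(dma_ops, registered_pages):
    """Translate and pin all required memory once, in advance.

    Returns (upper_layer_calls, pinned_frames); the registered frames
    stay pinned whether or not any IO targets them.
    """
    for pages in dma_ops:
        if not set(pages) <= registered_pages:
            raise ValueError("DMA targets a page that was not registered")
    return 1, len(registered_pages)


# Hypothetical workload: three DMA operations over two pages,
# with an eight-page buffer registered up front.
dma_ops = [[0], [1], [0, 1]]
per_op = per_operation_pinning(dma_ops)
pre_reg = pre_registration(dma_ops, set(range(8)))
```

Under these assumptions, per-operation pinning involves the upper layers three times but never holds more than two frames, whereas pre-registration involves them once but keeps all eight frames unavailable to other processes.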