Field
The disclosure is generally related to techniques for mapping device addresses to physical memory addresses.
Related Art
Information technology (IT) systems frequently include one or more peripheral component interconnect (PCI) devices, e.g., PCI-express devices. Typical PCI devices are external peripheral devices, e.g., external storage devices, network devices, sound devices, etc. PCI devices access physical memory, usually implemented as random access memory (RAM), of an associated IT system using direct memory access (DMA), which provides an efficient way of accessing the memory. The main storage is administered by an operating system (OS) and explicitly provided to be used by the PCI devices.
In order to prevent a PCI device from accessing physical memory that is not assigned to the PCI device, PCI devices typically employ independent PCI addresses. An input/output memory management unit (IOMMU) may be provided for translating PCI addresses into addresses that refer to physical memory. An IOMMU can be implemented on each PCI device or as a central part of an IT system. PCI addresses may be identical to the addresses of underlying physical memory. PCI address space may be provided as a copy of physical memory or an abstraction layer may be provided for protecting main storage from unauthorized access. Typically, a PCI device provides internal device addresses that are used by applications or libraries using the PCI device, such that further address translation is required.
The operation of the IOMMU in address translation involves functionality of the operating system, which selects the physical memory to be used by the PCI device. If a hypervisor is running on the IT system, the hypervisor may also be involved in the IOMMU functionality. An IOMMU may be further configured to perform a plausibility check to restrict access of a PCI device to memory areas of a physical memory that are reserved for the PCI device. In the event a resource identifier (RID) is transmitted from a PCI device to an IOMMU to uniquely identify the PCI device, the IOMMU may verify that the requested PCI address is assigned to the PCI device. The IOMMU may, for example, be implemented in a field programmable gate array (FPGA) or an application specific integrated circuit (ASIC).
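The RID-based plausibility check described above can be sketched as follows. This is only an illustration under stated assumptions, not the mechanism of any particular IOMMU: the names ASSIGNED_RANGES and is_access_allowed, the RID values, and the address ranges are all hypothetical.

```python
# Hypothetical assignment table: RID -> list of (start, end) PCI address
# ranges, with end exclusive. All values are made up for illustration.
ASSIGNED_RANGES = {
    0x0100: [(0x1000_0000, 0x1001_0000)],
    0x0200: [(0x2000_0000, 0x2000_4000), (0x3000_0000, 0x3000_1000)],
}


def is_access_allowed(rid: int, pci_addr: int, length: int = 1) -> bool:
    """Return True if [pci_addr, pci_addr + length) lies entirely inside
    one range assigned to the device identified by rid."""
    if length <= 0:
        return False
    for start, end in ASSIGNED_RANGES.get(rid, []):
        if start <= pci_addr and pci_addr + length <= end:
            return True
    return False
```

In this sketch, a request from an unknown RID, or one that crosses the end of an assigned range, is rejected rather than partially served.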
According to a first approach, an IOMMU may be centrally implemented in an IT system. In this case, a PCI device first performs a translation of a device address to a PCI address and further transmits a request to access memory (based on the PCI address) to the IOMMU. The IOMMU translates (based on a translation table) the PCI address to a physical memory address and grants access to the physical memory. To reduce the size of the required translation table, physical memory can be provided in blocks of a given size and a translation unit can be employed for translating only a part of a PCI address that identifies a respective memory block. In this case, only the mapping for these blocks has to be done, which reduces the size of the mapping table.
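The block-based translation described above can be sketched as follows. The 4 KiB block size, the table contents, and the names BLOCK_SHIFT, TRANSLATION_TABLE, and translate are assumptions chosen for the example; the text does not specify them.

```python
BLOCK_SHIFT = 12                  # assumed block size: 4 KiB
BLOCK_SIZE = 1 << BLOCK_SHIFT
BLOCK_MASK = BLOCK_SIZE - 1

# Hypothetical translation table: PCI block number -> physical block number.
TRANSLATION_TABLE = {0x10: 0x7A, 0x11: 0x03}


def translate(pci_addr: int) -> int:
    """Translate a PCI address to a physical address.

    Only the block-identifying part of the address is looked up in the
    table; the offset within the block is carried over unchanged.
    """
    block = pci_addr >> BLOCK_SHIFT
    offset = pci_addr & BLOCK_MASK
    if block not in TRANSLATION_TABLE:
        raise PermissionError(f"no mapping for PCI block {block:#x}")
    return (TRANSLATION_TABLE[block] << BLOCK_SHIFT) | offset
```

Because each table entry covers a whole block, a table for 4 KiB blocks needs one entry per 4096 byte addresses instead of one entry per address.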
According to a second approach, an IOMMU (e.g., in the form of an FPGA or ASIC) may be fully implemented in each PCI device. Using the second approach, in the case of multiple PCI devices, multiple translation layers are implemented. Typically, according to the second approach, an IOMMU only verifies whether a PCI device is allowed to access the requested memory, and the PCI device provides PCI addresses that directly address physical memory. An access to physical memory that is not assigned to a PCI device is detected by an IOMMU, at which point the system deactivates the entire PCI device. A PCI device may implement a key that facilitates a plausibility check when translating a device address into a PCI address. The key is used internally in the PCI device to distinguish memory areas that belong to different areas of the user memory space of the PCI device.
As previously mentioned, a translation unit can be implemented in the PCI device to reduce the size of a mapping table. For example, the translation of the device address to the PCI address can be accomplished by taking a part of the device address, e.g., the upper 52 bits, as a basis for the PCI address in combination with a table-driven scheme. The physical memory address can be formed by taking the PCI address and adding a fixed offset for reading and/or writing data.
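The two-stage translation described above can be sketched as follows, assuming a 64-bit device address whose upper 52 bits are looked up in a table and whose lower 12 bits pass through unchanged. The table contents, the offset value, and the names DEVICE_TO_PCI, PHYS_OFFSET, device_to_pci, and pci_to_physical are hypothetical.

```python
ADDR_BITS = 64                    # assumed device address width
UPPER_BITS = 52                   # upper part used for the table lookup
LOW_BITS = ADDR_BITS - UPPER_BITS # remaining 12 bits pass through unchanged
LOW_MASK = (1 << LOW_BITS) - 1

# Hypothetical table: upper 52 bits of device address -> upper bits of PCI address.
DEVICE_TO_PCI = {0x5: 0x800}

# Hypothetical fixed offset added to a PCI address to form a physical address.
PHYS_OFFSET = 0x1_0000_0000


def device_to_pci(device_addr: int) -> int:
    """Table-driven translation of the upper bits; raises KeyError if unmapped."""
    upper = device_addr >> LOW_BITS
    lower = device_addr & LOW_MASK
    return (DEVICE_TO_PCI[upper] << LOW_BITS) | lower


def pci_to_physical(pci_addr: int) -> int:
    """Form the physical address by adding the fixed offset."""
    return pci_addr + PHYS_OFFSET
```

For example, under these assumptions the device address 0x5123 maps to the PCI address 0x800123, and pci_to_physical then yields 0x100800123.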
A disadvantage of the first and second approaches is that addressing errors are handled centrally on the system, which usually results in deactivation of the entire PCI device in the case of an error. For example, an error can occur when a requested PCI address does not belong to the requesting PCI device. Furthermore, the implementation of the IOMMU and the translation layer is resource consuming, especially when FPGAs are used to implement the IOMMU and the full translation layer is implemented on the PCI device. Moreover, performance of the system is reduced because the translation scheme is complicated and inefficient.
In conventional IT systems, the first and second approaches are usually combined, which results in double consumption of resources and further reduces the performance of the IT systems due to double translation. Accordingly, other applications running on an FPGA are limited to the remaining resources, e.g., on-chip memory of the FPGA.