In typical computer systems, input/output (IO) devices, especially networking devices, use contexts (context data) to service requests from different users. Because a device may support a very large number of contexts, only a small number are actually cached in the device; the others remain in a backing store and are fetched only when needed.
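The caching arrangement described above can be illustrated with a minimal software model. This is only a sketch under stated assumptions: the class name `ContextCache`, the least-recently-used eviction policy, and the dictionary used as a backing store are all hypothetical choices made for illustration, and nothing here describes how a particular device is implemented.

```python
from collections import OrderedDict


class ContextCache:
    """Hypothetical model: a small on-device cache in front of a larger backing store."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity            # number of contexts the device can hold
        self.backing_store = backing_store  # dict: context id -> context data
        self.cached = OrderedDict()         # least recently used entry comes first
        self.hits = 0
        self.misses = 0

    def lookup(self, context_id):
        """Return the context for a request, fetching it on a miss."""
        if context_id in self.cached:
            self.hits += 1
            self.cached.move_to_end(context_id)  # mark as most recently used
            return self.cached[context_id]

        # Context miss: in a real device this fetch is the slow path.
        self.misses += 1
        data = self.backing_store[context_id]
        if len(self.cached) >= self.capacity:
            # Assumed policy: evict the least recently used context and write it back.
            evicted_id, evicted_data = self.cached.popitem(last=False)
            self.backing_store[evicted_id] = evicted_data
        self.cached[context_id] = data
        return data
```

In this model, every call to `lookup` that ends in the miss branch stands for a request that cannot begin processing until the fetch from the backing store completes.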
Known computer systems use virtualization support in parts of the system, for example, the processing unit (PU), the IO controller (IOC), and external switches and devices. IO devices use context data within the virtualized device to provide virtualization support. Each interface to a device, and therefore device virtualization, is enabled by storing the information needed for each interface in a separate context. A mechanism that can be used without special support in the processor, and which relies only on the system supervisor or hypervisor, is referred to herein as software (SW) virtualization. Separation of different virtual interfaces from each other is achieved using virtual address support in a memory management unit. The IO device provides a number of virtual interfaces on contiguous pages in a real address space, each virtual interface having at least the size of the smallest page size of the system. The supervisor or hypervisor can map each interface into the virtual address space of a user. As many virtual interfaces as needed may be mapped into the virtual address space of the same user. A virtual interface can provide not only a doorbell mechanism (a doorbell register), but may also expose configuration registers to the user.
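The page-granular layout of virtual interfaces described above can be sketched as simple address arithmetic. The 4096-byte page size, the region base address used in the usage note, and the function names are assumptions chosen for illustration only; an actual device and system may differ.

```python
PAGE_SIZE = 4096  # assumed smallest page size of the system, in bytes


def interface_stride(interface_size, page_size=PAGE_SIZE):
    """Round an interface size up to whole pages (at least one page).

    Giving each interface its own page(s) lets the supervisor or hypervisor
    map one interface into a user's address space without exposing its neighbors.
    """
    pages = max(1, -(-interface_size // page_size))  # ceiling division
    return pages * page_size


def interface_address(region_base, index, stride):
    """Real address at which virtual interface `index` starts."""
    return region_base + index * stride


def interface_index(region_base, address, stride):
    """Virtual interface that a given real address falls into."""
    return (address - region_base) // stride
```

For example, with a hypothetical region base of `0x10000000` and 64-byte interfaces, `interface_stride(64)` yields one page per interface, and an access to `0x10003008` falls into interface 3.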
In a single-root host-virtualized system, a device can offer a number of virtual functions. The virtualized system is similar to the system used for software virtualization, but has hardware support in both the IO controller and the IO device. The device can therefore offer a number of interfaces to the user in the form of virtual functions. A virtual function is derived from a physical function; that is, virtual functions provide the same functionality as the physical function. However, each virtual function has its own configuration in the IO controller and the IO device and is therefore protected from the other virtual functions of the same type.
Another example of virtualization is multi-root virtualization. Using this technique, different peripheral component interconnect (PCI) root-complexes can use the same IO device. Multi-root virtualization is implemented using mechanisms in the switch(es) connecting the IO devices with the root complexes and the IO device itself. It is therefore invisible to the root complexes.
Handling context data in a device is important, especially if the contexts are relatively large, for example, several cache lines. Without the appropriate context data, the IO device cannot start the real processing task. Therefore, the latency caused by a context miss cannot be hidden by other processing tasks for the request. Context cache miss penalties in terms of latency can be as high as the packet latency itself. Increased numbers of cores and users accessing the device result in an increasing probability of context misses in the device. The latency impact will most probably increase in multi-root environments, which frequently feature more than one switch between the root complex and the device. Furthermore, context cache misses are difficult to handle in hardware and necessitate queues in the IO device in order to avoid creating back pressure into the processor.
Virtualization and its scalability are achieved using context data, and support for every virtual function or interface necessitates a separate context. In many cases, the amount of context data is too large to fit into the cache of an IO device. Context data not in use is stored in memory attached to the IO device or in the system main memory, the so-called backing store. The use of system memory causes the above-described long fetch latencies (delays) for context data, which cannot be hidden efficiently. Further, attaching memory to the device is expensive and can be cost prohibitive.
Context misses can have a high impact on the cumulative delay times in a system, for example in network packet processing. IO device context data is typically stored in main memory, as opposed to data used for processing which may reside in a system cache prior to being requested by the device. Thus, context data is typically subject to long memory latency (or time delays). As a result, context misses may inflict the longest latency during packet processing in the system.
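The effect of context misses on cumulative delay can be made concrete with a simple expected-value calculation. This is a rough sketch: it assumes the miss penalty adds directly to the processing latency because it cannot be overlapped with other work for the request, and the function name and the numbers in the usage note are illustrative assumptions, not measurements.

```python
def expected_latency(base_latency, miss_rate, miss_penalty):
    """Average per-packet latency when a fraction `miss_rate` of requests
    must first wait `miss_penalty` for their context to be fetched.

    All latencies use the same (arbitrary) time unit.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return base_latency + miss_rate * miss_penalty
```

If the miss penalty is as high as the packet latency itself, a request that misses takes twice as long as one that hits, and under this model a miss rate of 25 percent raises the average latency by a quarter.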
It would therefore be desirable to resolve context misses as quickly as possible. It would further be desirable to facilitate operation in an IO device by sufficiently delaying a doorbell until the context miss is resolved.