Many companies and other organizations operate computer networks that interconnect numerous computing systems to support their operations, such as with the computing systems being co-located (e.g., as part of a local network) or instead located in multiple distinct geographical locations (e.g., connected via one or more private or public intermediate networks). For example, data centers housing significant numbers of interconnected computing systems have become commonplace, such as private data centers that are operated by and on behalf of a single organization, and public data centers that are operated by entities as businesses to provide computing resources to customers or clients. Some public data center operators provide network access, power, and secure installation facilities for hardware owned by various clients, while other public data center operators provide “full service” facilities that also include hardware resources made available for use by their clients. However, as the scale and scope of typical data centers have increased, the tasks of provisioning, administering, and managing the physical computing resources have become increasingly complicated.
The advent of virtualization technologies for commodity hardware has provided benefits with respect to managing large-scale computing resources for many clients with diverse needs, allowing various computing resources to be efficiently and securely shared by multiple clients. For example, virtualization technologies may allow a single physical computing machine to be shared among multiple users by providing each user with one or more virtual machines hosted by the single physical computing machine, with each such virtual machine being a software simulation acting as a distinct logical computing system that provides users with the illusion that they are the sole operators and administrators of a given hardware computing resource, while also providing application isolation and security among the various virtual machines. Furthermore, some virtualization technologies are capable of providing virtual resources that span two or more physical resources, such as a single virtual machine with multiple virtual processors that spans multiple distinct physical computing systems. With virtualization, the single physical computing device can create, maintain or delete virtual machines in a dynamic manner. In turn, users can request computer resources from a data center and be provided with varying numbers of virtual machine resources on an “as needed” basis or at least on an “as requested” basis.
Today, a common way to implement virtualization for peripheral devices is to run a process in a virtual machine (or hypervisor) on the main server cores of the system on which other virtual machines are running on behalf of guests. The process traps all of the accesses to the virtual hardware for the peripheral devices and then emulates those devices in software. With this approach, the software that is responsible for emulating the peripheral devices can sometimes cause jitter and variability in performance for the guests that are running on the same machine. In addition, for an infrastructure provider that implements this approach, the processing capacity of the processor cores that are running the emulation software is not available for sale or lease to customers.
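The trap-and-emulate approach described above can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the `EmulatedSerialPort` device, its register offsets, the `TrapHandler` class, and the guest address `0x1000` do not correspond to any particular hypervisor or device. The sketch only shows that each guest access to virtual hardware is intercepted and serviced by software running on the host's own processor cores.

```python
from dataclasses import dataclass, field


@dataclass
class EmulatedSerialPort:
    """Hypothetical software model of a peripheral with two registers."""

    DATA = 0x0    # write: transmit one byte (assumed layout)
    STATUS = 0x4  # read: bit 0 set means "ready to transmit" (assumed layout)

    transmitted: bytearray = field(default_factory=bytearray)

    def read(self, offset: int) -> int:
        # The emulated device is always ready in this sketch.
        return 0x1 if offset == self.STATUS else 0

    def write(self, offset: int, value: int) -> None:
        if offset == self.DATA:
            self.transmitted.append(value & 0xFF)


class TrapHandler:
    """Hypothetical stand-in for the process that traps guest accesses."""

    def __init__(self) -> None:
        self.regions = []    # list of (base, size, device)
        self.trap_count = 0  # each trap consumes host CPU time

    def map_device(self, base: int, size: int, device) -> None:
        self.regions.append((base, size, device))

    def _lookup(self, addr: int):
        for base, size, device in self.regions:
            if base <= addr < base + size:
                return device, addr - base
        raise ValueError(f"unmapped guest address {addr:#x}")

    def on_guest_read(self, addr: int) -> int:
        self.trap_count += 1
        device, offset = self._lookup(addr)
        return device.read(offset)

    def on_guest_write(self, addr: int, value: int) -> None:
        self.trap_count += 1
        device, offset = self._lookup(addr)
        device.write(offset, value)


# A guest polls the status register, then writes two bytes.
handler = TrapHandler()
port = EmulatedSerialPort()
handler.map_device(0x1000, 0x8, port)

if handler.on_guest_read(0x1004) & 0x1:
    for byte in b"hi":
        handler.on_guest_write(0x1000, byte)
```

In this sketch, three guest accesses produce three traps, each handled in software. This is the work that, in the approach described above, competes with guests for the main server cores.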
Many peripheral devices are compliant with the PCI Express (Peripheral Component Interconnect Express) bus standard. PCI Express (also referred to as PCIe) is a high-speed serial computer expansion bus standard, some versions of which support hardware I/O virtualization. In general, a PCIe bus supports full-duplex communication between any two endpoints, with data encapsulated in packets. Traditionally, a PCIe endpoint presents a single PCIe interface to the host. Typically, when the PCIe endpoint is connected to a multi-socket host server, it is physically connected to only one of the processor sockets (e.g., through one PCIe expansion slot). In this case, the socket that is not directly connected to the PCIe endpoint must relay PCIe traffic to the PCIe endpoint through the socket that is directly connected to the PCIe endpoint. This can increase latency and jitter (e.g., non-determinism of latency) due to the dynamic queuing effects of various links and buffers in the relay path.
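The effect of the relay path on latency and jitter can be sketched with a toy model. The hop names and all nanosecond values below are arbitrary placeholders chosen for illustration, not measurements of any real system, and the `Hop`, `path_to_endpoint`, and `latency_bounds` names are hypothetical. The model assumes that each hop has a fixed traversal cost plus a variable queuing delay, and that jitter is the spread between the best and worst cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    name: str
    base_ns: int       # fixed traversal cost (placeholder value)
    max_queue_ns: int  # worst-case extra queuing delay (placeholder value)


def path_to_endpoint(source_socket: int, attached_socket: int) -> list:
    """Return the hops a request crosses to reach the PCIe endpoint."""
    pcie = Hop("pcie_link", base_ns=500, max_queue_ns=100)
    if source_socket == attached_socket:
        return [pcie]
    # A socket that is not directly attached relays through the attached one.
    inter_socket = Hop("inter_socket_link", base_ns=150, max_queue_ns=400)
    return [inter_socket, pcie]


def latency_bounds(path: list) -> tuple:
    """Best-case and worst-case latency in nanoseconds for a path."""
    best = sum(hop.base_ns for hop in path)
    worst = best + sum(hop.max_queue_ns for hop in path)
    return best, worst


# The endpoint is attached to socket 0 in this sketch.
direct = latency_bounds(path_to_endpoint(0, 0))
relayed = latency_bounds(path_to_endpoint(1, 0))

direct_jitter = direct[1] - direct[0]
relayed_jitter = relayed[1] - relayed[0]
```

Under these assumed values, the relayed path has both a higher best-case latency and a wider spread between best and worst case, because it adds a second link whose queuing delay varies independently.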
While embodiments are described herein by way of example for several embodiments and illustrative drawings, those skilled in the art will recognize that the embodiments are not limited to the embodiments or drawings described. It should be understood that the drawings and detailed description thereto are not intended to limit embodiments to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope as defined by the appended claims. The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include,” “including,” and “includes” mean including, but not limited to.