1. Field of the Invention
The present invention is generally related to networked virtual computer systems and in particular to an architecture and methods of providing TCP/IP offload engine support in virtual computer systems.
2. Description of the Related Art
Virtual computer systems are conventionally recognized as providing a variety of practical benefits, including more efficient use of hardware resources, improved opportunity for security and management control over executing applications, and the ability to support multiple discrete if not wholly independent execution environments. Consequently, interest in the architectural development of virtual computer systems, particularly in the areas of supporting new, performance-enhancing hardware, and extending those performance enhancements to the individual execution environments, is substantial and ongoing.
In general terms, virtual computer systems are typically based on a conventional hardware platform providing one or more central processing units, a main memory, various persistent storage devices, and one or more network interface controllers (NICs), potentially of different design and functional capabilities. The hardware platform is used to support execution of a typically dedicated operating system kernel that, in turn, implements various virtualization drivers and services that enable multiple virtualization environments to be executed under the control of the virtualization kernel. A conventional host computer operating system can, in the alternative, be employed in place of the dedicated operating system kernel.
The virtualization environments supported by the kernel may be fully isolated execution spaces that, in turn, each encapsulate a network operating system instance and application program execution space. Each virtualization environment represents a discrete virtual machine (VM), and, as such, is often referred to as a guest computer system. Applications executed by the guest computer systems and their respective included guest operating systems are presented with the appearance and, in select circumstances, the fact of directly executing on the hardware platform. While vendors provide operating system drivers for the different, assembled components of the hardware platform, these drivers typically do not incorporate specific support for, and are not capable of handling, the complications arising from potentially concurrent use by applications executing in multiple, independent virtualization environments. Therefore, the virtualization kernel is responsible for and generally implements the controls for coordinating access to the shared resources of the underlying hardware platform.
TCP/IP offload engines (TOEs) have been developed to improve the network access performance of computer systems in general. As the supported Ethernet network transmission speeds have increased to 1 Gbps and beyond, execution of the TCP/IP stack purely as a software component can impose a significant burden on the main central processing unit and restrict the actual network data throughput obtainable. TOEs typically implement a hardware TCP/IP protocol stack in combination with a hardware NIC as a platform pluggable hardware adapter. Recent generations of TOEs are nominally capable of supporting session establishment and a significant degree of error-handling services independent of the main central processing unit. Characteristically, however, TOE implementations must rely on a standard software TCP/IP stack, as implemented in a conventional operating system, as a fall-back to handle operating conditions—specifically complex protocol and error conditions—that are otherwise beyond the nominal capabilities of the particular TOE hardware implementation.
A variety of TOE-to-software TCP/IP stack interfaces are known to exist. In most cases, TOE vendors provide proprietary drivers and operating system service modules that will enable a specific TOE adapter to be utilized by a conventional, network-capable operating system, as typified by the major Linux® and Microsoft® operating system variants. Additionally, Microsoft has proposed a defined API, code-named Chimney, to support and define the fall-back coupling between a TOE and operating system kernel-based software TCP/IP stack. See, Scalable Networking: Network Protocol Offload—Introducing TCP Chimney, www.microsoft.com/whdc/device/network/TCP_Chimney. In all, the TOE drivers and service modules enable common network connections and data flows to be conducted through the TOE between an offload target, typically the TOE embedded NIC, and a transport driver, socket, or equivalent layer interface. Where connection setup and data transport proceed without exception, utilization of the main central processing unit is minimal. Whenever a TOE exception occurs that the TOE hardware cannot handle, a protocol object representing the state of the connection and any in-transit data is transferred from the TOE hardware to a corresponding layer level within the associated software TCP/IP stack. This effectively transfers the exception condition to the full software stack for handling and recovery.
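The fall-back transfer described above can be illustrated with a minimal sketch. The following Python model is illustrative only and does not represent Chimney or any vendor driver interface; the class names (ProtocolObject, SoftwareStack, ToeAdapter), the event kinds, and the tracked sequence-number fields are all hypothetical assumptions chosen to show the handoff of connection state and in-transit data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProtocolObject:
    """Hypothetical snapshot of one offloaded connection's state."""
    conn_id: int
    snd_nxt: int  # next sequence number to send (assumed field)
    rcv_nxt: int  # next sequence number expected (assumed field)
    in_transit: List[bytes] = field(default_factory=list)


class SoftwareStack:
    """Stand-in for the full software TCP/IP stack used as the fall-back."""

    def __init__(self) -> None:
        self.connections: Dict[int, ProtocolObject] = {}

    def upload(self, obj: ProtocolObject) -> None:
        # Accept the connection state handed over by the TOE.
        self.connections[obj.conn_id] = obj


class ToeAdapter:
    """Stand-in for TOE hardware that handles only common-case events."""

    HANDLED = {"data", "ack"}  # assumed set of events the TOE can process

    def __init__(self, fallback: SoftwareStack) -> None:
        self.fallback = fallback
        self.offloaded: Dict[int, ProtocolObject] = {}

    def offload(self, obj: ProtocolObject) -> None:
        self.offloaded[obj.conn_id] = obj

    def on_event(self, conn_id: int, kind: str, payload: bytes = b"") -> str:
        """Process one event and report which side handled it."""
        if conn_id not in self.offloaded:
            return "software"  # connection already lives in the software stack
        obj = self.offloaded[conn_id]
        if kind in self.HANDLED:
            if kind == "data":
                obj.rcv_nxt += len(payload)
            return "toe"
        # Exception the TOE cannot handle: move state and in-transit data
        # to the software stack, which takes over handling and recovery.
        if payload:
            obj.in_transit.append(payload)
        del self.offloaded[conn_id]
        self.fallback.upload(obj)
        return "software"
```

In this sketch, an event outside the assumed HANDLED set removes the connection from the adapter and places its protocol object, including any pending payload, in the software stack, so that subsequent events for that connection are no longer processed by the adapter.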
Conventional TOE driver and associated service module support is difficult in the context of virtual computer systems. While the more recent TOE implementations are capable of independently handling a wide range of protocol conditions and exceptions, the TOE functions must still be closely coordinated with and backed by a full capability software-based network stack. While the individual guest computer systems typically implement a full network stack as part of the guest operating system, there are practical performance constraints that limit use of these stacks in support of TOE implementations. “TCP/IP Offloading for Virtual Machines,” U.S. patent application Ser. No. 10/741,244, which is assigned to the assignee of the present application and hereby expressly incorporated by reference, describes an effective approach to supporting TOE adapters in a virtual computer system. There, each TOE implementation provided as part of the hardware platform is supported by a virtualization kernel-based network stack. In turn, each guest computer system implements a guest network stack bypass that enables direct communications with an assigned TOE implementation and virtualization kernel stack. A common socket connection space is defined for the guest stack, the assigned virtualization kernel stack, and the TOE implementation to establish and ensure the path integrity of network session connections. This defined relation is effectively required by the fact that a conventional TOE implementation cannot multiplex between the separate socket spaces that would need to be presented to different guest computer systems.
Although fully functional, the system described in “TCP/IP Offloading for Virtual Machines” may not make optimal use of the TOE adapters provided as part of the hardware platform, particularly under dynamically changing operating conditions. Some guest computer systems may require only a fraction of the bandwidth provided by an assigned TOE implementation while others would be best served in a virtualization environment that supports dynamic aggregation of multiple TOE implementations. Consequently, there is a need for a TOE virtualization system and methods of integrating one or more TOE adapters into a virtual computer system that enable concurrent use of TOE implementations by multiple guest computer systems and, further, adaptability to dynamically changing operating conditions.