Over the last decade, server virtualization has radically transformed application and data center architectures, growing from a niche market to a multi-billion dollar industry. By using the abstraction of a virtual machine, virtualization technology decouples the tight binding between the operating system and the physical hardware it runs on. This seemingly simple abstraction provides several advantages in data center design. Fundamentally, virtualization has enabled server consolidation, isolation of application stacks, creation of simplified test infrastructures, and containers that ease the maintenance of legacy software without requiring legacy hardware. More fundamentally, the decoupling of the operating system from physical hardware creates the ability to transparently migrate an operating system with a running application stack between systems in response to load or failure. A number of open and closed source virtualization solutions are available in the market today, e.g., from Microsoft (Hyper-V), VMware (ESX), Oracle (VirtualBox), Citrix (Xen) and Red Hat (KVM). Virtualization is also the enabling technology behind cloud computing, and public and private cloud solutions based on virtualization technologies are available, e.g., from Amazon (EC2), Microsoft (Azure), Rackspace (OpenStack) and VMware (vCloud).
While cloud computing in the largest infrastructures has now scaled to hundreds and thousands of servers, its true potential is stymied by the inflexibility of current data center networks. Although virtualized servers can be live-migrated without interrupting running applications and network connections, the network configuration is too rigid to be modified adaptively, because multiple routers, switches and end hosts would have to be reconfigured on the fly. This network rigidity is an artifact of fundamental limitations in the TCP/IP protocols that power much of today's networking. More fundamentally, while servers have been virtualized, networking technology has not, resulting in a tight coupling of a virtual machine to the physical network it runs on.
Accordingly, there is a need to eliminate this coupling, thereby enabling network endpoints to move (migrate) independently of the physical networks they run on.
A major limitation of current networks comes from the original design of IP (Internet Protocol) addressing. An IP address is both a unique end point identifier and a location identifier. As an end point identifier, the IP address uniquely identifies one end of a communication pipe. By encoding a network address within the IP address (IP addresses fundamentally have a network component and a host component), the IP address also determines the location of the end point.
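The dual role of an IP address can be illustrated with a minimal Python sketch that splits an IPv4 address into its network and host components. The function name split_address and the example address are hypothetical and chosen only for illustration; the sketch assumes the address is given in CIDR notation.

```python
import ipaddress

def split_address(cidr: str):
    """Split an IPv4 address in CIDR form into (network part, host part).

    The network part is what routers use to locate the end point;
    the host part identifies the end point within that network.
    """
    iface = ipaddress.ip_interface(cidr)
    network = iface.network.network_address
    # Keep only the bits that are not covered by the network mask.
    host_bits = int(iface.ip) & int(iface.network.hostmask)
    return str(network), host_bits

# Hypothetical example address, used only for illustration.
network, host = split_address("192.168.10.37/24")
print(network, host)
```

Because both parts live in the same address, changing the location (the network part) necessarily changes the identifier that peers use to reach the end point.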
This issue becomes evident, for example, when a machine (virtual or physical) needs additional computing resources to handle high application load, but there is no spare capacity on its physical host. To get greater capacity, the machine needs to be migrated to another physical system. While migrating, however, the machine needs to retain its IP address so that currently active network connections do not fail and other hosts that are communicating with the machine can still reach it. Since the IP address also defines how to get (i.e., route) to the host, the machine can migrate only within the same subnetwork and Layer 2 (L2) domain. The crux of the problem is that the IP address defines how to get to the machine, meaning that the machine cannot migrate across subnetworks because there would be no mechanism to route to the new location. This situation can be illustrated with an analogy using names. For example, if a person's name also included his/her zip code (e.g., Jim-22066), and the postal service used the zip code to route mail, then that person could not move across zip codes and still get mail delivery. This is the state of IP addressing today.
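The migration constraint described above can be expressed as a simple membership test. The following Python sketch is illustrative only: the function name can_keep_address and the addresses are hypothetical, and the check deliberately ignores the Layer 2 domain requirement, looking only at whether the destination subnetwork still contains the machine's address.

```python
import ipaddress

def can_keep_address(vm_ip: str, destination_subnet: str) -> bool:
    """Return True if a machine can keep vm_ip after moving to destination_subnet.

    With conventional IP routing, the address is reachable only inside the
    subnetwork whose prefix it carries, so this is a containment check.
    """
    return ipaddress.ip_address(vm_ip) in ipaddress.ip_network(destination_subnet)

# Hypothetical addresses, used only for illustration.
print(can_keep_address("10.0.1.25", "10.0.1.0/24"))  # same subnetwork
print(can_keep_address("10.0.1.25", "10.0.2.0/24"))  # adjacent subnetwork
```

The second call corresponds to Jim trying to keep the name Jim-22066 after moving to a different zip code.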
Thus, migration capability is currently limited by the physical network. If there is no spare capacity available in an entire subnetwork, an overloaded machine cannot be scaled up even if there is spare capacity available in adjacent subnetworks. An alternative strategy is to design very large subnetworks, such that this issue is avoided in the first place; this is akin to creating very large zip codes in the Jim-22066 analogy above. However, large subnetworks create two issues. First, the most prevalent Layer 2 network, Ethernet, uses spanning trees for routing, which does not scale well. Second, a Layer 2 subnetwork is a single broadcast domain, so larger Layer 2 networks create excessive broadcast traffic. Current approaches that extend a Layer 2 network by using virtual LANs (VLANs) to limit broadcast traffic run into address bit limitations in 802.1Q frames. The current IEEE 802.1Q frame allocates 12 bits of the VLAN tag to the VLAN identifier, limiting it to 4096 unique VLANs. This is a limitation for large multi-tenant clouds like Amazon, for example, which can then have at most 4096 unique subnets. VLANs also incur higher latency and management costs on the switches.
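The 12-bit limit can be made concrete with a short Python sketch of the 16-bit Tag Control Information field of an 802.1Q tag, which carries a 3-bit priority, a 1-bit drop-eligible indicator and the 12-bit VLAN identifier. The helper names pack_tci and unpack_vid are hypothetical; this is a sketch of the bit layout, not a full frame parser.

```python
import struct

VID_BITS = 12
MAX_VLAN_IDS = 1 << VID_BITS  # 2**12 = 4096 possible identifier values

def pack_tci(pcp: int, dei: int, vid: int) -> bytes:
    """Pack priority (3 bits), drop-eligible flag (1 bit) and VLAN ID (12 bits)."""
    if not 0 <= pcp < 8 or dei not in (0, 1):
        raise ValueError("priority must fit in 3 bits and dei in 1 bit")
    if not 0 <= vid < MAX_VLAN_IDS:
        raise ValueError("VLAN ID does not fit in 12 bits")
    return struct.pack("!H", (pcp << 13) | (dei << 12) | vid)

def unpack_vid(tci: bytes) -> int:
    """Extract the 12-bit VLAN ID from a packed 16-bit field."""
    (value,) = struct.unpack("!H", tci)
    return value & 0x0FFF

print(MAX_VLAN_IDS)
print(unpack_vid(pack_tci(0, 0, 4095)))
```

Any identifier of 4096 or above is rejected by pack_tci because it cannot be represented in the field, which is the address bit limitation referred to above.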
There are multiple competing standards proposals, namely VXLAN, NVGRE and STT, that attempt to solve the network limitation problems discussed above. All of these proposals, however, rely on the same basic idea of encapsulating the Layer 2 frame in an IP (Layer 3 or L3) packet. VXLAN uses UDP, STT uses TCP and NVGRE uses GRE tunnels for the encapsulation. Extending the Jim-22066 analogy from above, if Jim moves to zip code 20190, leading to his new address of Jim-20190, all of these approaches will still refer to Jim as Jim-22066, take the mail sent to Jim-22066 and place it in another envelope (encapsulation) addressed to Jim-20190. At the destination, Jim peels off the outer envelope, finds the inner envelope addressed to Jim-22066 and uses this envelope to access the mail. Note that in all these approaches, Jim still remains Jim-22066. It is the outer envelope that enables Jim to move, but the fundamental problem of Jim being associated with his zip code still remains.
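The envelope-in-envelope idea can be sketched in Python using the 8-byte VXLAN header as an example: a flags byte with the "valid identifier" bit set, 24 reserved bits, a 24-bit virtual network identifier and a final reserved byte. The function names encapsulate and decapsulate are hypothetical, and the sketch omits the outer Ethernet, IP and UDP headers that a real implementation would also add.

```python
import struct

VNI_FLAG = 0x08  # flag bit indicating that the network identifier is valid

def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend a VXLAN-style header (the outer envelope) to a Layer 2 frame."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("network identifier does not fit in 24 bits")
    header = struct.pack("!II", VNI_FLAG << 24, vni << 8)
    return header + inner_frame

def decapsulate(packet: bytes):
    """Strip the header and return (network identifier, original frame)."""
    flags_word, vni_word = struct.unpack("!II", packet[:8])
    if not (flags_word >> 24) & VNI_FLAG:
        raise ValueError("network identifier flag is not set")
    return vni_word >> 8, packet[8:]

# The inner frame is untouched: it is still addressed to "Jim-22066".
packet = encapsulate(b"frame for Jim-22066", vni=5000)
print(len(packet) - len(b"frame for Jim-22066"))  # header bytes added
print(decapsulate(packet))
```

The inner frame comes out of decapsulate byte for byte as it went in, which is why the end point's original address, and its coupling to a location, is preserved rather than removed.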
Furthermore, all three of the above standards approaches suffer from the following major limitations:
1. Encapsulation Overhead: Software encapsulation on the hypervisor is expensive and consumes CPU cores that could otherwise be used for running virtual machines. If hardware encapsulation is used, existing switches in the data center will need to be replaced.
2. Hardware upgrades: When the virtual machines communicate with hosts outside the data center, which do not use these protocols, a hardware gateway is needed to transparently introduce and remove encapsulation.
3. Middleware boxes: Since encapsulation changes the wire packet format, existing network middleware boxes like load balancers, intrusion detection systems and firewalls do not work. Since many of these standards proposals are in their infancy, there are no hardware solutions for many of these applications.