Cloud technology infrastructures are positioned to fundamentally change the economics of computing by offering several advantages over traditional enterprise technology infrastructures. First, technology infrastructure clouds provide practically unlimited infrastructure capacity (e.g., computing servers, storage) on demand. This capability is especially valuable for hosting high-performance web applications. Because certain web applications can have a dramatic difference between their peak loads and their normal loads, a traditional infrastructure may be ill-suited to support them. Traditional enterprise infrastructures would either grossly over-provision to account for the potential peak (thereby wasting valuable capital during off-peak hours), or, alternatively, provision according to “normal” loads and be unable to handle peak loads when they materialize. Instead of grossly over- or under-provisioning upfront due to uncertain or unpredictable demand, users can elastically provision infrastructure resources from the cloud provider's pool only when needed. Second, the pay-per-use model allows users to pay for actual consumption instead of for estimated peak capacity. Third, a cloud data center infrastructure is much larger than most enterprise data centers. The economy of scale, both in terms of hardware procurement and infrastructure management and maintenance, helps to drive down the cost of the technology infrastructure further.
In a typical arrangement, an application owner may choose to host or operate a portion or all of an application in a web server farm in a cloud data center. Although a cloud data center can offer strong value propositions, it is very different from a traditional enterprise infrastructure, and migrating an application from a traditional enterprise infrastructure to a cloud data center is not trivial. Many performance optimization techniques rely on the choice and control of specific infrastructure components. For example, to scale a web application, application owners typically either ask for a hardware load balancer or ask for the ability to assign the same IP address to all web servers to achieve load balancing. Unfortunately, neither option is available through popular cloud computing vendors. To take full advantage of a cloud data center, application owners typically must understand the capabilities and limitations of individual cloud components, and may be required to re-design and re-architect their application to be cloud-friendly.
In traditional enterprises, application owners can choose an optimal infrastructure for their applications amongst various options from various technology hardware and software vendors. In contrast, cloud data centers are owned and maintained by the cloud providers. Due to the typical commodity business model, cloud computing providers generally offer only a limited set of infrastructure components. Commonly, these components are limited to: (virtualized) web servers, dedicated web application engines, and data storage/static hosts. However, each of these cloud components has significant limitations. For example, a typical virtual machine offered through a cloud infrastructure may have only limited types of virtual servers available. Furthermore, these servers will generally operate according to specifications that cannot be customized by or for the applications executing on the servers. In addition, application owners have little or no control of the underlying infrastructure. Moreover, for security reasons, many cloud infrastructure vendors disable several networking layer features conventionally available through enterprise technology infrastructures. These features include Address Resolution Protocol (“ARP”), “promiscuous mode,” “IP spoofing,” and IP multicast. Application owners have no ability to change these infrastructure decisions in a cloud infrastructure, and thus are incapable of benefiting from these features.
Another cloud component commonly offered by a cloud provider and used for hosting web presences is a storage host. Although highly scalable, typical cloud-offered storage hosts also have a few limitations. Specifically, these storage host components are able to host only static content. Often, in order to use a storage host as a web hosting platform, a cloud user can only access the non-SSL end point of the platform, thus compromising the security of the interaction. Typical dedicated application engines offered by popular cloud technology infrastructure providers also suffer from significant flaws. For example, one popular application engine is limited to supporting only a few programming languages. In addition, these application engines may have performance limits. For example, incoming and outgoing requests may be limited in size to 10 MB per request. Naturally, these limitations can negatively affect a user experience by prohibiting a significant portion of a user's desired service.
As discussed above, the commonly used cloud components have significant limitations. In particular, when used alone, none of the components is able to host a high-traffic web presence (for example, a web presence exceeding 800 Mbps of aggregate in and out traffic). One potential approach to the need for greater scaling is to use a traditional Domain Name System (DNS) load balancing technique to scale beyond these limitations (e.g., above 800 Mbps). In conventional web address navigation, when a user browses to a domain, the browser first asks its local DNS server for the IP address corresponding to the domain. Once the address is received, the browser contacts the IP address directly. In cases where the local DNS server does not have the IP address information for the requested domain, the local DNS server will contact other DNS servers that may have the information. Eventually, the request will percolate to the original DNS server that the web server farm (corresponding to the domain) directly manages. According to one conventional technique, the original DNS server can hand out different IP addresses (all directing to the same web application) to different requesting DNS servers so that the load can be distributed among the servers sitting at each IP address.
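The conventional technique described above can be sketched as a toy model. This is only an illustration under stated assumptions: the class name `AuthoritativeDNS`, the domain `example.com`, and the documentation-range IP addresses are hypothetical, and a simple rotation stands in for whatever policy the original DNS server actually applies.

```python
from itertools import cycle


class AuthoritativeDNS:
    """Toy model of the original DNS server managed by the web server farm."""

    def __init__(self, records):
        # records maps a domain to the IP addresses that all serve
        # the same web application (hypothetical data).
        self._rotations = {domain: cycle(ips) for domain, ips in records.items()}

    def resolve(self, domain):
        # Each requesting local DNS server receives the next address in turn.
        return next(self._rotations[domain])


dns = AuthoritativeDNS({"example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})

# Four local DNS servers ask in sequence; the fourth wraps around to the first IP.
answers = [dns.resolve("example.com") for _ in range(4)]
print(answers)
```

Each local DNS server receives a different address, so the browsers behind different local DNS servers are spread across the web servers at those addresses.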
Unfortunately, DNS load balancing also has drawbacks of its own: a lack of load balancing granularity and adaptability. First, DNS load balancing typically does a poor job of actually balancing the load amongst different web servers. For performance reasons, a local DNS server caches the IP address information. Thus, all browsers contacting the same DNS server would get the same IP address. Due to disparities in regional population and usage demographics, certain DNS servers could be responsible for a significantly larger number of hosts (browsers) relative to other DNS servers, and thus loads may not be effectively distributed between all DNS servers.
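The granularity problem can be illustrated with a small simulation. The resolver names and client counts below are invented for illustration only; the point is that each local DNS server caches a single address, so balancing happens per DNS server and not per browser.

```python
from collections import Counter
from itertools import cycle

web_servers = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]

# Hypothetical number of browsers behind each local DNS server.
clients_per_resolver = {"resolver-a": 9000, "resolver-b": 700, "resolver-c": 300}

# The original DNS server hands one address to each local DNS server,
# which then caches it and returns it to every browser it serves.
rotation = cycle(web_servers)
cached_answer = {resolver: next(rotation) for resolver in clients_per_resolver}

# Tally how many browsers end up on each web server.
load = Counter()
for resolver, clients in clients_per_resolver.items():
    load[cached_answer[resolver]] += clients

print(dict(load))
```

Although every web server was handed out exactly once, the server cached by the most populous local DNS server receives 90% of the browsers in this made-up scenario.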
Second, traditional DNS load balancing techniques also lack adaptability. Since local DNS servers cache IP address information for a set period of time (e.g., days), a local DNS server will guide requests from browsers to the same web server until the cache expires. When traffic fluctuates on a time scale shorter than days, tweaking DNS server settings will have little effect on load balancing. Traditionally, this drawback was not so critical because the number of back-end web servers and their IP addresses were static anyway. However, where the number of requisitioned components is dynamically scalable according to consumption, the scalability of a cloud-based web server farm may be seriously impaired. A cloud-based web server farm elastically changes the number of web servers to track traffic volume at minute granularity. DNS caching that persists over days (as in traditional DNS schemes) dramatically reduces this elasticity. For example, even if a web server farm in a cloud data center increases the number of web servers serving at peak load, the IP addresses of the new web servers will not be propagated to DNS servers that already have cached IP addresses. Therefore, requests from hosts relying on those DNS servers may continue to be sent to long-requisitioned web servers, possibly overloading them, while newly requisitioned web servers remain idle.
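The effect of caching on elasticity can be sketched as follows. This is a minimal model, not an actual DNS implementation: `CachingResolver`, the one-day TTL, the addresses, and the policy of always advertising the newest web server are all assumptions chosen to make the stale-cache behavior visible.

```python
class CachingResolver:
    """Toy local DNS server that caches each answer for a fixed TTL."""

    def __init__(self, authoritative, ttl_seconds):
        self.authoritative = authoritative  # callable: domain -> IP address
        self.ttl_seconds = ttl_seconds
        self.cache = {}  # domain -> (ip, expiry time in seconds)

    def resolve(self, domain, now):
        entry = self.cache.get(domain)
        if entry is not None and now < entry[1]:
            return entry[0]  # cached answer is still valid
        ip = self.authoritative(domain)
        self.cache[domain] = (ip, now + self.ttl_seconds)
        return ip


farm = ["192.0.2.1"]  # web servers currently requisitioned


def authoritative(domain):
    # Hypothetical policy: always advertise the newest web server.
    return farm[-1]


DAY = 24 * 60 * 60
resolver = CachingResolver(authoritative, ttl_seconds=DAY)

first = resolver.resolve("example.com", now=0)
farm.append("192.0.2.2")  # the farm scales out shortly afterwards
during_peak = resolver.resolve("example.com", now=120)
after_expiry = resolver.resolve("example.com", now=DAY)
print(first, during_peak, after_expiry)
```

Two minutes after scale-out the local DNS server still returns the old address; the new web server becomes visible to its browsers only once the cached entry expires a day later.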
Other load balancing techniques include both dedicated hardware load balancers and load balancing software. Dedicated hardware load balancers intercept data packets en route to a web server and distribute the packets among web servers according to some packet balancing scheme (typically round-robin), ideally effecting a corresponding balance of load. Unfortunately, hardware load balancers are not typically available as cloud computing components. Load balancing software implementations are typically executed from a non-dedicated platform (e.g., a virtual server) and perform similarly to hardware load balancers. However, software load balancing implementations do not process traffic themselves; they only forward data packets to the appropriate web servers. For each incoming (and outgoing) packet, the load balancer must first receive the data from a host (browser) and subsequently forward it to a web server. This interaction doubles the network bandwidth consumed by each individual packet transfer (thereby halving the network throughput). Moreover, the scalability of a software load balancing solution may also be limited by a cloud vendor's security restrictions, since traditional techniques used to scale software load balancers require features that may be prohibited by a cloud vendor on a cloud computing infrastructure component.
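The bandwidth doubling can be made concrete with a short sketch. It assumes, as the 800 Mbps figure cited earlier suggests, that a virtual server's bandwidth limit applies to aggregate in and out traffic; the class and function names are hypothetical and the balancer only counts bytes instead of moving real packets.

```python
from itertools import cycle


class ForwardingBalancer:
    """Toy software load balancer that relays packets round-robin."""

    def __init__(self, backends):
        self._next_backend = cycle(backends)
        self.bytes_on_wire = 0  # traffic crossing the balancer's own interface

    def forward(self, packet):
        self.bytes_on_wire += len(packet)  # received from the browser
        backend = next(self._next_backend)
        self.bytes_on_wire += len(packet)  # sent on to the chosen web server
        return backend


def effective_throughput_mbps(aggregate_limit_mbps):
    # Every payload byte crosses the balancer twice, so only half of the
    # aggregate in-and-out limit is available as useful throughput.
    return aggregate_limit_mbps / 2


balancer = ForwardingBalancer(["web-1", "web-2"])
targets = [balancer.forward(b"x" * 100) for _ in range(3)]

print(targets)                         # backends chosen in rotation
print(balancer.bytes_on_wire)          # 300 payload bytes cost 600 bytes of traffic
print(effective_throughput_mbps(800))  # 400.0
```

Under that assumption, a balancer running on a component capped at 800 Mbps of aggregate traffic could relay at most about 400 Mbps of application traffic.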
Migrating existing enterprise applications to a cloud is not as simple as measuring the capabilities of cloud components and identifying their limitations. First, the performance of individual web servers offered through cloud vendors can be an entire order of magnitude lower than that of available state-of-the-art web server hardware solutions. Second, the traditional load balancing-based approach may not work in public cloud environments because the network bandwidth available to the software load balancer is limited. Furthermore, load balancing is not typically scalable and, even when implemented, often incurs many severe limitations, such as very limited programming language support and limits on the size of transferred files. Finally, static content hosts are superior for scalability but, regrettably, are viable options only for static content.