“Cloud computing” generally refers to computing that occurs in environments with dynamically scalable and often virtualized resources, which are typically provided as services over a network to the client devices that consume them. For example, cloud computing environments often employ virtualization as a preferred paradigm for hosting workloads on any appropriate hardware. The cloud computing model has become increasingly viable for many enterprises for various reasons, including that the cloud infrastructure may permit information technology resources to be treated as utilities that can be automatically provisioned on demand, while also limiting the cost of services to actual resource consumption. Moreover, consumers of resources provided in cloud computing environments can leverage technologies that might otherwise be unavailable. Thus, as cloud computing and cloud storage become more pervasive, many enterprises will find that moving data center operations to cloud providers can yield economies of scale, among other advantages.
However, while much of the information technology industry moves toward cloud computing and virtualization environments, existing systems tend to fall short in adequately addressing concerns relating to managing or controlling workloads and storage in such environments. For example, cloud computing environments are generally designed to support generic business practices, meaning that individuals and organizations typically lack the ability to change many aspects of the platform. Moreover, concerns regarding performance, latency, reliability, and security present significant challenges, as outages and downtime can lead to lost business opportunities and decreased productivity, while the generic platform may present governance, risk, and compliance concerns. In other words, once organizations deploy workloads beyond the boundaries of their data centers, lack of visibility into the computing environment may result in significant management problems.
While these types of problems tend to be pervasive in cloud computing and virtualization environments due to the lack of transparency, existing systems for managing and controlling workloads deployed physically or locally in in-house data centers tend to suffer from many similar problems. In particular, information technology has traditionally been managed in silos of automation, which are often disconnected from one another. For example, help desk systems typically involve a customer submitting a trouble ticket to a ticketing system, with a human operator then using various tools to address the problem and close the ticket, while the monitoring systems that watch the infrastructure to remediate problems may remain isolated from the interaction between the customer and the help desk despite that interaction being relevant to the monitoring system's function.
As such, because existing systems for managing infrastructure workloads operate within distinct silos that typically do not communicate with one another, context that has been exchanged between two entities is often lost when the workload moves to the next step in the chain. When issues surrounding workload management are considered in the context of business objectives, wherein information technology processes and business issues collectively drive transitions from one silo to another, modern business tends to move at a speed that outpaces information technology's ability to serve business needs. Although emerging trends in virtualization, cloud computing, appliances, and other models for delivering services have the potential to allow information technology to catch up with the speed of business, many businesses lack the knowledge needed to intelligently implement these new technologies.
For example, emerging service delivery models often lead to deployed services being composed and aggregated in new and unexpected ways. In particular, rather than designing and modeling systems from the ground up, new functionality is often generated on-the-fly from complex building blocks that include various services and applications that have traditionally been isolated and stand-alone. As such, even though many emerging service delivery models provide administrators and users with a wider range of information technology choices than has ever before been available, the diversity in technology often compounds business problems and increases the demand for an agile infrastructure. Thus, despite the advantages and promise that new service delivery models can offer businesses, existing systems tend to fall short in providing information technology tools that can inform businesses on how to implement an information technology infrastructure in a manner that best leverages available technology to suit the particular needs of a business.
Furthermore, although emerging service delivery models offer various ways to provide services that can be hosted in remote data centers, including virtualized or cloud computing environments, managing such services with existing systems tends to be a burdensome and cumbersome process. For example, existing systems typically host services in cloud computing environments within virtual machines that run over abstracted physical environments, wherein the virtual machines typically include a root file system (or “machine image”) provided by the entity that deploys the service, in addition to a kernel and an initial ramdisk chosen from the options that the provider of the cloud computing environment (e.g., Amazon EC2) makes available. Thus, because the chosen kernel typically needs certain modules to boot successfully, the root file system in the virtual machine must contain the modules that support that particular kernel. Consequently, upgrading the kernel contained in a virtual machine hosted in an existing cloud computing environment typically requires recreating or rebuilding the root file system to include the specific modules needed to support the upgraded kernel.
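To make the coupling between kernel and root file system concrete, the following minimal sketch (with hypothetical image paths and version strings) checks whether a mounted machine image contains the module tree a given kernel expects; this relies only on the Linux convention that a kernel loads its modules from /lib/modules/&lt;version&gt;, and is an illustration rather than any provider's actual validation tool.

```python
#!/usr/bin/env python3
"""Illustrative check: does a machine image's root file system carry the
module tree that a given kernel needs?  The image mount point and kernel
version below are hypothetical placeholders."""

import os
import sys


def has_matching_modules(image_root: str, kernel_version: str) -> bool:
    """A Linux kernel looks for its modules under /lib/modules/<version>;
    if that tree is absent from the root file system, the paired kernel
    may fail to boot or to load its drivers."""
    module_dir = os.path.join(image_root, "lib", "modules", kernel_version)
    return os.path.isdir(module_dir)


if __name__ == "__main__":
    image_root = sys.argv[1] if len(sys.argv) > 1 else "/mnt/machine-image"
    kernel_version = sys.argv[2] if len(sys.argv) > 2 else "2.6.32-provider"
    if has_matching_modules(image_root, kernel_version):
        print(f"{image_root} supports kernel {kernel_version}")
    else:
        print(f"{image_root} lacks /lib/modules/{kernel_version}; "
              "the root file system would need to be rebuilt")
```

A failed check of this kind is precisely why a kernel upgrade in such environments forces the root file system to be recreated rather than merely swapping the kernel.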
However, in many instances, recreating the root file system to include the specific modules needed to support the upgraded kernel can cause unnecessary downtime, lost productivity, or other negative consequences for the services that the virtual machines provide. In particular, because updates to virtual machines deployed in existing cloud computing environments are typically applied either during the boot process (i.e., subsequent to loading the kernel) or in an entirely new build of the master machine image, the virtual machine must either reload the kernel to incorporate the updates applied during the boot process or be rebooted entirely with the new build of the master machine image. Although certain techniques have been proposed to load a new kernel in a running virtual machine instance, these techniques tend to fall short in adequately addressing the security and stability concerns that switching the kernel in a running instance raises. For example, the kernel execution (or kexec) mechanism in the Linux kernel allows a new kernel to be loaded over a currently running kernel. However, kexec requires loading the new kernel on a booted virtual machine and then rebooting the virtual machine into the new kernel, which can leave a window of vulnerability prior to the reboot, especially where the original kernel has security issues. Furthermore, running kexec typically results in the new kernel overwriting memory belonging to the current kernel even though that kernel may still be running, which can cause substantial stability concerns prior to the reboot.
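For illustration, the sketch below walks through the two-step kexec sequence the paragraph describes, assuming the standard kexec-tools utility is installed and the script runs as root; the kernel and initrd paths are hypothetical placeholders. The gap between the load step and the exec step is exactly the window of vulnerability discussed above: the old kernel keeps running, with replacement code already staged in memory, until the jump actually occurs.

```python
#!/usr/bin/env python3
"""Minimal sketch of the kexec load-then-reboot sequence, using the
kexec-tools command-line utility.  Paths and the kernel command line are
hypothetical placeholders, not values from any particular provider."""

import subprocess

NEW_KERNEL = "/boot/vmlinuz-upgraded"     # hypothetical upgraded kernel
NEW_INITRD = "/boot/initrd-upgraded.img"  # hypothetical matching ramdisk
CMDLINE = "root=/dev/xvda1 ro console=ttyS0"


def kexec_into(kernel: str, initrd: str, cmdline: str) -> None:
    # Stage the new kernel in memory alongside the currently running one.
    subprocess.run(
        ["kexec", "-l", kernel, f"--initrd={initrd}", f"--append={cmdline}"],
        check=True,
    )
    # Jump into the staged kernel, bypassing firmware and bootloader.
    # Everything between the load above and this exec still runs on the
    # old, possibly vulnerable kernel, and kexec -e does not perform a
    # clean shutdown (a production system would typically prefer a clean
    # reboot path such as systemctl kexec).
    subprocess.run(["kexec", "-e"], check=True)


if __name__ == "__main__":
    kexec_into(NEW_KERNEL, NEW_INITRD, CMDLINE)
```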
Accordingly, in view of the foregoing, existing systems tend to lack mechanisms that can suitably upgrade, switch, or otherwise modify the kernels of virtual machines running in cloud computing environments without compromising security, stability, or other requirements.