The term “virtualization” has taken on many meanings in the domain of computers and operating systems as well as in storage and networking domains. Hardware (e.g., CPUs and peripherals) can be virtualized so as to “hide” the details of how to interface with the hardware from a user by adding a layer of software (e.g., an operating system). Likewise, an operating system can be virtualized so as to “hide” the details of how to interface with the operating system by adding a layer of software (e.g., a hypervisor). Users can write code to perform some functions without a strong reliance on the underlying infrastructure such as a particular operating system and/or a particular vendor and/or a particular configuration of hardware.
Further, details pertaining to interfacing with underlying storage facilities and networking configurations can be abstracted by providing a specially configured “control” virtual machine (see below), and users can write code that runs in another “user” virtual machine. Such abstractions are a boon to code developers and system administrators alike. Very large virtualized systems comprising many hundreds or thousands of nodes and many hundreds or thousands (or millions) of user virtual machines can be configured and managed by an operator who interfaces with a configuration panel.
In a virtualized system, it is sometimes convenient for a developer to deploy some set of functions using units called “containers”. A container can be configured to implement a particular function without reliance on a fully-configured hardware and/or software platform. For example, a container might be defined to perform some simple operation over some inputs and produce an output. In such a case, the container might be very lightweight, requiring only a way to receive the inputs, a way to perform the simple operation, and a way to provide the output. The “weight” of a hypervisor and/or an operating system is unnecessary in this case. In some cases a container might be defined to provide a somewhat more complex service, in which case the developer of the container might choose to bring some small portion of an operating system or hypervisor into the container. In such a case, the resulting container can still be lightweight vis-à-vis the alternative of bringing in the entire operating system or hypervisor. In still other situations, a group of containers might be defined and developed in such a manner that the group of containers performs as an “application”. This paradigm can be extended to include many hundreds or thousands (or millions) of containers.
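By way of illustration only, the lightweight unit described above (a way to receive inputs, a way to perform a simple operation, and a way to provide an output) might be sketched as follows. The names `simple_operation` and `run_unit`, the JSON message shape, and the choice of summation as the operation are hypothetical and are not drawn from any particular container technology.

```python
import io
import json


def simple_operation(values):
    """Hypothetical 'simple operation': sum the supplied numbers."""
    return sum(values)


def run_unit(in_stream, out_stream):
    """Receive inputs, perform the operation, and provide the output.

    The unit depends only on two streams; it has no knowledge of the
    hypervisor, operating system, or hardware beneath it.
    """
    payload = json.load(in_stream)            # a way to receive the inputs
    result = simple_operation(payload["values"])  # the simple operation
    json.dump({"result": result}, out_stream)     # a way to provide the output


# In-memory streams stand in for whatever transport a real deployment uses.
inputs = io.StringIO('{"values": [1, 2, 3]}')
outputs = io.StringIO()
run_unit(inputs, outputs)
print(outputs.getvalue())
```

In this sketch the streams could be replaced by standard input and output, a socket, or a message queue without changing the unit itself, which is the sense in which such a unit can remain lightweight.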
Virtualization Using Virtual Machines
A “virtual machine” or a “VM” refers to a specific software-based implementation of a machine in a virtualization environment, in which the hardware resources of a real computer (e.g., CPU, memory, etc.) are virtualized or transformed into the underlying support for a fully functional virtual machine. Such a virtual machine can run its own operating system and applications on the underlying physical resources just like a real computer. Virtualization works by inserting a thin layer of software directly on the computer hardware or on a host operating system. This layer of software contains a virtual machine monitor or “hypervisor” that allocates hardware resources dynamically and transparently. Multiple operating systems can thus run concurrently on a single physical computer and share hardware resources with each other.
Virtualization Using Container-Based Virtualization
Recently, container-based virtualization technologies have grown in popularity. In comparison to virtual machines, which mimic independent physical machines and run on top of a host's operating system, containers virtualize the applications that can run in user-space directly on an operating system's kernel. Applications that run from within a container, such as a web server or database, do not require an emulation layer or a hypervisor layer to interface with the physical machine. Instead, “containerized” applications can function using an operating system's normal system calls. In this way, containers provide operating system-level virtualization that is generally faster (e.g., faster to transport, faster to “boot” or load) than virtual machines because the containers do not require virtualized guest OSes.
One reason for the broad adoption of virtualization technologies such as virtual machines or containers is the resource advantages provided by the virtual architectures. Without virtualization, if a physical machine is limited to a single dedicated operating system, then during periods of inactivity by the dedicated operating system the physical machine is not used to perform useful work. This is wasteful and inefficient if there are users on other physical machines that are currently waiting for computing resources. In contrast, virtualization allows multiple virtualized computers (e.g., VMs, containers) to share the underlying physical resources so that during periods of inactivity by one virtualized computer, another virtualized computer can take advantage of the resource availability to process workloads. This can produce great efficiencies for the use of physical devices, and can result in reduced redundancies and better resource cost management.
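The resource advantage described above can be illustrated with a small arithmetic sketch. The time slots, the two workloads, and the helper `utilization` are hypothetical values chosen only to make the comparison concrete; they are not measurements.

```python
def utilization(busy_slots, total_slots):
    """Fraction of time slots in which a machine performs useful work."""
    return len(busy_slots) / total_slots


TOTAL_SLOTS = 10

# Hypothetical activity: each set lists the slots in which a workload is busy.
workload_a = {0, 1, 2, 3}
workload_b = {5, 6, 7}

# Without virtualization: one dedicated physical machine per workload.
dedicated = [utilization(w, TOTAL_SLOTS) for w in (workload_a, workload_b)]

# With virtualization: both workloads share a single physical machine, so
# the machine is busy whenever either workload is busy.
shared = utilization(workload_a | workload_b, TOTAL_SLOTS)

print(dedicated)
print(shared)
```

Under these assumed numbers, two dedicated machines are each idle for most of the period, whereas a single shared machine absorbs both workloads. The sketch assumes the workloads do not contend for the same slot; overlapping demand would require scheduling that is not modeled here.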
Data centers are often architected as diskless computers (“application servers”) that communicate with a set of networked storage appliances (“storage servers”) via a network, such as a Fibre Channel or Ethernet network. A storage server exposes volumes that are mounted by the application servers for their storage needs. If the storage server is a block-based server, it exposes a set of volumes by logical unit numbers (LUNs). If, on the other hand, a storage server is file-based, it exposes a set of volumes called file systems.
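The distinction between block-based and file-based storage servers can be sketched as a pair of data structures. The class names, fields, and example volume names below are hypothetical and serve only to show that the two kinds of server identify their exposed volumes differently (by LUN versus by file system).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BlockStorageServer:
    """Hypothetical block-based server: volumes are addressed by LUN."""
    volumes_by_lun: Dict[int, str] = field(default_factory=dict)

    def exposed(self) -> List[str]:
        return [f"LUN {lun}" for lun in sorted(self.volumes_by_lun)]


@dataclass
class FileStorageServer:
    """Hypothetical file-based server: volumes are exposed as file systems."""
    file_systems: Dict[str, str] = field(default_factory=dict)

    def exposed(self) -> List[str]:
        return sorted(self.file_systems)


# Example volume names and export paths are illustrative assumptions.
block_server = BlockStorageServer({0: "db-volume", 1: "log-volume"})
file_server = FileStorageServer({"/exports/home": "home-volume"})

print(block_server.exposed())
print(file_server.exposed())
```

An application server would mount whichever identifiers the storage server exposes; the mounting step itself is outside the scope of this sketch.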
While generally more lightweight than VMs, containers that are improperly secured can expose a physical host computer running the containers to malicious access (e.g., root access). Further, container technologies currently do not provide a means for storage optimizations to occur in the primary storage path. Generally, containers are integrated directly with the operating system (OS) to work with the kernel using system calls. Optimizing storage for containers can require heavy OS customization. Compounding these issues and problems endemic to container technologies, deployment of containers in virtualized environments brings a raft of hitherto unaddressed problems.
Unfortunately, legacy techniques to integrate and control containers in a virtualized environment have fallen short. Indeed, although containers can be deployed and managed in rudimentary ways using legacy tools, such legacy tools fall far short of providing the comprehensive set of configuration, deployment, and publishing features that are demanded in hyper-converged platforms. What is needed is a way for one, or tens, or hundreds, or thousands, or millions of containers to be deployed and controlled in a virtualized environment.
What is needed is a technique or techniques to improve over legacy and/or over other considered approaches. Some of the approaches described in this background section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.