The use of virtualization as a software abstraction of the underlying hardware machine was developed by The IBM Corporation in the 1960s. See: The IBM Mainframe, history and timeline, at. Virtualization refers to the interception of an application's communication with its underlying runtime platforms such as the operating system (OS) or a Java Virtual Machine (JVM). See: The Java Virtual Machine Specification, 2nd Ed., by: Lindholm, T., Yellin, F., Addison-Wesley, Reading, Mass., 2000. Virtualization can be used to give an application the illusion that it is running in the context of its install machine, even though it is executing in the (possibly different) context of a host execution machine.
Conventional full-system virtualization techniques emulate a hardware machine on which an operating system (possibly distinct from that of the host execution machine) can be booted. Full-system virtualization incurs a significant performance penalty, and is primarily intended for testing and porting across different operating system platforms. Assuming that installed images are always platform specific (e.g., a Windows/x86 and a Linux/x86 application will each have a separate platform-specific installed image), then much of the host execution machine's operating system and hardware interfaces can be used directly without virtualization. This selective virtualization approach incurs significantly lower performance overhead than full-system virtualization and is practically indistinguishable from direct execution performance.
Full-system virtual machines can be cloned to make new virtual machines. However, the resulting virtual machine images are very large (typically tens of Gigabytes). Another way a new full-system virtual machine can be created from an old one is with a differencing disk which contains the changes that need to be made to the old machine to obtain the new one. Differencing disks are typically smaller than cloned full-system virtual machine images, but still quite large, and the virtual machine image of the old machine is also needed to run the new one.
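By way of illustration only, the differencing-disk idea described above can be sketched as a block-level overlay on a read-only base image. The class names (BaseImage, DifferencingDisk), the fixed block granularity, and the in-memory dictionary are assumptions made for this sketch and do not correspond to any particular product's disk format.

```python
class BaseImage:
    """Read-only block store standing in for the old machine's disk image."""

    def __init__(self, blocks):
        self._blocks = list(blocks)

    def read(self, index):
        return self._blocks[index]

    def __len__(self):
        return len(self._blocks)


class DifferencingDisk:
    """Holds only the blocks that differ from the base image.

    Reads of unchanged blocks fall through to the base, which is why the
    base image must remain available to run the new machine.
    """

    def __init__(self, base):
        self._base = base
        self._changed = {}  # block index -> new block contents

    def read(self, index):
        if index in self._changed:
            return self._changed[index]
        return self._base.read(index)

    def write(self, index, data):
        if not 0 <= index < len(self._base):
            raise IndexError("block index out of range")
        # The base image is never modified; the change is recorded here.
        self._changed[index] = data

    def changed_block_count(self):
        return len(self._changed)


if __name__ == "__main__":
    base = BaseImage([b"boot", b"kernel", b"apps"])
    new_machine = DifferencingDisk(base)
    new_machine.write(2, b"apps-v2")
    print(new_machine.read(0), new_machine.read(2))
    print("blocks stored in the difference:", new_machine.changed_block_count())
```

The sketch makes both properties noted above visible: the difference stores only the modified blocks, so it is smaller than a full clone, yet every read of an unmodified block still depends on the old machine's image.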
Virtual machines (VMs), particularly those that attempt to capture an entire machine's state, are increasingly being used as vehicles for deploying software, providing predictability and centralized control. The virtual environment provides isolation from the uncontrolled variability of target machines, particularly from potentially conflicting versions of prerequisite software. Skilled personnel assemble a self-contained software universe (potentially including the operating system) with all of the dependencies of an application, or suite of applications, correctly resolved. They then have confidence that this software will exhibit the same behavior on every machine, since a Virtual Machine Monitor (VMM) will be interposed between it and the real machine.
Because software deployment is a relatively new motivation for using virtual machine technology, today's VM-based software deployment efforts employ VMs that were originally designed for other purposes, such as crash protection, low-level debugging, process migration, system archival, or OS development, and are being re-purposed for software deployment.
Many users today require their own virtual machine images which are specific to their own software/computing needs. However, deployment can often be complicated, particularly in those instances in which several different applications produced by separate software organizations need to be integrated on the same machine. An example of such a scenario could be a suite such as MySQL/JBOSS/Tomcat/Apache, a Java development tool such as Eclipse, and a J2EE application that needs to be developed using Eclipse and tested on the MySQL/JBOSS/Tomcat/Apache suite. See: MySQL, 2nd Ed., by: DuBois, P., Sams Press, March 2005. See: JBoss 4.0-The Official Guide, by: The JBoss Group, Sams Press, April 2005.
A complex collection of applications may often have conflicting prerequisites. Each application may require its own version of the JVM, for example, or depend on specific patch levels of certain dependent components. VMMs can help tame such conflicts by allowing each application's dependencies to be embedded in its private VM image. Vendors deal with dependency conflicts in more or less the same way: they try to reduce dependency conflicts by embedding the application's dependencies into the application installed image, usually without the benefit of VM technology. For example, Eclipse version 2.x comes bundled with Tomcat, which is used for rendering the Eclipse help pages. Similarly, JBOSS distributions also include an embedded version of Tomcat. Many commercial Java middleware products embed one or more JVMs in their images. This trend has also been reflected within a single software product. For example, the module org.apache.xerces is often duplicated in several different components in an effort to isolate these components more fully from one another. What a VMM adds is a kind of guarantee that the isolation between conflicting software stacks is provably complete, lacking in subtle holes.
But, whether assisted by a VMM or not, incorporation of dependencies without any compensating measures results in increasing software bloat. From a disk space perspective, tolerating bloat is no longer a relatively big problem in the art. But an isolation strategy accomplished through physical code duplication creates other problems. It can slow down the deployment process, and increase the number of components that need to be configured at deployment time, or touched during subsequent updates. It may also increase the customer's perception of an application's complexity, which in turn increases customers' reluctance to update frequently. This can result in a proliferation of software versions in the field and increasing support and services costs over time.
Also, data center environments are increasingly moving toward a scale-out model where large farms consisting of several thousand commodity servers are becoming commonplace. In such scenarios, hardware failures can occur frequently, often several times a day. The cost of commodity hardware is relatively low so operators can often deal with hardware failures by simply replacing the defective machine on a rack, and re-provisioning the new machine with the application suite. Large commercial software stacks can take hours to provision, thus increasing the cost of such failures.
Using any VMM to help with provisioning can speed this up by replacing the normal installation process with an easily-moved image. But, unless specific steps are taken to deal with the underlying code bloat, just the process of moving the bits may cause slowdown. Reversing the trend toward increasing code bloat due to duplication-based isolation techniques might prove valuable in such situations. A properly engineered solution may also take into account that a software application can usually begin executing when only a fraction of its bits are present.
A software deployment system assumes that the software it deploys in one offering is not the only software offering deployed on the target machine. Each machine owner assembles a palette of offerings that suits his or her needs. These offerings must be able to inter-operate both via system-mediated communication channels (e.g., semaphores, pipes, shared memory) and via files in a common file system.
Consider the implications for a VMM-assisted deployment. If all offerings were run in the same VM instance, the isolation advantages of using a VM would be lost, since the offerings might then conflict. But, if each offering is run in a different VM instance using the usual hardware virtualization paradigm, the inter-operation between offerings takes on characteristics of inter-machine communication rather than intra-machine communication. What seems like one machine to the user is now laced with remote file mounts and distributed protocols. Somehow, the degree of isolation must be relaxed to permit a more local style of inter-operation. The relaxation must be done while still managing conflicts and reducing variability in the areas that matter to correct execution.
Making this change involves tradeoffs. A more porous isolation between VMs enhances the user experience when integrating software on a single machine. However, other characteristics that one might expect from a general-purpose VMM (such as crash protection or the ability to freeze and migrate processes) might be sacrificed.
A spectrum of virtual machines is in use today. These range from runtime environments for high-level languages like Java (see: The Java Virtual Machine Specification, 2nd Ed., by: Lindholm, T., Yellin, F., Addison-Wesley, Reading, Mass., 2000) and Smalltalk (see: Smalltalk-80: The Language and Its Implementation, by: Goldberg, A., Robson, D., Addison-Wesley Longman Publishing Co., Inc., Boston, Mass., 1983) to hardware-level VMMs such as VMware, by: VMware, Inc., and Xen (see: Xen and the Art of Virtualization, by: Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., Neugebauer, R., Pratt, I., Warfield, A., Proceedings of the 19th ACM Symposium on Operating Systems Principles, October 2003).
The level of indirection provided by the VM layer enables the software running above it to be decoupled from the system beneath it. This decoupling enables the VM layer to control or enhance the software running above it. VMMs, such as VMware, seek to exploit the decoupling to fully isolate the software stack running above them from the host environment, thus enabling sandboxed environments for testing, archival, and security. The VMM is often used to capture both the persistent and volatile state of a sandboxed environment to enable mobility of end-user environments over a network. Further, the VMM has been exploited for simplifying the deployment and maintenance of software environments. Package-management utilities like those of Debian simplify the maintenance of software packages but do not provide isolation in the sense of enabling conflicting versions of a component to co-exist in the same (virtual) namespace.
Managed container frameworks like J2EE and .NET [http://www.microsoft.com/] provide network deployment and management features, but they are language specific, and require the use of framework APIs. Other language-specific solutions for software deployment and maintenance are Java Web Start and OSGi. Zap is an implementation of a virtualization layer between the operating system and the software. One of the objectives of Zap is migration of process groups across machines, not software deployment and serviceability. See: The Design and Implementation of Zap: A System for Migrating Computing Environments, by: Osman, S., Subhraveti, D., Su, G., Nieh, J., ACM SIGOPS Operating Systems Review, Vol. 36, Issue SI, December 2002. Others, such as AppStream [http://www.appstream.com/], Endeavors, and Softricity, use file-system based approaches to provide centrally managed software deployment and maintenance solutions for Windows desktops. Desktop applications are generally self-contained applications whose non-OS dependencies can easily be bundled within a single file system mount point, or self-contained directory.
There exists a need to overcome the problems discussed above, and, more particularly, a need to overcome the inefficiencies associated with deploying, updating and versioning software in a network system.