Software images include a large amount of data. For instance, operating systems for personal computers and servers, virtual machines, office suite software, and health care management software take up a great deal of storage space on a computer-readable medium. These are just a few examples. Also, multiple images have traditionally been copied onto a single computer-readable medium. Often, these multiple images differ in only certain respects. The result is that the majority of the data in those multiple images is common, which results in redundancies across images on the same medium.
In response to this problem, methods have been developed to reduce data duplication in imaging by consolidating multiple individual software programs (images) into a single operational, combined image file from which each of the individual programs can be recreated. Such methods are disclosed in U.S. Pat. No. 7,017,144 to Cohen et al. (“Cohen”) and the March 2007 document entitled “Windows Imaging File Format” (“Windows Format”), which is cited on an Information Disclosure Statement and attached. Both of these documents are hereby incorporated by reference in their entirety. For example, the Cohen and Windows Format methods may allow very large operating systems to fit on DVDs. However, there are still problems associated with the conventional imaging format.
The contents of various software packages are frequently updated or supplemented in various manners. For example, operating systems are often updated with various fixes to correct security problems or other software bugs. Another example is that operating systems or virtual machines may be updated with new hardware support or device drivers. With existing image technology, there is no way of imaging these updates other than to create an entirely new image that includes the updates. As a result, a software fix may pose storage and distribution problems if a previously created image becomes obsolete after a software update. For example, even if multiple versions of the same software have been stored in a single file such that redundant space is eliminated, the single combined file only accounts for the updates that have already been made to the software. The file will become obsolete when the software is updated again, and a new file will need to be created. Thus, the benefits of single-instancing, or reducing data duplication in the creation of a software image, are reduced when a new set of common files is recreated following the receipt of a software update.
There are certain specific limitations of the traditional imaging format methods with respect to software updates.
Monolithic Image File:
Since a standard Windows Imaging Format file (“WIM”) is represented as a monolithic file, all images must be appended to this file in order to gain the benefits of single-instancing. Even when storing only a few images based on the same operating system, it is not uncommon for the file to quickly exceed several gigabytes. Even with today's server and network technologies, replicating large, monolithic files of several gigabytes is not recommended (and sometimes not reliable) in an enterprise environment. Also, caching of these files may not be possible due to their size, so server performance may be impacted when deploying the image to multiple clients.
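By way of illustration only, single-instancing across images appended to one combined file can be sketched as follows. This is a simplified model and not the WIM on-disk format: the use of a SHA-1 content digest as the lookup key, the `add_image` helper, the in-memory `store` dictionary, and the sample file names are all assumptions made for the example.

```python
import hashlib

def add_image(store, image_files):
    """Append one image to a shared store, keeping one copy per unique content.

    store       -- dict mapping content digest -> bytes (hypothetical resource table)
    image_files -- dict mapping file path -> file contents (bytes)
    Returns the image's manifest, mapping each path to the digest of its contents.
    """
    manifest = {}
    for path, data in image_files.items():
        digest = hashlib.sha1(data).hexdigest()
        # Store the contents only if no earlier image already supplied them.
        store.setdefault(digest, data)
        manifest[path] = digest
    return manifest

# Two hypothetical images that differ in a single small file.
base = {"kernel.sys": b"K" * 1000, "shell.dll": b"S" * 500, "edition.txt": b"home"}
pro = {"kernel.sys": b"K" * 1000, "shell.dll": b"S" * 500, "edition.txt": b"pro"}

store = {}
manifest_base = add_image(store, base)
manifest_pro = add_image(store, pro)

# Bytes held in the combined store versus bytes in the two separate images.
stored_bytes = sum(len(data) for data in store.values())
raw_bytes = sum(len(data) for image in (base, pro) for data in image.values())
print(stored_bytes, raw_bytes)
```

In this sketch the savings are available only because both images are added to the same store, which mirrors the requirement that all images be appended to the one monolithic file.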
Existing Spanned Imaging Format:
WIM files can be split into multiple parts of a given size in order to fit on smaller forms of media such as CD/DVD discs. However, based on the existing split WIM specification, all parts of the set must be present before an image can be applied. This means split WIM parts that reside on a multiple-disc set must be copied to a temporary location before being applied to a computer. For example, on the 5-CD set of Windows Vista Ultimate, each split WIM part is copied to a temporary folder at the root of the system volume before the image is actually applied. This results in longer image deployment times, caused by the additional disk I/O of copying the WIM parts rather than directly applying their contents. This also results in additional free space being required on the target computer, as well as file fragmentation when the WIM parts are removed at setup completion.
Existing Resource-only and Metadata-only WIM:
In the current implementation of resource-only and metadata-only WIM files, the resource-only WIM (RWM) file stores all file resource data, whether common to all images (single-instanced) or unique to one. The resource-only file remains read/write because it must be read during an image apply operation and written during a capture. This means that if a resource-only WIM were placed on read-only media, such as a CD/DVD disc, no additional images could be captured leveraging that resource-only WIM. Metadata-only WIM files carry only the instructions on how to recreate a single volume image using the resources found in the resource-only WIM. They neither store nor describe which files are unique to a particular image, so replicating or deploying only the differences between one set of images and another is not possible with these formats.
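The read/write limitation described above can be illustrated with a simplified sketch. It is not the RWM or metadata-only WIM format: the `capture` and `apply_image` helpers, the digest-keyed dictionary standing in for the resource-only file, and the use of `types.MappingProxyType` to model read-only media are assumptions made for the example. The sketch models only the case in which the newly captured image contains at least one resource not already stored.

```python
import hashlib
from types import MappingProxyType

def capture(resources, files):
    """Capture an image: write any new resources, return its metadata manifest."""
    manifest = {}
    for path, data in files.items():
        digest = hashlib.sha1(data).hexdigest()
        if digest not in resources:
            # Requires write access to the resource store.
            resources[digest] = data
        manifest[path] = digest
    return manifest

def apply_image(resources, manifest):
    """Apply an image: only reads from the resource store."""
    return {path: resources[digest] for path, digest in manifest.items()}

# Capture a first image while the resource store is still writable.
writable_store = {}
metadata_a = capture(writable_store, {"kernel.sys": b"v1", "app.exe": b"app"})

# Model placing the resource store on read-only media.
read_only_store = MappingProxyType(writable_store)

# Applying the existing image still works because it only reads.
restored = apply_image(read_only_store, metadata_a)

# Capturing an updated image fails because a new resource must be written.
try:
    capture(read_only_store, {"kernel.sys": b"v2", "app.exe": b"app"})
    capture_failed = False
except TypeError:
    capture_failed = True
print(capture_failed)
```

Note that the manifest returned by `capture` records only which resource backs each path. Nothing in it marks a resource as unique to one image, which corresponds to the limitation noted above for metadata-only files.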
For these reasons, a method and system for modularizing image formats is desired to address one or more of these and other disadvantages. Additionally, a method and system are generally needed to address the lack of flexibility with respect to updates and the general lack of portability of software images. The following specification discloses methods and systems for storing, distributing, and updating software images.