It is often useful to copy computer hard disk partitions. This may be done to create archive copies, for instance, or to configure additional storage devices using one device as a model. Backup programs are available to copy every file in a partition, every file in a group of partitions, or every sector (used or not) on a source disk to a target device such as a secondary disk or a tape drive. The target device may be located either on the same computer or on a connected computer such as a file server. By restoring data from the backup to a different computer, one can use the backup program as an "imaging" or "cloning" program to copy entire disk images from one computer to another. However, backup programs are not usually designed to copy a disk image to several machines at the same time.
As discussed below, the ability to efficiently create many copies of an image is important to system integrators, training centers, and other businesses. To help address this need, various imaging programs are now available that copy a source disk drive to multiple target drives, subject to constraints discussed below and elsewhere. Generally, imaging programs allow one or more partitions to be copied. Some imaging programs create a copy of every file found in the specified source partition(s), while other programs allow the user to select which files to copy.
Some imaging programs store the copied data in an "image file," while others perform imaging directly to create new disk images without using an intermediate image file. The data may be compressed prior to being placed in the image file, and then decompressed while creating a new disk image. Compression operates on the data being stored to reduce its redundancy and hence allow it to be stored in a smaller space.
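The compress-then-decompress round trip described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's standard zlib module as a stand-in for whatever compression scheme a given imaging program employs, and the function names pack_image and unpack_image, as well as the sample data, are hypothetical.

```python
import zlib


def pack_image(raw: bytes, level: int = 6) -> bytes:
    """Compress raw partition data before placing it in an image file.

    zlib is an assumption made for illustration; the text does not
    name a particular compression algorithm.
    """
    return zlib.compress(raw, level)


def unpack_image(stored: bytes) -> bytes:
    """Decompress image-file data while creating a new disk image."""
    return zlib.decompress(stored)


# Hypothetical, highly redundant partition data (long runs of zero bytes).
raw = b"\x00" * 4096 + b"boot sector" * 100
stored = pack_image(raw)
restored = unpack_image(stored)
```

Because the sample data is highly redundant, the stored form is smaller than the original, and decompression recovers the original bytes exactly.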
Regardless of whether the data is compressed, an image file packs, modifies, and/or supplements the data in some fashion, putting it in a form that is more convenient or efficient for storage and transport. "Working" partitions contain files organized according to a standard file system format, such as the format used by NTFS, FAT, HPFS, a Linux file system, or another familiar file system. An image file may be stored in a working partition as a file.
However, when viewed as a collection of files, image file contents do not necessarily follow standard file system formats. An image file may contain partial or complete contents from one or more other files from one or more partitions; for convenience, these are referred to here as "imaged files." The imaged files are not necessarily stored in the image file in a standard file system format. Standard file system software (as opposed to disk imaging software) can read the contents of an image file but cannot properly distinguish between the individual imaged files. A working partition generally follows one or more rules such as allocation in sector units, alignment of sectors on cluster boundaries, specific directory or file allocation table formats, and support for fragmented files. Image file contents may violate one or more of these rules with regard to the imaged files.
One type of image file, which is used by imaging programs that proceed on a file-by-file basis, stores the contents of each imaged file contiguously (thereby providing no support for fragmentation). One variation also ignores cluster alignment by packing together the sectors of imaged files. Another variation goes even further by sometimes ignoring sector boundaries; this allows the imaging software to pack the bytes of more than one imaged file in a given sector when the end of an imaged file lies within a sector. To create a working partition on a target disk, sector allocation and cluster alignment must be restored when the imaged file contents are copied to the target disk.
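A minimal sketch of the file-by-file variation that ignores sector boundaries may help. It packs imaged files byte-contiguously and then restores cluster alignment on the target. The sector and cluster sizes, the index layout, and the names pack_files and restore_files are all assumptions chosen for illustration; they do not describe any particular imaging product.

```python
SECTOR_SIZE = 512            # assumed sector size in bytes
SECTORS_PER_CLUSTER = 4      # assumed cluster geometry
CLUSTER_SIZE = SECTOR_SIZE * SECTORS_PER_CLUSTER


def pack_files(files: dict[str, bytes]) -> tuple[bytes, list[tuple[str, int, int]]]:
    """Pack imaged files back to back, ignoring sector and cluster boundaries.

    Returns the packed bytes and a hypothetical index of
    (name, offset, length) entries, one per imaged file.
    """
    blob = bytearray()
    index = []
    for name, data in files.items():
        index.append((name, len(blob), len(data)))
        blob += data
    return bytes(blob), index


def restore_files(blob: bytes, index: list[tuple[str, int, int]]):
    """Copy imaged files to a target, starting each on a cluster boundary.

    Returns the target bytes and a map of name -> (first_cluster, length).
    """
    disk = bytearray()
    layout = {}
    for name, offset, length in index:
        data = blob[offset:offset + length]
        first_cluster = len(disk) // CLUSTER_SIZE
        padding = (-length) % CLUSTER_SIZE   # zero-fill up to the next cluster
        disk += data + b"\x00" * padding
        layout[name] = (first_cluster, length)
    return bytes(disk), layout
```

In this sketch, two files of 700 and 100 bytes occupy 800 bytes in the packed image but two full clusters (4096 bytes) on the restored target, which illustrates why alignment must be re-established when the working partition is created.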
Another type of image file, which is used by imaging programs that proceed on a sector-by-sector or cluster-by-cluster basis, packs together the clusters or sectors of data. As a result, the cluster numbers or other pointers in file system structures in the image file do not always point to the current (packed) location of the data clusters in question. The clusters must be unpacked and restored to their expected relative locations when data from the image file is copied to the target disk to create a working partition there.
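The cluster-by-cluster case can be sketched in the same spirit. Here only the used clusters are kept, together with a list recording where each one originally lived, so that unpacking can return every cluster to its expected relative location. The cluster map format and the names pack_clusters and unpack_clusters are hypothetical and serve only to illustrate the idea.

```python
def pack_clusters(disk: bytes, used: list[int], cluster_size: int):
    """Keep only the used clusters, packed together.

    Returns the packed bytes and a cluster map listing the original
    cluster number of each packed cluster, in packed order.
    """
    packed = b"".join(
        disk[c * cluster_size:(c + 1) * cluster_size] for c in used
    )
    return packed, list(used)


def unpack_clusters(packed: bytes, cluster_map: list[int],
                    total_clusters: int, cluster_size: int) -> bytes:
    """Restore each packed cluster to its original cluster number.

    Clusters absent from the map are left zero-filled on the target.
    """
    disk = bytearray(total_clusters * cluster_size)
    for i, c in enumerate(cluster_map):
        chunk = packed[i * cluster_size:(i + 1) * cluster_size]
        disk[c * cluster_size:(c + 1) * cluster_size] = chunk
    return bytes(disk)
```

After packing, a pointer to cluster 4 no longer identifies the fifth cluster-sized block of the image file, which is the mismatch noted above; the map is what allows the original positions to be recovered during restoration.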
Imaging programs and image files have many uses for system integrators, training centers, testers, and others. Training centers use imaging and image files to recreate disk environments suited to particular lessons. For instance, one lesson may require a Microsoft Windows NT environment, while another requires a Linux environment and a third requires a Novell NetWare environment. (WINDOWS NT is a mark of Microsoft Corporation; NETWARE is a mark of Novell, Inc.) Testers also use imaging and image files. For instance, a maker of peripheral equipment may use different restored images to test its devices and device drivers in different operating environments.
Because the experiences of system integrators illustrate many aspects of the current state of imaging multiple computers, we now consider in greater detail the working environment of an integrator. A system integrator is in the business of reselling computers to companies. When a large corporation places an order, it usually standardizes on a configuration and purchases a large quantity, both to get a better price and to meet the needs of many users within the company. The integrator's job is to order the computers, peripheral hardware, and software, install or assemble all these pieces, and then ship the configured computers to the customer. This may be done in different ways, but an important goal is to make the imaging process more efficient and less costly in terms of technician time and other resources.
According to an approach we shall call Approach One, a technician installs each software package on each of the computers using the same process as an end user who installs software infrequently. The technician must stay at the target computer to answer the install software's questions. The target computer reads the data from a floppy or a CD-ROM, both of which are relatively slow devices. There is little or no opportunity to make the installation proceed on more than one computer at a time.
Under Approach Two, the technician configures the first computer and then saves an image of the configured disk on a CD-ROM. Rather than re-install each desired software package on each subsequent computer, the technician need only copy the disk image from the CD-ROM to the disk drive of the next machine, and everything on the target machine should then work as desired. This is a great improvement over having the technician sit and re-install all the software on each machine. One drawback, however, is that running several of these installations at the same time requires several CD-ROMs, tape cartridges, or other high-capacity removable media (one for each target machine).
Under Approach Three, the integrator places the disk image on a network server and has multiple computers download the image from the server at the same time. The technician can move from workstation to workstation, opening packages and setting up, starting the process of connecting to the server and copying the disk image from the network, and then moving on to the next workstation. The technician can also move from machine to machine, overlapping the work of disconnecting machines from the server and shutting them down before boxing them for shipment. However, each machine has a separate conversation with the server. Thus, if too many computers download at once, the network can become a serious bottleneck because the multiple conversations consume the available bandwidth.
Approach Four overcomes the network bottleneck created when many workstations individually download the disk image. This fourth approach has one workstation request the image and allows multiple workstations to "listen" to that conversation. Each listening workstation makes its own copy as the image goes over the network from the server to the sole requesting workstation. This significantly reduces the network traffic.
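The difference in network load between separate conversations and a single shared conversation can be illustrated with rough arithmetic. The sketch below assumes a single shared link of fixed usable throughput and ignores protocol overhead, retransmissions, and disk speed; the function names and all figures are hypothetical.

```python
def unicast_transfer_seconds(image_bytes: int, clients: int,
                             link_bytes_per_sec: int) -> float:
    """Time to serve every client when each has its own conversation.

    The shared link carries the full image once per client.
    """
    return image_bytes * clients / link_bytes_per_sec


def shared_transfer_seconds(image_bytes: int,
                            link_bytes_per_sec: int) -> float:
    """Time when one conversation is overheard by all listening clients.

    The shared link carries the image once, regardless of client count.
    """
    return image_bytes / link_bytes_per_sec


# Hypothetical figures: 2 GB image, 20 workstations, 10 MB/s usable link.
IMAGE_BYTES = 2_000_000_000
CLIENTS = 20
LINK_RATE = 10_000_000

separate = unicast_transfer_seconds(IMAGE_BYTES, CLIENTS, LINK_RATE)
shared = shared_transfer_seconds(IMAGE_BYTES, LINK_RATE)
```

With these assumed figures, the separate conversations need 4000 seconds of link time in total, while the single shared conversation needs 200 seconds, a reduction proportional to the number of workstations.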
However, for several workstations to listen to the one conversation using a broadcast or multicast technology, all of the computers must be configured beforehand, and each must be waiting and watching for the beginning of the download. Also, they all finish at the same time, so the technician cannot overlap the process of shutting down and boxing the workstations.
Accordingly, it would be an improvement to provide new systems, devices, and methods for transferring a disk image to multiple machines in a way that improves the opportunity for overlap without requiring additional cartridges or network bandwidth.
In particular, it would be useful to improve overlap between machines both in the process of downloading a copy of the image from a network and in the processes of setting up and shutting down machines before and after a download.
Such improvements are disclosed and claimed below.