The present invention relates to storing and recovering computer disk images in a computer partition. More particularly, the invention provides tools and techniques for placing images in the same partition that is being imaged, and for extracting information from images stored in the imaged partition, thereby allowing single large partitions to be used more effectively.
Computer hard disks and other computer storage devices hold digital data which represents numbers, names, dates, text, pictures, sounds and other information used by businesses, individuals, government agencies, and others. To help organize the data, and for technical reasons, many computers divide the data into drives, partitions, directories, and files. The terms “file” and “directory” are familiar to most computer users, and most people agree on their meaning even though the details of written definitions vary.
However, the terms “partition” and “drive” have different meanings even when the context is limited to computers. According to some definitions, a partition is necessarily limited to one storage device, but a “file system” may include one or more partitions, on one or more disks. Many partitions reside on a single disk, but some approaches, such as volume sets, stripe sets, mirror sets, and others, store a single partition's data on more than one disk.
As used here, a “partition” is a region on one or more storage devices which is (or can be) formatted to contain one or more files or directories. A partition may be empty. A partition may also be in active use even without any directories, file allocation tables, bitmaps, or similar file system structures if it holds a stream or block of raw data. Each formatted partition is tailored to a particular type of file system, such as the Macintosh file system, SunOS file system (a variant of the UNIX file system), Linux file system (EXT2fs, a variant of the UNIX file system), Windows NT File System (“NTFS”), NetWare file system, or one of the MS-DOS/FAT file systems. (MACINTOSH is a trademark of Apple Computer, Inc.; SunOS is a trademark of Sun Microsystems, Inc.; WINDOWS NT and MS-DOS are trademarks of Microsoft Corporation; NETWARE is a trademark of Novell, Inc.; LINUX is a mark of Linus Torvalds).
Computers utilize a wide variety of storage devices as storage media for user data. Storage technologies currently provide removable optical and magnetic disks, fixed and removable hard disks, floppy disks, and solid state storage devices, and new storage technologies are continually being researched and developed. Indeed, some storage devices used by computers in the future may be cubical or some other shape with no moving parts rather than flat and circular, and storage devices which use computer chips as storage media are also being developed. Disks, storage devices and related concepts such as cylinders, sectors, platters, tracks, heads, physical sector addresses, and logical sector addresses are generally familiar in the art. For instance, they are discussed in U.S. Pat. Nos. 5,675,769 and 5,706,472 assigned to PowerQuest Corporation, and those discussions are incorporated herein by this reference.
An operating system manages access, not only to the disks, but to other computer resources as well. Resources typically managed by the operating system include one or more disks and disk drives, memory (RAM and/or ROM), microprocessors, and I/O devices such as a keyboard, mouse, screen, printer, tape drive, modem, serial port, parallel port, or network port.
Many disks mold the available space into one or more partitions by using a partition table located on the disk. A wide variety of partition types are used, and more partition types will no doubt be defined over time. A partial list of current partitions and their associated file systems is given in U.S. patent application Ser. No. 08/834,004 and incorporated here by reference. The list includes a variety of 12-bit, 16-bit, and 32-bit FAT file systems and numerous other file systems. Tools and techniques for manipulating FAT and certain other partitions are described in U.S. Pat. Nos. 5,675,769 and 5,706,472 assigned to PowerQuest Corporation, incorporated herein by this reference.
One partition table composition, denoted herein as the “IBM-compatible” partition table, is found on the disks used in many IBM® personal computers and IBM-compatible computers (IBM is a registered trademark of International Business Machines Corporation). Although IBM is not the only present source of personal computers, server computers, and computer operating systems and/or file system software, the term “IBM-compatible” is widely used in the industry to distinguish certain computer systems from other computer systems such as Macintosh computer systems produced by Apple Computer (Macintosh is a mark of Apple Computer) and UNIX computer systems. IBM-compatible partition tables may be used on a wide variety of disks, with a variety of partition and file system types, in a variety of ways.
As shown in U.S. Pat. Nos. 5,675,769 and 5,706,472, one version of an IBM-compatible partition table includes an Initial Program Loader (“IPL”) identifier, four primary partition identifiers, and a boot identifier. As also shown in those patents, each partition identifier includes a boot indicator to indicate whether the partition in question is bootable. At most one of the partitions in the set of partitions defined by the partition table is bootable at any given time.
Each partition identifier also includes a starting address, which is the physical sector address of the first sector in the partition in question, and an ending address, which is the physical sector address of the last sector in the partition. A sector count holds the total number of disk sectors in the partition. A boot sector address holds the logical sector address corresponding to the physical starting address.
Some IBM-compatible computer systems allow “logical partitions” as well as the primary partitions just described. All logical partitions are contained within one primary partition; a primary partition which contains logical partitions is also known as an “extended partition.”
Each partition identifier also includes a system indicator. The system indicator identifies the type of file system contained in the partition, which in turn defines the physical arrangement of data that is stored in the partition on the disk. Values not recognized by a particular operating system are treated as designating an unknown file system. The file system associated with a specific partition of the disk determines the format in which data is stored in the partition, namely, the physical arrangement of user data and of file system structures in the portion of the disk that is delimited by the starting address and the ending address of the partition in question. At any given time, each partition thus contains at most one type of file system.
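The partition identifier fields described above can be illustrated with a short sketch. It assumes the commonly documented 16-byte entry layout of an IBM-compatible partition table (four entries starting at byte 446 of the first sector, followed by the two-byte 0x55 0xAA signature); the class and function names are hypothetical, and the cylinder/head/sector start and end fields are read but not decoded.

```python
import struct
from dataclasses import dataclass
from typing import List

SECTOR_SIZE = 512
TABLE_OFFSET = 446   # assumed offset of the first partition identifier
ENTRY_SIZE = 16      # assumed size of one partition identifier


@dataclass
class PartitionEntry:
    bootable: bool          # boot indicator
    system_indicator: int   # identifies the file system type
    start_lba: int          # logical address of the first sector
    sector_count: int       # total number of sectors in the partition


def parse_partition_table(sector0: bytes) -> List[PartitionEntry]:
    """Return the non-empty primary partition entries found in sector 0."""
    if len(sector0) < SECTOR_SIZE or sector0[510:512] != b"\x55\xaa":
        raise ValueError("sector does not carry a partition table signature")
    entries = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector0[start:start + ENTRY_SIZE]
        # boot flag, CHS start, system indicator, CHS end, start LBA, count
        boot, _chs_start, system, _chs_end, lba, count = struct.unpack(
            "<B3sB3sII", raw)
        if system == 0:
            continue  # a system indicator of 0 marks an unused slot
        entries.append(PartitionEntry(boot == 0x80, system, lba, count))
    return entries


def bootable_count(entries: List[PartitionEntry]) -> int:
    """Count bootable entries; a well-formed table has at most one."""
    return sum(1 for e in entries if e.bootable)
```

A caller would typically read the first 512 bytes of a disk, pass them to `parse_partition_table`, and treat any system indicator it does not recognize as an unknown file system.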
Many computers are sold with operating systems, application programs, and other data already loaded on the disk. Manufacturers and vendors of computers often would like to provide users with a backup or image of the information they originally loaded on a hard drive. Two basic approaches are used in conventional systems and methods to back up computer data. One approach is generally file-oriented, while the other approach deals with files but operates primarily on clusters, sectors, runs, or similar logical allocation units which are smaller than files.
A file-oriented backup approach is illustrated in FIG. 1. A partition 100 includes system data 102 and user data 104. The system data 102 includes file system data such as sector or cluster allocation maps or tables and directories. The system data 102 also includes operating system data such as partition tables and boot code. The user data 104 includes data created by users, such as word processor or spreadsheet files, as well as application programs, dynamic libraries, and other data which is loaded by the vendor or system integrator and organized in the partition by the file system structures. As shown, this backup approach copies the user data 104 to a backup medium 106, such as a ZIP disk (mark of Iomega), a tape drive, a writable CD, a WORM drive, or a collection of floppy disks.
With such a file-by-file backup, each file is backed up separately, and can be recovered separately. This can be advantageous. However, file-oriented approaches also have some disadvantages. File-by-file backup programs access the user data 104 through standard operating system and/or file system routines, and they require that the operating system and file system software be reinstalled prior to system recovery. They may miss important files such as registry or system configuration files, and they do not back up data 104 from deleted files even if the sector(s) holding the data have not been overwritten. In addition, a single file may be stored in a series of clusters at locations scattered across the disk. To restore such a file, the disk head must be randomly positioned multiple times across the platter, which increases restoration time and increases the chance of a disk head crash.
FIG. 2 shows an imaging approach which also restores files but deals primarily in clusters or another file allocation unit which is typically smaller than a file. Unlike the file-oriented backup shown in FIG. 1, the imaging backup approach shown in FIG. 2 copies the entire disk state. An image may be created on the backup medium 106 by reading and writing each sector, in order, in one or more partitions 100 of a disk. Usually unallocated sectors are skipped.
This imaging approach can back up all data 102, 104, including data in deleted files when that data has not been overwritten, file system structures, operating system files, device drivers, information about network cards and other installed hardware, application programs, user-created files, hidden files, and all other data 102, 104 stored in the selected partition(s) 100. Some imaging approaches also copy partition table information to the backup medium 106. When a full disk image is restored, every byte of the original disk is restored, including all system and user data: disk partitions, operating system information, user files, and boot sector data. A sector-by-sector image preserves optimizations, producing an exact image of the disk, with the exception that some images do not contain data from unallocated sectors.
The imaging approach facilitates sequential head moves across the disk platters in so-called “elevator seeks”, thereby decreasing both the time needed to back up or restore entire partitions and/or disks and the chance of a head crash. Imaging of the type shown in FIG. 2 can be performed using the Drive Image product which is commercially available from PowerQuest Corporation of Orem, Utah (DRIVE IMAGE is a registered trademark of PowerQuest).
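The sector-oriented imaging described above might be sketched as follows. This is only an illustration under stated assumptions: the partition is modeled as an in-memory byte string, the allocation map is a plain list of booleans rather than a real file system structure, and the function names are hypothetical.

```python
from typing import List, Tuple

SECTOR_SIZE = 512


def create_image(partition: bytes,
                 allocated: List[bool]) -> List[Tuple[int, bytes]]:
    """Read sectors in ascending order, skipping unallocated ones."""
    image = []
    for number, in_use in enumerate(allocated):
        if in_use:
            start = number * SECTOR_SIZE
            image.append((number, partition[start:start + SECTOR_SIZE]))
    return image


def restore_image(image: List[Tuple[int, bytes]],
                  total_sectors: int) -> bytes:
    """Write each saved sector back to its original position."""
    target = bytearray(total_sectors * SECTOR_SIZE)
    for number, data in image:
        start = number * SECTOR_SIZE
        target[start:start + SECTOR_SIZE] = data
    return bytes(target)
```

Because sectors are visited in ascending order in both directions, the access pattern is sequential. Sectors that were skipped come back as zeros in this sketch, which is the sense in which such an image is exact only for the sectors it contains.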
With either the file-oriented approach shown in FIG. 1 or the sector imaging approach shown in FIG. 2, the backup medium 106 may be a disk containing a target partition other than the partition 100. The partition 100 and the target partition may be on the same disk, or they may be on two disks on the same computer. The source and target computers may also be connected by a network link, as when the target partition is directly attached to a network server to receive backup images of partitions 100 on clients of the server.
One backup method according to FIG. 2 involves two partitions on a drive. The first partition is the source partition 100, which contains all the user programs and data 104, while the target partition is a separate partition 106 on the same drive; the partition 106 often contains little or nothing more than an image of the first partition 100. For example, a 10 GB hard drive might contain two partitions, namely, an 8 GB partition 100 with the system files and pre-installed software and a 2 GB partition 106 that contains a disk image of the partition 100.
However, manufacturers are sometimes reluctant to divide disk drives into more than one partition, because some computer purchasers equate the size of their main partition (for instance, the so-called “C: drive” on many IBM-compatible computers) with the size of the entire disk. If the primary partition on a new disk drive is substantially smaller than the advertised disk size, purchasers may conclude that the disk drive itself is smaller than they requested. In the example above, a user might erroneously conclude that the computer came with an 8 GB drive rather than the expected 10 GB drive, because the bootable partition 100 contains only 8 GB. This mistaken but understandable conclusion leads to consumer dissatisfaction and increases the vendor's support costs.
Another problem facing the computer user is how to acquire a fully functional backup of both system and user data. Many critical system files, such as the registry files which contain critical configuration information, are open when a computer is running in the Microsoft Windows 95, Windows 98, and Windows NT operating systems. Even if an approach like that shown in FIG. 2 is used, these open files cannot be successfully saved by standard backup software. If a computer's hard disk crashes and all files must be rebuilt, some user files 104 can be restored. But the operating system, device drivers, and perhaps even the backup software itself, all must be reinstalled from some source other than the image 106. Data files that were open when the backup was made also would not be restored from the image 106.
Accordingly, it would be an advancement in the art to provide improved data backup tools and techniques, including tools and techniques for avoiding consumer confusion about disk size while still providing backup images.
Such improved tools and techniques are disclosed and claimed herein.
The present invention provides tools and techniques for storing and retrieving data images of a partition within the imaged partition. As used here, “in-partition images” are images of a partition stored within the imaged partition. An image created in the factory before delivery to the user (a factory image) as well as one or more user-updateable images can be stored in the same partition. The in-partition images themselves may be compressed or not compressed, packed or not packed, and/or encrypted or unencrypted. The in-partition images may be stored as one or more files within the file system, or as an image container. If the image file would be larger than the maximum file size allowed for a particular operating system (often 2 GB), the image may be divided into multiple files that together make up all or part of the container. The image may also be divided into multiple files to facilitate later transfer to multiple smaller storage media, such as writable removable media. To speed restoration time and to assist recovery, the image may be stored contiguously at or near the end of the partition, but is not restricted to either being contiguous or at the end of the partition. For improved efficiency, the image file or image container can be stored in a separate subdirectory of the imaged partition.
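As an illustration of dividing an image into multiple files, the following sketch splits an image into segments no larger than a given maximum file size and computes how many segments a given image needs. The function names are hypothetical, and the image is modeled as an in-memory byte string rather than data on disk.

```python
from typing import List


def segment_count(image_size: int, max_file_size: int) -> int:
    """Number of files needed to hold image_size bytes (ceiling division)."""
    if max_file_size <= 0:
        raise ValueError("max_file_size must be positive")
    return -(-image_size // max_file_size)


def split_image(image: bytes, max_file_size: int) -> List[bytes]:
    """Cut the image into consecutive segments of at most max_file_size."""
    if max_file_size <= 0:
        raise ValueError("max_file_size must be positive")
    return [image[i:i + max_file_size]
            for i in range(0, len(image), max_file_size)]


def join_segments(segments: List[bytes]) -> bytes:
    """Reassemble the original image from its ordered segments."""
    return b"".join(segments)
```

For example, under a 2 GB file size limit a 5 GB image would occupy three files, the last of which is only partly filled.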
In one embodiment, creation of an image within a partition creates an exact copy of the entire partition, including deleted but not overwritten files. Each sector of the partition, in order, is read into the image. The image must be created when the computer has been put into a state that allows exclusive disk access. This prevents inconsistencies in the data and helps ensure that system files such as the Microsoft Windows registry are closed and thus can be imaged. When the image is made of the partition, the image itself is not imaged. However, user images may be incrementally updated.
If more than one image is stored on a single partition, a user can choose which image should be used to restore the partition. If the disk or its partition is damaged, it may still be possible to recover the imaged data. Copies of a portion of the partition data and/or the system data sufficient to recover the imaged partition can be stored at a specified location within the imaged partition, within the image container, in a separate diagnostic and recovery partition, and/or on a removable recovery medium such as a ZIP drive, a floppy disk, and so on. Which system files or other data should be saved depends both on the operating system involved and the nature of the image. Using the saved system data, the image can then be located on the partition and restored. The image files and/or image container may also contain unique signature bytes to allow them to be detected by scanning the storage medium. In this way, if the disk or partition is damaged, the image may be discovered and used to restore the partition.
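Detection of an image by its signature bytes could look roughly like the sketch below. The signature value is invented for illustration, and the sketch assumes that an image file or container always begins on a sector boundary, which the text above does not state.

```python
from typing import List

SECTOR_SIZE = 512
SIGNATURE = b"IMGSIG01"  # hypothetical signature bytes, not from the source


def find_image_sectors(medium: bytes,
                       signature: bytes = SIGNATURE) -> List[int]:
    """Return the numbers of all sectors that begin with the signature."""
    hits = []
    for offset in range(0, len(medium), SECTOR_SIZE):
        if medium[offset:offset + len(signature)] == signature:
            hits.append(offset // SECTOR_SIZE)
    return hits
```

Since the scan reads raw sectors and never consults directories or allocation tables, it can still locate an image when the file system structures of the partition are damaged.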
In one embodiment the file system data is verified when it is used, such as before an image is created or updated, after an image is created or updated, and when system data is stored in a separate location such as in a recovery disk or in a diagnostic and recovery partition. The consistency and integrity of the image itself is also verified when used, such as after it is created or updated, and before and after it has been used to restore user data. This can be performed by way of check codes such as checksums or CRC codes embedded in the image files and/or the image container.
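A minimal sketch of verification with an embedded check code follows. It assumes a CRC-32 appended to the end of the image data as four little-endian bytes; the placement, the byte order, and the function names are illustrative choices, not details taken from the text.

```python
import struct
import zlib


def seal(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload as a 4-byte little-endian trailer."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def verify(sealed: bytes) -> bool:
    """Recompute the CRC-32 and compare it with the stored trailer."""
    if len(sealed) < 4:
        return False
    payload = sealed[:-4]
    (stored,) = struct.unpack("<I", sealed[-4:])
    return zlib.crc32(payload) == stored
```

A tool could call `verify` after creating or updating an image and again before and after restoring from it, refusing to proceed when the check fails.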
The image can be restored to a number of locations, including target locations inside the same partition that contains the image, another partition on the same machine, another partition on a physically different machine (such as over a network connection), or onto a removable medium. One or more files from the image can be individually restored without restoring the entire image. Other features and advantages of the present invention will become more fully apparent through the following description.