1. Field of the Invention
This invention relates generally to disk drive backup systems and, more particularly, to image backup methods for backing up data from disk partitions of storage devices.
2. Description of the Related Art
Modern computer systems typically include one or more mass storage devices such as hard disk drives, optical disc drives, floppy disk drives, removable disk drives, and the like to store a large amount of information. Often, however, the storage devices fail to operate properly because of various electromechanical defects. In the event of such failures, valuable data stored on the storage devices may be lost permanently or may require costly and time-consuming repairs to recover the original data.
To guard against such failures, modern computer systems typically employ a backup system to back up data stored on a storage device. FIG. 1 illustrates an exemplary computer system 100 including a host computer 102 and a backup device 104. The backup device 104 is coupled to the host computer 102 by means of a bus 106 for backing up the contents of one or more storage devices (e.g., hard disk drives, optical drives, etc.) in the host computer 102. The backup device 104 then provides the backed up data to the host computer 102 to restore the original data when necessary. For example, data may be restored from the backup device when a backed up hard drive fails or when data on a backed up hard drive become corrupted.
Storage devices such as fixed disk drives (e.g., hard disk drives, removable disk drives, etc.) generally include one or more disks for storing data. For example, conventional hard disk drives include one or more disks that are partitioned into one or more partitions (e.g., volumes, logical drives, etc.), as is well known in the art. Each of the disk partitions is a logically self-contained volume and is typically represented by a drive letter such as "C," "D," "E," or the like. In addition, each partition contains files and directory bit maps such as a file allocation table or the like. Typically, a partition is organized as a linear sequence of clusters, each of which comprises a number (i.e., set) of sectors.
FIG. 2A illustrates a schematic diagram of an exemplary disk 200 for storing data. The disk 200 is configured to include a plurality of tracks 202. Each of the tracks 202 is divided into sectors 204 for storing data. The disk 200 may be partitioned into one or more partitions with each partition having a file allocation data structure such as a file allocation table.
As is well known in the art, the partitions of a disk are generally organized in sectors. FIG. 2B shows a schematic diagram of an exemplary track 202 divided into sectors 204. A sector may be any size, but is typically 512 bytes in size. In this arrangement, files are configured to be stored in the disk 200 in units of clusters 206. Each of the clusters 206 includes a pair of sectors 204. As is well known in the art, however, a cluster may include any number of contiguous sectors, typically in powers of two (e.g., 1, 2, 4, 8, 16, etc.).
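The relationship between clusters and sectors described above can be sketched as simple arithmetic. The following Python fragment is illustrative only; the function names are hypothetical, and the default of two sectors per cluster merely mirrors the example of FIG. 2B.

```python
SECTOR_SIZE = 512  # bytes; the typical sector size mentioned above


def cluster_to_sectors(cluster_index, sectors_per_cluster=2):
    """Return the contiguous sector numbers that make up one cluster.

    Assumes clusters are numbered from zero and laid out back to back,
    which is a simplification for illustration.
    """
    first = cluster_index * sectors_per_cluster
    return list(range(first, first + sectors_per_cluster))


def cluster_size_bytes(sectors_per_cluster=2, sector_size=SECTOR_SIZE):
    """Return the size of one cluster in bytes."""
    return sectors_per_cluster * sector_size
```

Under these assumptions, cluster 3 of a two-sector-per-cluster partition covers sectors 6 and 7, and an eight-sector cluster of 512-byte sectors holds 4096 bytes.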
In general, data in a storage device are backed up using one of two techniques: file-based backup and image-based backup. In the file-based backup method, the contents of individual files are copied from a source disk onto a backup media. The files are usually copied without regard for how they are arranged on the source disk. For example, a partition may have ten sectors containing two files. One file is stored in sectors two through four and sectors eight and nine while the other file is stored in sectors five through seven. The remaining sectors zero and one are unused. In this case, the file-based backup would store information in the backup in the following sequence: sectors two through four, eight and nine, five through seven, such that the unused sectors zero and one are not copied.
The file-based backup method, however, may require a substantial number of non-sequential read and write operations to back up an entire partition since a partition often contains hundreds or even thousands of files. For example, to back up the former file in sectors two through four and sectors eight and nine, a backup system reads sectors two through four first, and then performs a seek to sector eight for reading sectors eight and nine. Such non-sequential read and write operations entail numerous seek operations to proper sectors of clusters.
In contrast, the image-based backup method generally reduces the time required to back up an entire partition. Image-based backup systems are capable of backing up one or more partitions in a disk. In this method, all data on the partition, including valid data, free space, and invalid data, are copied and stored on a backup medium. For example, to perform an image backup of a partition "C," the image-based backup method operates to read and store the data on the partition sequentially from the beginning sector to the end. By thus reading and storing the sectors linearly, seek operations are minimized. Hence, the backup time is typically reduced in comparison with the file-based backup technique.
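The difference in seek behavior between the two techniques can be illustrated with the ten-sector example given above. The following Python sketch is a simplified model, not a measurement: it counts a seek whenever the next sector read does not immediately follow the previous one, and the name `count_seeks` is hypothetical.

```python
def count_seeks(read_order):
    """Count non-sequential jumps in a sequence of sector numbers.

    A jump is counted whenever a sector is not the immediate successor
    of the sector read before it. The initial positioning to the first
    sector is not counted.
    """
    return sum(
        1 for prev, cur in zip(read_order, read_order[1:]) if cur != prev + 1
    )


# File-based order from the example: the first file occupies sectors 2-4
# and 8-9, and the second file occupies sectors 5-7.
file_based_order = [2, 3, 4, 8, 9] + [5, 6, 7]

# Image-based order: every sector of the partition, from first to last.
image_based_order = list(range(10))

file_based_seeks = count_seeks(file_based_order)    # jumps at 4->8 and 9->5
image_based_seeks = count_seeks(image_based_order)  # no jumps
```

In this model the file-based order incurs two jumps while the image-based order incurs none, at the cost of also copying the unused sectors zero and one.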
Some examples of conventional backup media are magnetic tapes, magnetic disks, optical disks, etc. In performing image backups, conventional image-based backup methods typically use a backup medium that has a data capacity at least as large as that of the source disk to be backed up. For example, a backup medium of at least one gigabyte (GB) is commonly used to back up a partition of a one GB source disk.
As disks increase in size, however, a backup medium may not be able to store an entire image copy of a partition of a disk. This problem is exacerbated for a backup medium having a standardized data storing capacity. For example, optical discs such as CD-ROM recordable and rewritable media typically have a maximum capacity of about 650 megabytes (MB) in accordance with industry standards. When the capacity of a partition to be backed up exceeds the capacity of an individual backup medium, the partition is typically backed up over multiple backup media called volumes (e.g., discs). In this case, the image backup is spanned over multiple files or volumes until the entire partition has been backed up. The full group of volumes that make up the full backup data set is often referred to as a backup set.
Unfortunately, however, conventional spanning backup methods have several drawbacks. For example, the conventional spanning method takes a substantial amount of time to back up and restore data when used with relatively slow optical disc drives such as CD-ROM rewritable or recordable drives, which are typically characterized by significantly larger seek times than hard disk drives. Since the backup and restore operations are often performed in a non-sequential manner, the larger seek times of the optical disc drives increase the time needed to perform backup and restore operations.
In addition, some conventional backup media such as CD-ROM recordable discs are configured to be written only once. For example, once data have been recorded on a write-once medium, no data can be written over the recorded data. That is, data may only be added and not edited. When a part of the data that have been written needs to be changed, the entire file needs to be rewritten. This rewriting of the file data directly translates into substantial cost in disc space and time, thereby degrading backup performance.
In view of the foregoing, what is needed is an image backup method and system for backing up data of one or more partitions that supports spanning over multiple volumes while optimizing for sequential writing and reading to and from the backup media to save storage space and improve backup performance.
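The number of volumes in a backup set follows from a ceiling division of the partition size by the capacity of one volume. The Python sketch below is a rough estimate only: it assumes every volume has the same capacity and ignores any per-volume metadata or compression, and the function name is hypothetical.

```python
def volumes_needed(partition_bytes, volume_capacity_bytes):
    """Return how many equal-capacity volumes are needed to hold an image.

    Uses integer ceiling division. Metadata overhead and compression are
    deliberately ignored in this simplified estimate.
    """
    if volume_capacity_bytes <= 0:
        raise ValueError("volume capacity must be positive")
    return -(-partition_bytes // volume_capacity_bytes)


MB = 1024 * 1024
GB = 1024 * MB

# A one GB partition imaged onto 650 MB discs spans two volumes.
discs_for_one_gb = volumes_needed(1 * GB, 650 * MB)
```

Under these assumptions a one GB partition requires two 650 MB discs, and a four GB partition requires seven.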
The present invention fills these needs by providing a method and system for backing up data over a plurality of volumes. It should be appreciated that the present invention can be implemented in numerous ways, including as a process, an apparatus, a system, a device, a method, or a computer readable medium. Several inventive embodiments of the present invention are described below.
In accordance with one aspect of the invention, the present invention provides a method for backing up image data from one or more partitions of a storage device onto one or more backup media. Each backup medium defines a backup volume having a predetermined storage capacity, and each partition has a plurality of sectors. The method includes: (a) reading the sectors of a selected partition of the storage device for backup in the one or more backup volumes, wherein a set of the sectors read from the selected partition defines a data chunk for processing the sectors as data chunks; (b) sequentially storing a set of the data chunks in the order read from the partition in a selected backup volume; (c) generating and storing data chunk descriptors configured to reference the stored data chunks in the volume, one data chunk descriptor per data chunk, the data chunk descriptors being stored in the selected backup volume after storing all of the set of data chunks; and (d) generating and storing address data descriptors configured to reference at least one of the stored data chunks and at least one of the data chunk descriptors in the selected backup volume, the address data descriptors being stored in the selected backup volume after storing the data chunk descriptors.
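The ordering recited in steps (a) through (d) can be illustrated with a minimal in-memory sketch in Python. Everything specific here is an assumption made for illustration and is not taken from the specification: the chunk size of four sectors, the 24-byte record format, the field choices, and the names `write_volume` and `read_descriptors` are all hypothetical. The sketch shows only the order of writing: data chunks first, then one descriptor per chunk, then a trailing address record that locates both.

```python
import struct

SECTOR_SIZE = 512        # bytes per sector (typical value)
SECTORS_PER_CHUNK = 4    # hypothetical chunk size
RECORD = "<QQQ"          # hypothetical record: three unsigned 64-bit fields
RECORD_SIZE = struct.calcsize(RECORD)


def write_volume(sectors):
    """Build one backup volume, append-only, from a list of sector payloads."""
    out = bytearray()
    descriptors = []
    # (a) + (b): group sectors into chunks and store them in the order read.
    for start in range(0, len(sectors), SECTORS_PER_CHUNK):
        chunk = b"".join(sectors[start:start + SECTORS_PER_CHUNK])
        # Remember (first source sector, offset in volume, length in bytes).
        descriptors.append((start, len(out), len(chunk)))
        out += chunk
    # (c): store one descriptor per chunk, after all chunks are written.
    table_offset = len(out)
    for desc in descriptors:
        out += struct.pack(RECORD, *desc)
    # (d): store an address record after the descriptors; it references
    # the first chunk and the start of the descriptor table.
    out += struct.pack(RECORD, 0, table_offset, len(descriptors))
    return bytes(out)


def read_descriptors(volume):
    """Locate the chunk descriptors by reading the trailing address record."""
    _first_chunk, table_offset, count = struct.unpack(
        RECORD, volume[-RECORD_SIZE:]
    )
    return [
        struct.unpack_from(RECORD, volume, table_offset + RECORD_SIZE * i)
        for i in range(count)
    ]


# Ten dummy sectors, each filled with its own sector number.
example_sectors = [bytes([n]) * SECTOR_SIZE for n in range(10)]
example_volume = write_volume(example_sectors)
example_descriptors = read_descriptors(example_volume)
```

Because every record is appended after the data it describes, nothing already written is modified, which is one plausible reason such an ordering would suit write-once media; that reading is an inference rather than a statement from the text above.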
In accordance with another aspect of the present invention, a computer system is provided for backing up data from one or more partitions of a storage device onto one or more backup media. Each partition in the storage device has a plurality of sectors and each backup medium defines a backup volume having a predetermined storage capacity. The system includes a processor coupled to a bus, a random access memory unit coupled to the bus, and a storage device coupled to the bus. The storage device is configured to read a plurality of sectors in a selected partition. A set of sectors defines a data chunk such that the selected partition is processed as one or more data chunks. The computer system also includes means for sequentially storing a set of the data chunks in the order read from the partition in a selected backup volume and means for generating and storing a set of data chunk descriptors for referencing the stored data chunks in the selected backup volume. One data chunk descriptor is provided for each data chunk and the data chunk descriptors are stored in the selected backup volume after storing all of the set of data chunks. In addition, the computer system includes means for generating and storing address data descriptors for referencing at least one of the stored data chunks and at least one of the data chunk descriptors in the selected backup volume. The address data descriptors are stored in the selected backup volume after storing the data chunk descriptors.
In accordance with yet another aspect of the present invention, a method is provided for a computer readable medium. The computer readable medium is adapted to store computer executable instructions for providing data read from a storage device for storage in one or more backup volumes. The storage device has one or more partitions, each of which has a plurality of sectors. The computer executable instructions are suited for: (a) reading a plurality of sectors in a selected partition, wherein a set of sectors defines a data chunk such that the selected partition is read as one or more data chunks; (b) sequentially storing a set of the data chunks in the order read from the partition in a selected backup volume; (c) generating and storing a set of data chunk descriptors for referencing the stored data chunks in the selected backup volume, one data chunk descriptor per data chunk, the data chunk descriptors being stored in the selected backup volume after storing all of the set of data chunks; and (d) generating and storing address data descriptors for referencing at least one of the stored data chunks and at least one of the data chunk descriptors in the selected backup volume, the address data descriptors being stored in the selected backup volume after storing the data chunk descriptors.
The present invention advantageously provides an image backup method that supports spanning over multiple backup volumes. In addition, the backup method is optimized to store the data sectors in the order that they are read from a storage device to reduce seek operations over the backup media. Furthermore, the data sectors can also be accessed in the order that they appear on the original storage device to minimize backup media swapping and seek operations. Other aspects and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.