Random access storage devices, such as magnetic disks, optical disks, flash memory, and the like, are typically subdivided into discrete storage blocks. Each such block has a predefined storage capacity that is smaller than the overall capacity of the storage device.
The storage blocks of a magnetic disk are often referred to as “sectors.” Groups of sectors are sometimes called “clusters.” Data files are recorded on storage devices using these sectors or clusters as elementary building blocks. A data file is accessed by identifying the storage blocks that have been assigned to that data file and reading from, or writing to, the identified storage blocks as necessary. A “file system” is a set of data files stored in a predetermined way on a storage device.
In accessing a specified portion of a file for purposes of reading data from and/or writing data to that file portion, the system must identify the specific physical location of the one or more corresponding storage blocks either directly or indirectly. Signals representing the identified physical locations are generated and then provided, for instance, to a read/write head actuator of a magnetic disk drive.
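As an illustration of the indirect identification described above, the following minimal sketch maps a byte range within a file to the storage blocks that hold it. The function name `blocks_for_range`, the 4096-byte block size, and the representation of a file as an ordered list of block numbers are assumptions made for this example only; they are not taken from any particular file system.

```python
def blocks_for_range(file_blocks, offset, length, block_size=4096):
    """Return the block numbers covering bytes [offset, offset + length).

    file_blocks is the ordered list of physical block numbers assigned
    to the file (a hypothetical representation for this sketch).
    """
    if length <= 0:
        return []
    first = offset // block_size                 # index of first block touched
    last = (offset + length - 1) // block_size   # index of last block touched
    return file_blocks[first:last + 1]


# A file occupying four blocks that are not physically adjacent.
example_blocks = [17, 18, 42, 7]
touched = blocks_for_range(example_blocks, offset=4000, length=200)
```

In this example the requested range straddles the boundary between the first and second blocks of the file, so both block numbers would have to be translated into physical locations for the drive.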
Normally, a predefined portion of the storage device is set aside for storing organizational information, such as file names and hierarchical directory or folder structures. This organizational information identifies the specific blocks that form each data file. It also indicates the logical sequence or other logical organization of those file-forming blocks. The set-aside portion of the storage device is often referred to as the “directory” area.
The set-aside directory area may also contain additional control data structures. Common names for these data structures include file allocation table (FAT), index node (inode), and master file table (MFT). Typically, these control data structures identify which of the allocation units are free from defect such that they may be safely used for data storage. They also indicate which of the defect-free allocation units are presently available for storing new data. Allocation units that store data of an existing file are designated as “not-free” or “used.” Because each allocation unit can store only a limited amount of data, such as 1024 bytes (1K) or 4096 bytes (4K), a file created to store more than one allocation unit's worth of data is subdivided and distributed across a plurality of allocation units.
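The bookkeeping described above can be sketched as a simple table with one state per allocation unit. This is a simplified illustration, not the actual layout of a FAT, inode, or MFT; the state names, the function names `units_needed` and `allocate`, and the first-fit selection policy are all assumptions made for the example.

```python
FREE, USED, BAD = "free", "used", "bad"


def units_needed(file_size, unit_size=4096):
    """Number of allocation units required to hold file_size bytes."""
    return (file_size + unit_size - 1) // unit_size  # integer ceiling


def allocate(table, file_size, unit_size=4096):
    """Mark enough free, defect-free units as used and return their indices."""
    need = units_needed(file_size, unit_size)
    free = [i for i, state in enumerate(table) if state == FREE]
    if len(free) < need:
        raise OSError("not enough free allocation units")
    chosen = free[:need]          # first-fit: take the lowest-numbered free units
    for i in chosen:
        table[i] = USED
    return chosen


# Unit 0 already holds data and unit 2 is defective.
table = [USED, FREE, BAD, FREE, FREE]
assigned = allocate(table, file_size=10000)
```

Here a 10000-byte file needs three 4096-byte units. Because the defective unit is skipped, the file ends up in units that are not all adjacent.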
The directory area may also include a data structure sometimes referred to as a volume catalog. The volume catalog is established on the storage device for associating a file name with the stored data of the file. The volume catalog is also used for storing starting location data, such as track and sector numbers, that point to the physical location on the storage device at which the file starts. The volume catalog also usually stores file size data for indicating the size of the file in terms of the number of bytes that the particular file contains and/or ending location data for indicating where the file ends.
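A volume catalog entry of the kind described above might be modeled as follows. The field names and the dictionary keyed by file name are hypothetical choices for this sketch; real catalogs use device-specific on-disk formats.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    name: str           # file name associated with the stored data
    start_track: int    # starting location: track number
    start_sector: int   # starting location: sector number
    size_bytes: int     # file size in bytes


def add_entry(catalog, entry):
    """Record an entry in the catalog, keyed by file name."""
    catalog[entry.name] = entry


def locate(catalog, name):
    """Return (track, sector, size) for the named file."""
    entry = catalog[name]
    return (entry.start_track, entry.start_sector, entry.size_bytes)


catalog = {}
add_entry(catalog, CatalogEntry("report.txt", start_track=12, start_sector=5, size_bytes=10000))
location = locate(catalog, "report.txt")
```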
When a storage device is relatively new, either at the start of the device's operating life, or just after it has been re-formatted (initialized), the storage device has many regions of logically-contiguous and available storage blocks for receiving and contiguously storing the data of large files. The resulting file structures for an initial set of recordings tend to be logically contiguous or whole.
However, as the storage device begins to fill up, as old files are deleted and newer files are added, and/or as pre-existing files are modified many times over, the stored files on the disk tend to become “fragmented.” The term “fragmented” is used herein to describe the condition in which the data of a specific file is no longer entirely stored along a logically-sequential series of discrete storage blocks. Rather, the data of a fragmented file is scattered about the storage device in a more random, spaced-apart fashion.
When a file is fragmented, it may be necessary to perform multiple seeks between widely spaced-apart sections of the disk (e.g., jumps between nonadjacent tracks) in order to access the file for reading and/or writing. This may disadvantageously increase file access time. Accordingly, file fragmentation is generally undesirable.
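One rough way to quantify fragmentation is to count the runs of consecutive block numbers in a file's block list. The sketch below assumes, for illustration only, that consecutive block numbers are physically adjacent and that each break between runs costs one additional seek; the function name `count_fragments` is hypothetical.

```python
def count_fragments(blocks):
    """Count runs of consecutive block numbers in a file's ordered block list."""
    if not blocks:
        return 0
    fragments = 1
    for prev, cur in zip(blocks, blocks[1:]):
        if cur != prev + 1:   # a gap or backward jump starts a new fragment
            fragments += 1
    return fragments


contiguous = count_fragments([10, 11, 12, 13])
scattered = count_fragments([10, 11, 40, 41, 7])
```

Under these assumptions the first file is stored as a single run, while the second is split into three runs and would require two extra seeks to read from start to finish.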
A variety of events may occur during the operational life of a storage device that work to undesirably destroy or corrupt stored data. One of these damaging events may destroy or corrupt data stored in the directory area of the storage device. When this type of event occurs, it may no longer be possible to access certain files on the storage device. Reconstructing a damaged file system can be particularly difficult where the files are highly fragmented.
Individual data files can also be corrupted or destroyed in a way that does not affect the other files or the directory area. Similarly, files can be “destroyed” by being accidentally deleted or modified (e.g., saving an old version of a document over a new version).
Backing up the file system at regular intervals is the primary method for protecting valuable data from loss due to corruption of the directory area, accidental deletion or modification of files, etc. Traditionally, a user would need to selectively back up individual files from a primary storage device, such as a hard drive, to a backup storage device, such as a tape drive or optical drive. However, if the directory area of the primary storage device was subsequently damaged, the user would be forced to reformat the device and reinstall the operating system (OS) and application programs before being able to restore the backed-up files. This was a time-consuming and laborious process.
Disk imaging programs were later developed that allowed a user to copy an exact image of the entire file system, including the OS and all other software and data, to the backup storage device. In the event of file system corruption, the user could simply restore the image to the primary storage device to return the file system to its pre-imaging state.
Some disk imaging programs are able to copy an image of a file system to another “partition” within the primary storage device. In IBM-compatible personal computers, hard drives may be divided into partitions, each of which is a subdivision of the drive's allocation units that is typically used to store a separate file system. For example, each partition has its own directory area, including a control data structure, volume catalog, etc. Accordingly, the partitions may be treated by the OS as separate logical storage devices.
However, reserving a partition for storing backup images reduces the space available to other partitions for storing programs and data. Moreover, each new partition adds overhead due to the additional file system, further reducing the available storage space.