The invention relates generally to the field of computer systems and more particularly provides a system and method for reconfiguring storage devices of a computer system into logical units of storage space on one or more on-line disk drives, typically while the system is in real-time operation.
A computer system includes an operating system whose primary function is the management of hardware and software resources in the computer system. The operating system handles input/output (I/O) requests from software processes or applications to exchange data with on-line external storage devices in a storage subsystem. The applications address those storage devices in terms of the names of files, which contain the information to be sent to or retrieved from them. A file system, which is a component of the operating system, translates the file names into logical addresses in the storage subsystem. The file system forwards the I/O requests to an I/O subsystem, which, in turn, converts the logical addresses into physical locations in the storage devices and commands the latter devices to engage in the requested storage or retrieval operations.
The on-line storage devices on a computer are configured from one or more disks into logical units of storage space referred to herein as "containers." Examples of containers include volume sets, stripe sets, mirror sets, and various Redundant Array of Independent Disk (RAID) implementations. A volume set comprises one or more physical partitions, i.e., collections of blocks of contiguous space on disks, and is composed of space on one or more disks. Data is stored in a volume set by filling all of the volume's partitions in one disk drive before using volume partitions in another disk drive. A stripe set is a series of partitions on multiple disks, one partition per disk, that is combined into a single logical volume. Data stored in a stripe set is evenly distributed among the disk drives in the stripe set. In its basic configuration, a stripe set is also known as a "RAID 0" configuration. A mirror set is composed of volumes on multiple disks, whereby a volume on one disk is a duplicate copy of an equal-sized volume on another disk in order to provide data redundancy. A basic configuration for a mirror set is known as "RAID 1." There is often a desire to increase data reliability in a stripe set by using parity distributed across storage blocks with respect to each stripe. Where such parity is provided to the stripe set, the configuration is known as "RAID 5." In an even more complex implementation, where stripe sets are mirrored on a plurality of containers and parity is distributed across the stripes, the resulting configuration is known as "RAID 10." Generally speaking, all configurations of the RAID implementation (RAID 0-10) provide a collection of partitions, where each partition is composed of space from one disk in order to support data redundancy.
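The difference between the volume-set and stripe-set layouts described above can be illustrated with a minimal Python sketch. The function names, the `partition_sizes` list, and the `stripe_blocks` parameter are hypothetical and are not taken from the referenced systems; the sketch only assumes that a logical block number is mapped to a partition (or disk) index and a block offset within it.

```python
def volume_set_locate(block, partition_sizes):
    """Volume set: fill each partition completely before using the next one.

    Returns (partition index, block offset inside that partition).
    """
    for index, size in enumerate(partition_sizes):
        if block < size:
            return index, block
        block -= size
    raise IndexError("block beyond end of volume set")


def stripe_set_locate(block, num_disks, stripe_blocks=1):
    """Stripe set (RAID 0): consecutive stripes rotate across the disks.

    Returns (disk index, block offset inside that disk's partition).
    """
    stripe, offset = divmod(block, stripe_blocks)
    row, disk = divmod(stripe, num_disks)
    return disk, row * stripe_blocks + offset
```

For example, with two partitions of four blocks each, logical block 5 of a volume set falls in the second partition, whereas in a two-disk stripe set with one-block stripes the same logical block lands on the second disk at offset 2.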
According to a prior system, the I/O subsystem configures the containers through a software entity called a "container manager." Essentially the container manager sets up a mapping structure to efficiently map logical addresses received from the file system to physical addresses on storage devices. The I/O subsystem also includes a software driver for each type of container configuration on the system. These drivers use the mapping structure to derive the physical addresses, which they then pass to the prospective storage devices for storage and retrieval operations.
Specifically, when the computer system is initially organized, the I/O subsystem's container manager configures the containers and maintains the configuration tables in a container layer of the I/O subsystem. In accordance with a co-pending related U.S. Pat. No. 6,219,693, issued on Apr. 17, 2001, entitled File Array Storage Architecture by Richard Napolitano et al., the container layer of the I/O subsystem comprises a Device Switch Table, a Container Array, and a Partition Table. The teachings of this application are expressly incorporated herein by reference. The Device Switch Table consists of entries, each of which ordinarily points to the entry point of a container driver that performs I/O operations on a particular type of container. The Container Array is a table of entries, each of which ordinarily points to data structures used by a container driver. There is a fixed one-to-one relationship between the Device Switch Table and the Container Array. The Partition Table contains partition structures copied from disk drives for each container on the system. Each Partition Table entry points to one physical disk drive and allows the container driver to access physical locations in the on-line storage devices.
When a software process issues an I/O request, the file system accepts the file-oriented I/O request and translates it into an I/O request bound for a particular device. The file system sends the I/O request which includes, inter alia, a block number for the first block of data requested by the application and also a pointer to a Device Switch Table entry which points to a container driver for the container where the requested data is stored. The container driver accesses the Container Array entry for pointers to the data structures used in that container and to Partition Table entries for that container. Based on the information in the data structures, the container driver also accesses Partition Table entries to obtain the starting physical locations of the container on the storage devices. Based on the structures pointed to by the Container Array entry and partition structures in the Partition Table, the container driver sends the I/O request to the appropriate disk drivers for access to the disk drives.
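The lookup path just described may be sketched as follows. This is an illustrative model only: the class names, field names, and the `volume_driver` and `submit_io` functions are assumptions introduced for the example, and the actual layout of the Device Switch Table, Container Array, and Partition Table in the referenced system is not specified here.

```python
from dataclasses import dataclass


@dataclass
class PartitionEntry:
    disk: str         # physical disk drive holding this partition
    start_block: int  # starting physical block of the partition on that disk
    num_blocks: int   # length of the partition in blocks


@dataclass
class ContainerEntry:
    partition_ids: list  # indices into the Partition Table


def volume_driver(container, partition_table, block):
    """Hypothetical container driver: map a container block to (disk, physical block)."""
    for pid in container.partition_ids:
        part = partition_table[pid]
        if block < part.num_blocks:
            return part.disk, part.start_block + block
        block -= part.num_blocks
    raise IndexError("block outside container")


partition_table = [PartitionEntry("disk0", 100, 50), PartitionEntry("disk1", 0, 50)]
container_array = [ContainerEntry([0, 1])]
# One-to-one with container_array: entry i is the driver for container i.
device_switch_table = [volume_driver]


def submit_io(container_id, block):
    """Follow the Device Switch Table entry to the driver, then resolve the block."""
    driver = device_switch_table[container_id]
    return driver(container_array[container_id], partition_table, block)
```

Here `submit_io(0, 60)` resolves through the second Partition Table entry, since the first partition holds only blocks 0 through 49 of the container.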
In prior systems, the containers are configured during the initial computer setup and cannot be reconfigured during I/O processing without corrupting currently processing I/O requests. As storage needs on a computer system change, the system administrators may need to reconfigure containers to add disks to them or remove disks from them, partition disk drives to form new containers, and/or increase the size of existing containers. If containers are reconfigured during I/O processing in the I/O subsystem, the reconfiguration may corrupt or erase the currently processing I/O requests. However, shutting down the system to reconfigure containers may be unacceptable for businesses that require high availability, i.e., twenty-four hours/seven days a week on-line activity.
One aspect of the system described herein is to provide a method of routing processing I/O requests in the I/O subsystem to a different container than previously pointed to by the file system. On-line storage devices are configured from one or more disks into logical units of storage space referred to herein as "containers." Containers are created and maintained by a software entity called the "container manager." Each type of container on the system has an associated driver, which processes system requests on that type of container. After a complete backup operation, the backup program verifies the backed up files to make sure that the files on the secondary storage device (usually a tape) were correctly backed up. One problem with the backup process is that files may change during the backup operation.
To avoid backing up files modified during the backup process and to enable applications to access files during the backup operation, the container manager periodically (e.g. once a day) performs a procedure that takes a "snapshot" or copy of each read-write container, whereby the container manager creates a read-only container which looks like a copy of the data in the read-write container at a particular instant in time. Thereafter, the container manager performs a "copy-on-write" procedure where an unmodified copy of data in the read-write container is copied to a read-only backup container every time there is a request to modify data in the read-write container. The container manager uses the copy-on-write method to maintain the snapshot and to enable backup processes to access and back up an unchanging, read-only copy of the on-line data at the instant the snapshot was created. This procedure is described in detail in related co-pending U.S. Pat. No. 6,061,770, issued on May 9, 2000, entitled Copy-on-Write with Compaction by Chris Franklin, the teachings of which are also expressly incorporated herein by reference.
During the backup procedure, the container manager creates a "snapshot" container, a "snapshotted" container and a "backing store" container. After the container manager takes the snapshot, the snapshotted container driver processes all input/output (I/O) requests to store data in or retrieve data from a read-write container. The snapshotted container driver processes all I/O requests to retrieve data from the read-write container by forwarding them directly to the read-write container driver. However, for all I/O requests to modify data in a read-write container, the container manager first determines whether the requested block of data has been modified since the time of the snapshot. If the block has not been modified, the container manager copies the data to the backing store container and then sets an associated bit-map flag in a modified-bit-map table. The modified-bit-map table contains a bit-map with each bit representing one block of data in the read-write container. After setting the modified-bit-map flag, the snapshotted container driver forwards the I/O storage request to the read-write container driver.
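The copy-on-write write path described above can be modeled with a short Python sketch. The class and attribute names are hypothetical, blocks are represented as list elements rather than disk sectors, and a dictionary stands in for the backing store container; the sketch shows only the ordering of the steps (preserve the original block, set the flag, then forward the write).

```python
class SnapshottedContainer:
    """Toy model of the snapshotted container driver's copy-on-write behavior."""

    def __init__(self, read_write_blocks):
        self.read_write = read_write_blocks                 # live read-write data
        self.backing_store = {}                             # block number -> preserved data
        self.modified = [False] * len(read_write_blocks)    # modified-bit-map table

    def read(self, block):
        # Retrieval requests go straight to the read-write container.
        return self.read_write[block]

    def write(self, block, data):
        # First modification since the snapshot: preserve the original block.
        if not self.modified[block]:
            self.backing_store[block] = self.read_write[block]
            self.modified[block] = True
        # Then forward the storage request to the read-write container.
        self.read_write[block] = data
```

Note that a second write to the same block does not touch the backing store again, so the backing store always holds the block contents as they were at the instant of the snapshot.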
When the backup process begins execution, it invokes I/O retrieval requests from the snapshot container. A file system, which is a component of the operating system, translates the file-oriented I/O request into a logical address and forwards the request to a snapshot container driver. The snapshot container driver checks the associated flag in the modified-bit-map table for the requested block of data. If the flag is set, the snapshot container driver forwards the request to the backing store container driver to retrieve the unmodified copy of that block from the backing store container. The backing store container driver then processes the backup process retrieval request. If the flag is not set, this means that the block has not been modified since the snapshot was created. The snapshot container driver forwards the request to the read-write container driver to retrieve a copy of that block of data from the read-write container. Upon retrieving the file from the backing store container or the read-write container, the backup process backs it up. After a complete backup operation, the container manager deletes the snapshotted container, the snapshot container, the backing store container, and the modified-bit-map table, and thereafter, forwards all I/O requests directly to the read-write container driver.
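The snapshot read path described above reduces to a single test per block. The following Python sketch is illustrative only; the function name and the use of a list and a dictionary for the read-write and backing store containers are assumptions made for the example.

```python
def snapshot_read(block, modified, backing_store, read_write):
    """Return the block contents as they were when the snapshot was taken.

    modified      -- list of booleans, one per block (modified-bit-map table)
    backing_store -- dict mapping block number to the preserved original data
    read_write    -- list holding the current read-write container data
    """
    if modified[block]:
        # Block changed after the snapshot: the original is in the backing store.
        return backing_store[block]
    # Block untouched since the snapshot: the read-write copy is still valid.
    return read_write[block]


# Block 0 was overwritten after the snapshot; block 1 was not.
read_write = ["new0", "old1"]
modified = [True, False]
backing_store = {0: "old0"}
snapshot_view = [snapshot_read(b, modified, backing_store, read_write) for b in range(2)]
```

In this example the backup process sees the pre-modification contents of both blocks even though block 0 has since changed in the read-write container.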
Many computer systems currently employ the popular Windows® NT operating system, available from Microsoft of Redmond, Wash., as the framework for the running of resident applications and handling files. The particular file system associated with the NT operating system is termed the NT File System, or NTFS. NTFS, in its current version, is designed to work in conjunction with a backup facility that is generally configured to write back to the original read-write storage disk during a backup. In doing so, it employs a write function to the disk for purposes of, for example, marking and/or archive bit handling. The above-noted archive bits are a specific piece of data that is typically written by a backup facility to a storage disk. The archive bit associated with each file (data, text, etc.) is set and then cleared (e.g. the file is "recorded") to indicate that a backup operation has, in fact, occurred. The archive bit process is inherent to various file systems, but particularly to the NT operating system and associated NTFS.
In general, there are at least three levels of backup that may be performed. The most time-consuming and comprehensive backup is known as a Full Backup, in which every file within a given storage medium is backed-up. In a Full Backup, each file is, likewise, recorded by setting an associated archive bit.
An intermediate level of backup is termed Incremental Backup, in which files that have undergone changes within a certain period (typically, since the last Full Backup) are again backed-up and recorded.
A minimal level of backup is termed Differential Backup, in which file changes are backed-up, but no recording of the backed-up files is made during the process.
A file is initially stored with its archive bit set. When a backup occurs, the archive bit is cleared by the system, indicating that the file has been backed-up (e.g. recorded). This does not occur with a differential backup.
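The three backup levels, as defined above, differ in which files are selected and in whether the archive bit is cleared afterward. The Python sketch below uses those definitions; the function name `run_backup`, the level strings, and the representation of files as a dictionary of archive bits are assumptions for illustration, and a set archive bit is taken here as the indication that a file has changed.

```python
def run_backup(files, level):
    """Select files for backup and record them according to the backup level.

    files -- dict mapping file name to archive bit (True = set, not yet recorded)
    level -- "full", "incremental", or "differential"
    Returns the list of file names that were backed up.
    """
    if level == "full":
        selected = list(files)                               # every file
    else:
        selected = [name for name, bit in files.items() if bit]  # changed files only
    if level in ("full", "incremental"):
        for name in selected:
            files[name] = False                              # record: clear the archive bit
    return selected


files = {"a.txt": True, "b.txt": False}
differential = run_backup(files, "differential")   # a.txt copied, bit left set
bit_after_differential = files["a.txt"]
incremental = run_backup(files, "incremental")     # a.txt copied, bit cleared
```

Because the differential pass leaves the archive bit set, the same file is selected again by the following incremental pass, which then records it.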
There is a significant disadvantage to conventional snapshot arrangements that generate a read-only snapshot container, when operating in an NT environment. Simply stated, the NTFS will not accept a disk container to which it cannot write (e.g. the read-only snapshot is unacceptable). Rather than performing the desired backup function, the NTFS, when accessing a read-only snapshot, returns an incompatible disk error message. In other words, any attempt to write to a read-only snapshot to change the archive bit setting is rejected. This makes incremental backups between full backups (where recording of changed files is desired) unavailable. The user must undertake time-consuming full backups at certain intervals, and perform (unrecorded) differential backups therebetween. However, there is no certainty of which files have and have not been backed up between full backups according to this approach due to the lack of reliable recording.
One technique for performing incremental backup operations in the presence of a read-only snapshot is disclosed in co-pending U.S. Pat. No. 6,101,585, issued on Aug. 8, 2000, entitled Mechanism for Incremental Backup of On-Line Files, by Randall Brown et al., the contents of which are expressly incorporated herein by reference. This technique entails the modification of the file system, which may not be practical in all circumstances.
However, a technique for creating a read-write backup without modifying the file system is disclosed in co-pending U.S. Pat. No. 6,341,341, published on Jan. 22, 2002, entitled System and Method for Disk Control with Snapshot Feature Including Read-Write Snapshot Half, by Chris Franklin et al., the teachings of which are expressly incorporated herein by reference. The described arrangement particularly enables a snapshot backup to operate in an NTFS, or similar file system, environment by establishing a read-write snapshot container. Appropriate mapping and storage is employed within the snapshot drive/container arrangement to ensure that both original read-write container information and the write data provided by the file system to the snapshot are properly mapped and maintained. This approach therefore potentially enables the snapshot to be written to by the file system so as to manipulate archive bit data within the snapshot and other containers associated therewith.
Accordingly, it is an object of this invention to enable the manipulation of archive bits associated with files that are backed-up in a snapshot backup arrangement of disk storage containers between full backups of the files therein. This system enables the status of files within a snapshotted container to be more accurately ascertained (e.g. changed or unchanged).
This invention overcomes the disadvantages of the prior art by providing a system and method for updating file archive bits in a data storage arrangement that performs snapshot backup operations. The snapshot container is a read-write container that can receive archive bit backup write data from the file system and associated file system. Snapshot container files, in which archive bits have been cleared, indicating a backup, are checked. These files' counterparts in the snapshotted container are located. Where the snapshotted files have had archive bits cleared, they are passed over. Where snapshotted files have set archive bits, the file data parameters for respective snapshot and snapshotted files are compared. If the file data parameters therebetween are the same, then the respective snapshotted file archive bit is cleared, confirming backup status. Where the file data parameters differ, the set archive bit for the snapshotted file is retained, indicating an un-backed-up current version in the snapshotted container.
The file data parameters can include file size and last access data for the file. Typically, the archive bit state does not alter the size or access data. Rather, a user application alters these parameters, and such an alteration is indicative of a change that may necessitate backup.
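The archive bit update procedure summarized above may be sketched as follows. This is a simplified model under stated assumptions: the `FileInfo` class, its field names, and the `update_archive_bits` function are hypothetical, each container is represented as a dictionary keyed by file path, and the compared file data parameters are limited to size and a single last-access value.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    size: int
    last_access: float
    archive: bool  # True = archive bit set (file not recorded as backed up)


def update_archive_bits(snapshot, snapshotted):
    """Propagate cleared archive bits from the snapshot to the snapshotted container.

    Both arguments map file path -> FileInfo. Returns the paths whose
    snapshotted archive bit was cleared.
    """
    cleared = []
    for path, snap in snapshot.items():
        if snap.archive:
            continue                      # not backed up in the snapshot; nothing to confirm
        live = snapshotted.get(path)
        if live is None or not live.archive:
            continue                      # no counterpart, or bit already cleared: pass over
        if (live.size, live.last_access) == (snap.size, snap.last_access):
            live.archive = False          # unchanged since snapshot: confirm backup status
            cleared.append(path)
        # Otherwise the file changed after the snapshot; its set bit is retained.
    return cleared
```

A file whose size or last-access value differs between the two containers keeps its set archive bit, so a later incremental or full backup will still pick up the current version.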
A "switch" can be provided in the user interface to the snapshot backup routine that enables the update of archive bits. Once archive bits have been updated, the snapshot can be closed and further incremental or full backups can be undertaken with assurance that all snapshotted files (now returned to the original read-write container) carry the proper archive bit status.