The present invention is generally directed to a method and system for backing up file systems. More particularly, the present invention is especially useful in distributed or parallel data processing systems, since its structure makes it possible to partition the backup process into a plurality of independent units. Even more particularly, the present invention is well suited to use with Storage Area Networks in distributed or parallel data processing systems.
Advances in disk storage have created the capability of managing huge amounts of data and large numbers of files within a single file system. This creates a problem in producing normal backup copies of files in the network because of the difficulties associated with moving sufficient amounts of data and also because of the difficulty associated with identifying which files are to be backed up.
The traditional technique for backing up files involves running a backup application in either full mode or incremental mode. A full backup (that is, a backup running in full mode, also known as a base backup) backs up the entire file system to a single data sink by reading the entire name tree (see below for a more detailed discussion of the terms “name tree” and “name space”) and by transferring a copy of each file. An incremental backup transfers a new copy of any file which has been created or changed, and also makes note of files which have been deleted. Backup copies of deleted files are eventually deleted according to some policy mechanism (for example, retain the backup copy for one month).
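The distinction between the two modes can be illustrated with a minimal sketch. It assumes a hypothetical catalog that maps each file path to the modification time recorded when the file was last examined; the function names `plan_full` and `plan_incremental` and the catalog layout are illustrative assumptions only and do not describe any particular backup application.

```python
from typing import Dict, List, Tuple


def plan_full(current: Dict[str, float]) -> List[str]:
    """A full (base) backup copies every file found in the name tree."""
    return sorted(current)


def plan_incremental(
    previous: Dict[str, float], current: Dict[str, float]
) -> Tuple[List[str], List[str]]:
    """Compare two {path: mtime} catalogs (hypothetical layout).

    Returns (to_copy, deleted): files created or changed since the
    previous backup, and files that no longer exist and should be noted
    as deleted.
    """
    to_copy = sorted(
        path
        for path, mtime in current.items()
        if path not in previous or mtime > previous[path]
    )
    deleted = sorted(path for path in previous if path not in current)
    return to_copy, deleted
```

Note that both functions still need an entry for every file in the current catalog, which is why building that catalog quickly matters.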
Two problems in the situations described above are addressed by the present invention: (1) the serial nature of backup applications, which arises from the serial nature of the data and file transfer and which unduly restricts data rates that would otherwise be possible; and (2) the lack of a capability to rapidly determine which files actually require backing up. Existing techniques for file backup operations typically read the entire name space in the file system hierarchy and extract some file information about each file. This requires that a file system call be executed on every file in the file system. (In data processing systems following POSIX file system standards, this is the “stat()” call.) Since these calls require information stored on the disk and are done in file name order, they typically result in disk operations having a time “cost” of several milliseconds (ms) each. For example, a file system with 100 million files and a disk capable of reading the file information in 5 ms would require approximately 139 hours to examine each file. Techniques have existed for backup by “inode” (see below for a description of this term, which is widely employed to describe certain file system structures) since the early days of the development of the UNIX® operating system (UNIX is a registered trademark of The Open Group), but these techniques suffer from the problem that the file is identified by its inode number, which is not a human usable identifier, as opposed to the file name itself which is, in general, recognizable by human file system users.
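The cost of examining every file can be checked with a short calculation. The sketch below assumes exactly one disk operation per file at a fixed latency and no caching or overlap; the function name `scan_hours` is an illustrative assumption.

```python
def scan_hours(num_files: int, ms_per_file: float) -> float:
    """Hours needed to examine every file serially.

    Assumes one disk operation per file at a fixed latency, with no
    caching and no overlap between operations.
    """
    seconds = num_files * ms_per_file / 1000.0
    return seconds / 3600.0


# 100 million files at 5 ms each is 500,000 seconds.
print(f"{scan_hours(100_000_000, 5.0):.1f} hours")  # prints "138.9 hours"
```

At this rate a serial scan alone takes more than five days, before any file data is transferred.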
The only other known solution to these file backup problems exists in file systems which are based on continuous journaling of files that have been changed. This solution invokes a program exit every time a file is modified, deleted or renamed, which results in the creation of some form of log that represents the files which need to be backed up. It has the advantage that all required information is immediately available at backup time, but it has the cost disadvantage of continually appending information to the log. Furthermore, the appended record may be redundant for files modified more than once, a situation that happens very frequently.
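The redundancy in such a log can be seen in a small sketch. The record format used here, a pair of operation name and path, and the function name `files_to_back_up` are hypothetical and chosen only for illustration; rename handling is omitted for brevity.

```python
from typing import Dict, Iterable, List, Tuple


def files_to_back_up(log: Iterable[Tuple[str, str]]) -> List[str]:
    """Collapse a change log of (operation, path) records.

    Only the last record for each path matters at backup time, so every
    earlier record for the same path was appended for nothing.
    Operations are assumed to be "modify" or "delete".
    """
    latest: Dict[str, str] = {}
    for operation, path in log:
        latest[path] = operation  # a later record supersedes earlier ones
    return sorted(path for path, op in latest.items() if op != "delete")


log = [
    ("modify", "/a"),
    ("modify", "/a"),  # redundant: /a was already marked as changed
    ("modify", "/b"),
    ("delete", "/b"),
    ("modify", "/c"),
]
print(files_to_back_up(log))  # prints ['/a', '/c']
```

Five records were appended, yet only two files need to be backed up.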
For a better understanding of the environment in which the present invention is employed, the following terms are employed in the art to refer to generally well understood concepts. The definitions provided below are supplied for convenience and for improved understanding of the problems involved and the solution proposed and are not intended as implying variations from generally understood meanings, as appreciated by those skilled in the file system arts. Since the present invention is closely involved with the concepts surrounding files and file systems, it is useful to provide the reader with a brief description of at least some of the more pertinent terms. A more complete list is found in U.S. Pat. No. 6,032,216 which is assigned to the same assignee as the present invention. This patent is hereby incorporated herein by reference. The following glossary of terms from this patent is provided below since these terms are the ones that are most relevant for an easier understanding of the present invention:
Data/File System Data: These are arbitrary strings of bits which have meaning only in the context of a specific application.
File: A named string of bits which can be accessed by a computer application. A file has certain standard attributes such as length, a modification time and a time of last access.
Metadata: These are the control structures created by the file system software to describe the structure of a file and the use of the disks which contain the file system. Specific types of metadata which apply to file systems of this type are more particularly characterized below and include directories, inodes, allocation maps and logs.
Directories: These are control structures which associate a name with a set of data represented by an inode.
Inode: A data structure which contains the attributes of the file plus a series of pointers to areas of disk (or other storage media) which contain the data which make up the file. An inode may be supplemented by indirect blocks which provide additional pointers, say, if the file is large.
Allocation maps: These are control structures which indicate whether specific areas of the disk (or other control structures such as inodes) are in use or are available. This allows software to effectively assign available blocks and inodes to new files. This term is useful for a general understanding of file system operation, but is only peripherally involved with the operation of the present invention.
Logs: These are a set of records used to keep the other types of metadata in synchronization (that is, in consistent states) to guard against loss in failure situations. Logs contain single records which describe related updates to multiple structures. This term is also only peripherally useful, but is provided in the context of alternate solutions as described above.
File system: A software component which manages a defined set of disks (or other media) and provides access to data in ways that facilitate consistent addition, modification and deletion of data and data files. The term is also used to describe the set of data and metadata contained within a specific set of disks (or other media). While the present invention is most frequently used in conjunction with rotating magnetic disk storage systems, it is usable with any data storage medium which is capable of being accessed by name with data located in nonadjacent blocks; accordingly, where the terms “disk” or “disk storage” or the like are employed herein, this more general characterization of the storage medium is intended.
Snapshot: A file or set of files that capture the state of the file system at a given point in time.
Metadata controller: A node or processor in a networked computer system (such as the pSeries of scalable parallel systems offered by the assignee of the present invention) through which all access requests to a file are processed. This term is provided for completeness, but is not relevant to an understanding of the operation of the present invention.
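The relationship among directories, inodes and allocation maps in the glossary above can be sketched with toy data structures. Everything in this sketch is an illustrative assumption: the class names `Inode` and `ToyFileSystem`, the flat single-directory layout, and the measurement of file length in blocks are simplifications, and no real file system is organized this simply.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Inode:
    """File attributes plus pointers to the blocks holding the data."""
    length_blocks: int = 0
    mtime: float = 0.0
    atime: float = 0.0
    block_pointers: List[int] = field(default_factory=list)


class ToyFileSystem:
    def __init__(self, num_blocks: int) -> None:
        # Allocation map: one in-use flag per disk block.
        self.block_in_use: List[bool] = [False] * num_blocks
        # Inode table, indexed by inode number.
        self.inodes: Dict[int, Inode] = {}
        # Directory: associates a human-readable name with an inode number.
        self.directory: Dict[str, int] = {}
        self._next_inode = 1

    def create(self, name: str, blocks_needed: int, mtime: float) -> int:
        """Allocate blocks and an inode, then enter the name in the directory."""
        free = [i for i, used in enumerate(self.block_in_use) if not used]
        if len(free) < blocks_needed:
            raise OSError("no space left on toy device")
        chosen = free[:blocks_needed]
        for block in chosen:
            self.block_in_use[block] = True
        inode_number = self._next_inode
        self._next_inode += 1
        self.inodes[inode_number] = Inode(
            length_blocks=blocks_needed,
            mtime=mtime,
            atime=mtime,
            block_pointers=chosen,
        )
        self.directory[name] = inode_number
        return inode_number
```

In this model the inode table can be scanned directly by inode number without consulting the directory, but only the directory can translate an inode number back into a name that a user would recognize.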