In a distributed computing system, different computers, operating systems, and networks interact as if they were all part of a single system. The file system has a single set of global file names. A particular machine in the system need not know where a file is physically located. Instead, the file may be accessed from anywhere in the network using its global file name. Global file names are part of the shared name space that devices within the distributed file system may access. One such distributed file system is the Andrew File System (AFS), available through Transarc Corporation ("Transarc"). An AFS server maps the directory name of a file to its location, making the file space location independent. With location independence, a user at a workstation linked to the network need only know the global file name, which includes the path name, and not the physical location where the file resides.
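By way of illustration only, the mapping from a global file name to a physical location may be sketched as a prefix lookup in a table. The sketch below is not the AFS implementation; the names `Location`, `NameSpace`, `mount`, and `resolve`, as well as the server and volume identifiers, are hypothetical and chosen solely for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    server: str  # hypothetical file server host name
    volume: str  # hypothetical volume identifier on that server


class NameSpace:
    """Toy location table mapping global path prefixes to physical locations."""

    def __init__(self):
        self._mounts = {}

    def mount(self, prefix, location):
        # Record (or replace) the physical location that serves this prefix.
        self._mounts[prefix.rstrip("/")] = location

    def resolve(self, global_name):
        """Return (location, path inside the volume) for the longest matching prefix."""
        parts = global_name.rstrip("/").split("/")
        for end in range(len(parts), 0, -1):
            prefix = "/".join(parts[:end])
            if prefix in self._mounts:
                return self._mounts[prefix], "/".join(parts[end:])
        raise FileNotFoundError(global_name)


if __name__ == "__main__":
    ns = NameSpace()
    ns.mount("/afs/example.com/user", Location("fs1", "user.vol"))
    print(ns.resolve("/afs/example.com/user/bob/report.txt"))
    # Moving the volume changes only the table entry, not the global name.
    ns.mount("/afs/example.com/user", Location("fs2", "user.vol"))
    print(ns.resolve("/afs/example.com/user/bob/report.txt"))
```

In this sketch the user supplies only the global file name; relocating the data to another server changes the table entry and leaves the name unchanged, which is the sense in which the file space is location independent.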
Another distributed file system is the Distributed File System (DFS), available from Transarc and International Business Machines Corp. (IBM), which is a component of the Distributed Computing Environment (DCE) standard promulgated by the Open Software Foundation (OSF). IBM is the assignee of the subject patent application. The DFS and AFS systems allow users to access data throughout the network. Any changes made by one user to a file are available to all users. The DFS and AFS systems include security services that provide authentication to limit access to authorized users.
The AFS system offered by Transarc includes a backup program called "butc" (Backup Tape Coordinator). Butc is a volume backup system used to dump volume images to tape devices attached to the file server. However, the minimum backup unit for the butc program is a volume; the butc program does not support file-level backup and recovery.
Hierarchical storage management programs, such as the IBM Adstar Distributed Storage Management (ADSM) product, provide backup/archive support and migrate less frequently used files to backup storage to free space. The ADSM server provides hierarchical storage management, backing files up on tape drives, optical disks, and other storage media. The ADSM backup feature saves copies of files from a client computer to a storage space managed by an ADSM server. Thus, data at a client computer running an ADSM client is protected in the event of data loss due to a hardware or software failure, accidental deletion, and/or logical corruption. With the ADSM program, clients can back up volumes, directories, subdirectories, or files. ADSM allows incremental backup of only those files that have changed. In this way, ADSM avoids the need for a full dump, as only the modified files are backed up. This incremental backup reduces network utilization and traffic. The IBM ADSM product is described in "ADSM Version 2 Presentation Guide," (IBM Document SG24-4532-00, International Business Machines, copyright 1995), which publication is incorporated herein by reference in its entirety.
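The general idea of incremental backup may be illustrated with a minimal sketch that compares the current state of each file against a record kept from the previous backup and selects only new or changed files. This is not the ADSM selection logic, which is not described here; the use of modification time and size as the change indicator and the function names `snapshot` and `select_changed` are assumptions made for this example.

```python
import os


def snapshot(root):
    """Record (modification time in ns, size) for every regular file under root."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            state[path] = (info.st_mtime_ns, info.st_size)
    return state


def select_changed(current, last_backup):
    """Return the sorted paths that are new or whose recorded state differs.

    Both arguments map a path to a (mtime_ns, size) pair, as produced by
    snapshot(). Files unchanged since the last backup are skipped.
    """
    return sorted(
        path for path, stamp in current.items()
        if last_backup.get(path) != stamp
    )
```

Under these assumptions, a client would send only the files returned by `select_changed` and then store the new snapshot as the record for the next run, so that unchanged files generate no backup traffic.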
IBM has combined the ADSM product with AFS and DFS file servers to provide backup support for these products. An AFS or DFS server would include an ADSM client to transfer files to an ADSM server, which then backs up the files in a storage device managed by the ADSM server. One problem with using such backup software in a distributed file system is that the client managing backup operations, such as the ADSM client, must read a file to be backed up. This reading operation consumes network resources. The ADSM client must then consume network resources again by transferring the file it has read from the file server to the ADSM server. Network traffic is further increased if the ADSM client is on a separate machine from the AFS/DFS server. The IBM publications entitled "ADSM AFS/DFS Backup Clients Version 2.1" (IBM Document SH26-4048-00, International Business Machines, copyright 1996) and "ADSM Concepts" (IBM Document SG24-4877-00, International Business Machines, copyright 1997) describe the use of the ADSM software in an AFS/DFS distributed file system. These publications are incorporated herein by reference in their entirety.
Network traffic can be significantly increased if the AFS/DFS server and backup server are in one physical location, e.g., San Jose, Calif., and the AFS/DFS client and backup client requesting to back up a file in the AFS/DFS server are in a distant geographical location, e.g., Tucson, Ariz. If a user in Tucson wanted to back up a file that resided in the global name space managed by the AFS/DFS server in San Jose, the prior art client/server protocol would have the AFS/DFS client in Tucson read the file, which requires transmittal of the file from San Jose to Tucson over the network, and then send the file back to the backup server in San Jose for backup storage. Such network traffic problems are exacerbated when the client requesting the backup is separated by a long geographical distance from the server.
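The cost of this round trip may be made concrete with a simple count of the bytes that cross between sites. The sketch below models each transfer as a (source site, destination site) pair and assumes the whole file is sent on every hop; the function name `cross_site_bytes` and the one-gigabyte file size are hypothetical, used only for illustration.

```python
def cross_site_bytes(transfers, file_size):
    """Sum the bytes carried by transfers whose endpoints are at different sites."""
    return sum(file_size for src, dst in transfers if src != dst)


# Prior art path described above: the remote client reads the file from the
# file server, then sends it back to the backup server.
REMOTE_CLIENT_PATH = [("San Jose", "Tucson"), ("Tucson", "San Jose")]

# For comparison: the same two transfers when the client is at the server site.
LOCAL_CLIENT_PATH = [("San Jose", "San Jose"), ("San Jose", "San Jose")]

if __name__ == "__main__":
    size = 1_000_000_000  # hypothetical 1 GB file
    print(cross_site_bytes(REMOTE_CLIENT_PATH, size))
    print(cross_site_bytes(LOCAL_CLIENT_PATH, size))
```

Under these assumptions the file crosses the long-distance link twice when the requesting client is remote, so the inter-site traffic is twice the file size, even though the file begins and ends at the same site.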