The present invention relates generally to computer data storage, and more particularly, to systems for backing up and restoring data.
There are a variety of systems for backing up computer data that can be subsequently restored. In general, such systems are used to backup data from a plurality of computers or clients connected to one or more networks. A user, such as a system administrator, can restore selected portions of the previously backed up data to a desired client computer. Thus, the loss of data, which can be contained in large databases, is prevented. To increase the backup system reliability, redundant data storage, e.g., data mirroring, can be provided.
FIG. 1 illustrates a conventional data backup system 10 that can be similar to the system shown and described in U.S. Pat. No. 6,047,294 to Deshayes et al, which is incorporated herein by reference. A computer or client 12 communicates with a storage system 14, such as a Symmetrix storage system provided by EMC Corporation of Hopkinton, Mass. The client 12 can be connected to other devices over a network 16 in a manner well known to one of ordinary skill in the art. A backup storage system 18, which can include tape storage devices for example, is also attached to the network 16 for backing up and restoring data. The system can include a direct connection 20, e.g., a small computer system interface (SCSI), between the backup storage system 18 and the storage system 14 associated with the client 12. Such a system can provide logical-logical backups to facilitate the backup of data when the client and backup storage system have different operating systems.
Known backup storage systems have certain disadvantages associated with restoring backed up data. Generally, a backup storage system provides a list of objects, e.g., files, that can be restored by a user. This is usually provided through a hierarchical representation allowing the user to xe2x80x9cbrowsexe2x80x9d the backed up files for the purpose of selection. The user then selects or marks the desired objects for restoration. The restoration request is then submitted and executed to restore the requested data to the client. The browse, mark, and submit functions are each implemented via separate software and linked together.
One problem with this approach is that a user must serially browse, mark, submit, and wait until execution of the restore operation is complete since the software modules that perform these functions are linked together. It will be appreciated that the time needed to restore relatively large amounts of data, e.g., one Terabyte, can be quite significant. In addition, the restore operation is tied to the catalogs of backed up data, i.e., file information and attributes. Thus, to support new methods of backing up data or new storage devices, the restore system modules must be modified, re-linked, and distributed to existing backup storage systems.
It would, therefore, be desirable to provide a data backup and restore system that restores data independently from browse, mark and submit operations. It would also be desirable to provide a backup and restore system that can readily restore data backed up using new methods and new storage devices. It would further be desirable to optimize the restoration of multiple objects.
The present invention provides a data backup and restore system having an architecture that executes data restores independently of browse, mark and submit operations. In general, a restore engine process is created as part of a restore session. The restore engine process retrieves a list of restorable objects for browsing and marking, stores one or more submitted lists of objects marked for restoration, and executes the restore when commanded by the user. This arrangement allows for multiple submissions of objects marked for restoration by a client prior to execution of the restore. In addition, the system can support new backup methods and storage devices with minimal overall impact on the system since the restore operation is not linked to the browse, mark, and submit operations.
In one aspect of the invention, a data backup and restore system includes a server for backing up data from a client storage system under the control of a user associated with a client. The server creates a restore engine process as part of a restore session initiated by the client. The restore engine process communicates with the client via remote procedure calls such that the execution of a restore operation is independent from browsing of restorable objects, marking of restorable objects for restoration, and submitting a list of objects marked for restoration. The submitted list of objects can persist over time to enable the execution of the restore at a scheduled time. A client can make multiple restore submissions prior to initiating the execution of a restore operation. In addition, multiple restore operations can be executed concurrently.
In one embodiment, a user initiates a restore session with a dispatch daemon running on a server on the backup storage system via a graphical user interface associated with a client coupled to the backup storage system. As the restore session is initiated, the dispatch daemon creates a restore engine process that communicates with the client via remote procedure calls. The restore engine process retrieves a list of restorable objects for browsing by the user by interrogating libraries that support catalogs of backed up data. The user can mark the displayed objects for restoration and submit one or more lists of marked objects to the restore engine process via remote procedure calls. After making the desired submissions, the user executes a system command to restore data. To execute the restore data operation, the restore engine generates a work item restore process on the server, a server restore process for generating a data stream of restore data, and a client restore process on the client for receiving the restore data stream. Upon completion of the restore operation, the restore engine process can be terminated.
In a further aspect of the invention, the backup and restore system can support new methods of backing up data and new storage types by adding one or more libraries to the server that are processed during initialization of the restore process. Each restorable object is associated with a given library. The libraries provide catalog data to the restore engine process in a predetermined format. Since a restore session is created by request of the user, libraries, including newly added libraries, are processed as part of the restore session. This technique allows additional libraries for supporting new restorable objects to be readily incorporated into existing systems. The submit object provides to the plug-in the capability of encapsulating the list of files to be restored by incorporating a list of files to be restored that is independent of the catalogs used to provide the browsable list of files to the user.
In another aspect of the invention, a backup and restore system generates a submit object based upon the objects marked for restoration by the user. The user can make multiple restoration requests, each of which corresponds to a submit object. The submit object is formed prior to execution of the restore. The submit object includes one or more submit elements, which have a one-to-one correspondence with a respective submit file. The submit files share the same set of media, e.g., tapes. In general, the submit object encapsulates the information necessary to restore the requested objects so as to provide functional separation between restore execution and the prior browsing, marking, and submitting of files for restoration.
In a still further aspect of the invention, a backup and restore system optimizes the execution of multiple restore requests. The system includes a work item server, which can form a portion of the restore engine, that assigns the works items contained in the submit objects to a respective one of a plurality of trail queues. In general, the work item server optimizes the running of work item by controlling the numbers of drives used by the restore engine and by setting the reader concurrency for a set of media. In one particular embodiment, work items having a predetermined level of volume commonality are assigned to the same trail queue. The work item server allocates a plurality of drives to the trail queues such that multiple work items can be restored simultaneously. In the case where each drive is allocated, upon restoration of each work item in a trail queue the drive is de-allocated from the trail queue and re-allocated to a further trail queue.