1. Field of the Invention
This invention relates to systems and methods for computer storage. Particularly, this invention relates to systems and methods for managing data backup in a distributed application environment.
2. Description of the Related Art
Computer software applications (such as large databases) have demanded, over the years, ever-improving response time and scalability together with support for rapid data growth. In order to accommodate these requirements, the application deployment model has evolved from running an application on a single host using direct attached storage to a distributed environment where the application workload is distributed over multiple hosts using a centralized storage model. In such newer application environments, each node typically operates autonomously, with a higher logical entity performing some level of application coordination. An example of such an application is found in the IBM DB2 Enterprise Server Edition (ESE). In this application, each node in the DB2 ESE essentially operates as an individual database and is unaware of the existence of the other nodes. A catalog node then acts as the coordinating entity in this distributed environment.
The DB2 Universal Database (UDB) Enterprise Server Edition (ESE) is designed to meet the relational database server needs of mid- to large-size businesses. The application can be deployed on Linux, UNIX, or Windows servers of any size, from one CPU to hundreds of CPUs. DB2 ESE can operate as a foundation for building on-demand enterprise-wide solutions, such as large data warehouses of multiple terabyte size, high-performing, 24×7 available, high-volume transaction processing business solutions, or Web-based solutions. The application can operate as the database backend for ISVs building enterprise solutions, such as Business Intelligence, Content Management, e-Commerce, ERP, CRM, or SCM. Additionally, DB2 ESE can provide connectivity, compatibility, and integration with other enterprise DB2 and Informix data sources.
In a distributed application node computing environment, existing data protection solutions back up the data of each application node separately, operating as an independent backup application for that node. Thus, the federation of backup data is limited to one node only. Fundamentally, such a solution cannot be used for federated backup in a distributed application environment because there is no higher level of knowledge of the distributed data. An off-loaded data movement capability may exist, but it is not flexible; the entire data movement can be delegated to only a single system. A range of backup systems and methods have been developed without fully addressing this issue.
U.S. Patent Application 20050021869 by Aultman et al., published Jan. 27, 2005, discloses a data backup and recovery system that includes a backup and recovery (EBR) management network system. The EBR management network system includes an infrastructure for performing information storage, backup, and recovery operations for a business enterprise that is fully scalable and sharable. The EBR management network system includes the following modular backup and recovery models: (1) LAN network based backup and recovery models for applications requiring <200 GB; (2) LAN network based GigE backup and recovery model for applications requiring >500 GB and <1.5 TB; (3) LAN-Free dedicated tape drive backup and recovery models; (4) LAN-Free shared tape drive backup and recovery models; (5) Server-Free backup and recovery models; and (6) application storage manager (ASM) backup and recovery models.
U.S. Patent Application 20040153698 by Guzman et al., published Aug. 5, 2004, discloses a system and method of disaster preparedness and restoration of service of damaged or destroyed telecommunication network elements. A computer-implemented method of disaster backup for network elements includes establishing connectivity to a plurality of network elements. A host computer may transmit one or more commands to the network elements for invoking a computer routine to create a plurality of computer readable service continuity data to a local memory of the network elements. An automated system of computer executable components for disaster recovery of network elements includes a computer executable controller component that is configured to select a plurality of network elements designated for disaster backup action. A computer executable engine component is configured to establish connectivity to the plurality of network elements and to transmit one or more commands to the network elements so as to replicate service continuity data for each of said network elements.
U.S. Pat. No. 6,424,999 by Arnon et al., issued Jul. 23, 2002, discloses a system comprising a mass storage subsystem as a master device and a backup subsystem as a slave device, the slave device transferring data items to the master device during a restore operation. Each data item to be restored is identified by a data item identifier. The master device initially provides the slave device with a list of data item identifiers, and the slave device receives the data item identifier list from the master device and orders the data item identifiers thereon in an optimal order for transfer to the master device, the ordering based on the ordering of the data items on storage media on which they are stored. The master device, in each of a plurality of iterations, receives from the slave device data item identifiers identifying a data item transferred during a previous iteration and a data item to be transferred during the current iteration and requests the slave device to transfer the data item to be transferred. The master device uses the data item identifier that it receives from the slave device identifying the data item transferred during the previous iteration to confirm that the data item corresponded to the data item that was to have been transferred during the previous iteration. The slave device, in each iteration, provides the master device with the data item identifiers identifying the data item transferred during the previous iteration and the data item to be transferred during the current iteration, and transfers the data item to be transferred during the iteration to the master device when requested by the master device.
U.S. Patent Application 20050172093 by Manmohan, published Aug. 4, 2005, discloses a system for backing up and restoring information that includes at least one computer system including information to be backed up and restored, and a storage device for receiving at least part of the information to be backed up and for storing and backing up the information. A controller includes a scheduling system for allowing a user to input into a job queue a master job indicating one or more portions of the information of the at least one computer system to be backed up or restored, and a job control system that splits the master job into a plurality of smaller jobs and inputs the plurality of smaller jobs into the job queue.
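By way of illustration only, the job-splitting behavior summarized above might be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the cited application: the names Job, split_master_job, and max_portions, the portion identifiers, and the fixed-size chunking policy are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Job:
    """A backup or restore request covering one or more portions of data."""
    action: str                 # "backup" or "restore"
    portions: Tuple[str, ...]   # hypothetical identifiers, e.g. volume names


def split_master_job(master: Job, max_portions: int = 1) -> List[Job]:
    """Split a master job into smaller jobs of at most max_portions each.

    Fixed-size chunking is an assumed policy chosen for illustration.
    """
    if max_portions < 1:
        raise ValueError("max_portions must be at least 1")
    return [
        Job(master.action, master.portions[i:i + max_portions])
        for i in range(0, len(master.portions), max_portions)
    ]


# The user submits one master job; the job control step replaces it in the
# queue with the smaller jobs derived from it.
job_queue = deque()
master = Job("backup", ("vol_a", "vol_b", "vol_c"))
job_queue.extend(split_master_job(master, max_portions=2))
```

In this sketch the smaller jobs preserve the order of the portions named in the master job, so a worker draining the queue processes them in the order the user specified.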
U.S. Patent Application 20050071588 by Spear et al., published Mar. 31, 2005, discloses a method, system, and program for forming a consistency group of data. Information is provided on a consistency group relationship indicating a plurality of slave controllers and, for each indicated slave controller, a slave storage unit managed by the slave controller. A command is transmitted to each slave controller in the consistency group relationship to cause each slave controller to transmit data in the slave storage unit to a remote storage in a manner that forms the consistency group. A determination is made as to whether all the slave controllers successfully transmitted the data in the slave storage units that is part of the consistency group to the remote storage.
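The command-then-verify pattern summarized above might be sketched, for illustration only, as follows. This is a minimal sketch under stated assumptions and is not the cited application's implementation: the function name form_consistency_group and the use of a callable per slave controller that returns True on success are hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Tuple


def form_consistency_group(
    slave_controllers: Dict[str, Callable[[], bool]],
) -> Tuple[bool, List[str]]:
    """Command every slave controller to transmit, then check all succeeded.

    Returns (all_succeeded, names_of_failed_controllers).
    """
    failed = []
    for name, transmit in slave_controllers.items():
        # Each callable stands in for "transmit the data in the slave storage
        # unit to remote storage" and is assumed to return True on success.
        if not transmit():
            failed.append(name)
    return (not failed, failed)


ok, failed = form_consistency_group({
    "slave_1": lambda: True,
    "slave_2": lambda: True,
})
```

The group is treated as formed only when every controller reports success; a single failure is reported by name so that the caller can decide how to respond.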
However, there is still a need in the art for systems and methods to provide an optimal backup solution for a distributed storage application operating across a plurality of interconnected hosts. There is a need in the art for backup systems and methods that facilitate a backup application distributed across more than one host computer. There is further a need in the art for such systems and methods to off-load backup operations to one or more hosts. In addition, there is a need in the art for such systems and methods to provide for both distributed application data and off-loaded backup operations to one or more hosts. As detailed hereafter, these and other needs are met by embodiments of the present invention.