Enterprise deployments of large-scale systems may involve frequent changes to the underlying components within the system. For example, software and/or hardware components may be scaled up, scaled down, or scaled out. The state of the enterprise deployment may also change based on the availability of components or the underlying infrastructure. Certain components may become unavailable due to scheduled maintenance, unforeseen device malfunctions, or some other source of failure.
One approach for guarding against unforeseen failures or natural disasters involves data replication. According to this approach, data that is stored at a primary site is copied to a standby site at a geographically different location. If data at the primary site becomes fully or partially unavailable for any reason, then it may be recovered from the standby site. This approach protects against data loss or corruption stemming from failures, disasters, and/or human error. However, recovery is limited to the storage tier, which may not allow for a full-scale recovery in multi-tier systems. For example, a system stack may include, without limitation, applications, middleware, administration servers, web servers, and database storage. Restoring each layer of the stack after a disaster may be a tedious process involving complex execution and coordination among application, replication, and/or infrastructure experts.
Another approach for disaster recovery is to have a system administrator define custom scripts to perform disaster recovery operations. According to this approach, the administrator may create scripts for different tiers within a multi-tiered system. However, in large-scale systems, it may become extremely difficult and error-prone to maintain and update the custom scripts to accommodate frequent changes to the underlying system components. Furthermore, homegrown scripts do not provide a standard, comprehensive set of error management capabilities in the event that a problem is encountered during disaster recovery.
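By way of illustration only, a homegrown recovery script of the kind described above might resemble the following minimal sketch. The tier names, their ordering, the `run_recovery` function, and the simulated failure are all hypothetical and are not drawn from any particular system. The sketch runs one step per tier in a fixed order and simply stops at the first error, which shows how such a script may leave retry, rollback, and resumption to the operator.

```python
from typing import Callable, List, Optional, Tuple

# A step pairs a tier name with the action that recovers that tier.
Step = Tuple[str, Callable[[], None]]


def run_recovery(steps: List[Step]) -> Tuple[List[str], Optional[str]]:
    """Run per-tier steps in order and stop at the first failure.

    Returns the tiers that completed and the tier that failed (or None).
    """
    completed: List[str] = []
    for tier, action in steps:
        try:
            action()
        except Exception:
            # No retry, rollback, or resume logic: the operator must
            # diagnose the failure and continue by hand.
            return completed, tier
        completed.append(tier)
    return completed, None


def noop() -> None:
    """Placeholder for a tier whose recovery succeeds."""


def fail_middleware() -> None:
    """Placeholder that simulates a failure in one tier."""
    raise RuntimeError("middleware host not reachable")


# Hypothetical tier order, hard-coded by the administrator. Any change to
# the deployment requires this list to be edited by hand.
steps: List[Step] = [
    ("database", noop),
    ("middleware", fail_middleware),
    ("web", noop),
]

completed, failed = run_recovery(steps)
print(completed, failed)
```

In this sketch the tier list is fixed in the script itself, so every change to the underlying components calls for a manual update, and a failure partway through leaves the remaining tiers unrecovered.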
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.