Business continuity and disaster recovery refer to the capability to restore normal (or near-normal) business operations, from a critical business application perspective, after a disaster interrupts those operations. Business continuity and disaster recovery may require the ability to bring up mission-critical applications, together with the data on which these applications depend, and to make them available to users as quickly as business requirements dictate. Where downtime is costly, the recovery process may involve automation. For mission-critical applications that demand minimal downtime, the disaster recovery process may need to be highly automated and resilient. Clustering technologies may provide such highly automated and resilient disaster recovery.
Clusters may include multiple systems connected in various combinations to shared storage devices. Cluster server software may monitor and control applications running in the cluster and may restart applications in response to a variety of hardware or software faults. For failover service groups running in traditional clusters, the time to fail over includes the time needed to take all of the service group's resources offline on the failed node plus the time needed to bring all of those resources online on the failover node. Unfortunately, the time required to take a service group completely offline and then bring it back online may result in a failure to comply with a service level agreement. Accordingly, the instant disclosure identifies and addresses a need for additional and improved systems and methods for performing failovers.
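As an illustration only, the failover time of a traditional cluster described above may be sketched as the sum of the offline and online times of a service group's resources. The resource names, the timing values, the service level agreement threshold, and the helper names `Resource`, `traditional_failover_seconds`, and `meets_sla` are all hypothetical, and the sketch assumes that resources are taken offline and brought online strictly one after another.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A hypothetical service group resource with illustrative timings."""
    name: str
    offline_seconds: float  # time to take the resource offline on the failed node
    online_seconds: float   # time to bring the resource online on the failover node


def traditional_failover_seconds(resources):
    """Offline time for every resource plus online time for every resource.

    Assumes strictly sequential processing, with no overlap between steps.
    """
    offline = sum(r.offline_seconds for r in resources)
    online = sum(r.online_seconds for r in resources)
    return offline + online


def meets_sla(resources, sla_seconds):
    """Return True if the estimated failover time fits within the SLA limit."""
    return traditional_failover_seconds(resources) <= sla_seconds


# Hypothetical service group; all numbers are made up for illustration.
group = [
    Resource("virtual_ip", offline_seconds=2.0, online_seconds=3.0),
    Resource("filesystem_mount", offline_seconds=10.0, online_seconds=15.0),
    Resource("database", offline_seconds=45.0, online_seconds=60.0),
]

if __name__ == "__main__":
    total = traditional_failover_seconds(group)
    print(f"estimated failover time: {total} s")
    print(f"meets 120 s SLA: {meets_sla(group, 120.0)}")
```

Under these assumed values, the offline phase alone may consume a substantial share of the permitted downtime, which is the source of the service level agreement risk noted above.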