1. Field of the Invention
The following generally relates to disaster recovery for computing systems, services, and/or data, and more particularly, to a method, apparatus and system for improving failover within a high availability disaster recovery environment.
2. Description of the Related Art
Uninterrupted continuity of business functions is vital to gaining an edge in today's competitive market. Various business groups, such as data centers, production factories, stock exchanges, financial or banking companies, and other entities, need a certain degree of operational continuity during their operations. To meet this objective, which is commonly referred to as “business continuity,” the business groups generally rely on “high availability” (“HA”) computing services to service the needs of their employees, customers, general members of the public and/or others (collectively “clients”). These business groups typically employ, use, are provided with or otherwise take advantage of HA-computing systems to provide such HA-computing services, and in turn, provide seemingly uninterrupted availability of data (“data availability”) to the clients.
To facilitate providing the HA-computing services (and the seemingly uninterrupted data availability), each of the HA-computing systems employs a number of computing resources (e.g., hardware resources and/or software resources). These computing resources typically include computing resources for operating the HA-computing services (“operating resources”) and computing resources redundant to the operating resources (“redundant resources”), along with protocols (“disaster-recovery protocols”) for recovering from a failure.
The failure may include and/or result from one or more man-made and/or natural disasters, including, for example, human errors; power failures; damaged, corrupt, unavailable and/or failed computing resources; earthquakes; floods; etc., affecting one or more of the HA-computing services operating on the operating resources. Generally, the HA-computing services fail over to the redundant resources in accordance with the disaster-recovery (“DR”) protocols in response to an occurrence of the failure.
The DR protocols generally include multiple factors, such as specific applications (e.g., critical applications), multiple standards defined by service-level agreements (“SLA”), compliances (e.g., data recovery compliance, business compliance) and the like. An administrator of the HA-computing system typically carries out the DR protocols during normal computing operation. However, when the administrator neglects such DR protocols, anomalies associated with the protection provided by the DR protocols may arise. These anomalies may affect the ability of the HA-computing systems to fail over properly, or worse yet, to fail over at all (i.e., an abortive failover), and thereby fail to meet the requirements of business continuity.
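By way of illustration only, the failover behavior described above may be sketched in Python as follows. This is a minimal sketch and not a description of any particular HA-computing system: the names `Resource`, `HAService` and `handle_failure` are hypothetical, and the `healthy` flag on the redundant resource is an assumed stand-in for whether the protection provided by the DR protocols has been maintained.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A computing resource (hypothetical model for illustration)."""
    name: str
    healthy: bool = True  # assumed stand-in for "free of DR anomalies"


@dataclass
class HAService:
    """An HA-computing service with operating and redundant resources."""
    name: str
    operating: Resource
    redundant: Resource
    active: Resource = field(init=False)

    def __post_init__(self) -> None:
        # During normal operation the service runs on the operating resource.
        self.active = self.operating

    def handle_failure(self) -> bool:
        """Respond to a failure of the operating resource.

        Returns True if the service failed over to the redundant
        resource, or False for an abortive failover.
        """
        self.operating.healthy = False
        if not self.redundant.healthy:
            # The redundant resource is unusable, so the failover aborts.
            return False
        self.active = self.redundant
        return True


# Usage: a well-maintained standby versus a neglected one.
ok = HAService("orders", Resource("primary"), Resource("standby"))
bad = HAService("orders", Resource("primary"), Resource("standby", healthy=False))
print(ok.handle_failure(), ok.active.name)
print(bad.handle_failure(), bad.active.name)
```

In this sketch, a neglected redundant resource yields an abortive failover, which corresponds to the anomaly discussed above; a real system would apply the DR protocols in far more detail than a single flag.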