Disaster Recovery is the process of restoring access to the data, applications, and hardware systems that are critical to resuming business operations after a disaster has disrupted them. A Disaster Recovery Plan should not only cover the resumption of normal system operations after a disaster, but should also prepare for any sudden or unexpected disaster by proactively and intelligently scheduling Disaster Recovery drills or tests.
Business enterprises depend on their computing environment to maintain business continuity. Such business enterprises can be broadly termed IT enterprises. The existence of an IT enterprise depends on its business continuity and Disaster Recovery Management infrastructure and on the effective implementation thereof. IT enterprises generally have large data centres for their production servers at the Production Site, and the production servers run application(s) at the Production Site. IT enterprises also maintain a Disaster Recovery Site with data centres hosting application(s) that are used in case of a disaster at the Production Site.
The configuration of server(s), application(s) and other infrastructure elements at the Production Site is subject to continuous change. To maintain business continuity in the event of a disaster or loss of data at the Production Site, IT enterprises keep the disaster recovery data centres updated by replicating the changes occurring at the Production Site to the Disaster Recovery Site. The system and method for replicating the changes made at the Production Site to the Disaster Recovery Site can be manual, automated, or a combination of both.
It has often been observed that the systems and methods used by IT enterprises for replicating the changes made at the Production Site to the Disaster Recovery Site fail to replicate certain changes such as, but not limited to, patches applied to the application(s) or changes in the configuration of the application(s). As a result, the Disaster Recovery Site may not be an identical replica of the Production Site.
To overcome such situations, it is general practice among IT enterprises to test their Disaster Recovery Sites for disaster recovery readiness. Such testing, which determines whether the Disaster Recovery Site is in sync with the Production Site and whether all the relevant changes made at the Production Site have been correctly replicated to the Disaster Recovery Site, is commonly known as a Disaster Recovery (DR) drill or Disaster Recovery (DR) test.
DR drill(s)/test(s) are important because system administrators cannot be sure that all the changes at the Production Site are being fully and correctly replicated to the Disaster Recovery Site. To this end, data centre administrators generally schedule DR drill(s)/test(s) at fixed intervals, such as quarterly or annually, or on an ad hoc basis in response to certain pre-determined changes, for one application, a set of applications, or the entire site.
The schedules for such DR drill(s)/test(s) are generally maintained by keeping records thereof either in documents or, in some cases, in task-tracking software.
However, such approaches for scheduling DR drill(s)/test(s) in IT enterprises are not based on a system that keeps track of all the changes that have occurred at the Production Site and of the changes that have been replicated from the Production Site to the Disaster Recovery Site. Nor do they use such information to schedule DR drill(s)/test(s).
As a result, the time lag increases between a change that has not been replicated from the Production Site to the Disaster Recovery Site and the DR drill/test that reveals the missing replication. This lag can be detrimental to maintaining the business continuity of an IT enterprise in a disaster scenario.
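To make this time lag concrete, the following minimal sketch computes how long an unreplicated change would stay undetected under a fixed drill schedule. The function name `detection_lag_days`, the dates, and the quarterly schedule are illustrative assumptions and are not taken from any described system.

```python
from datetime import date
from typing import Optional, Sequence


def detection_lag_days(change_day: date, drill_days: Sequence[date]) -> Optional[int]:
    """Days between an unreplicated change and the first drill on or after it.

    Returns None when no drill is scheduled on or after the change.
    """
    later = [d for d in sorted(drill_days) if d >= change_day]
    if not later:
        return None
    return (later[0] - change_day).days


# Hypothetical quarterly drill schedule (assumed dates).
quarterly_drills = [
    date(2024, 1, 1),
    date(2024, 4, 1),
    date(2024, 7, 1),
    date(2024, 10, 1),
]

# A patch applied on 5 January that never reached the Disaster Recovery Site.
lag = detection_lag_days(date(2024, 1, 5), quarterly_drills)
print(lag)
```

With these assumed dates, the missing patch would go unnoticed for nearly three months, which is the window during which a real disaster could expose the gap.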
Therefore, the present invention aims at overcoming the aforesaid limitations by taking into account the changes occurring at the Production Site, the changes being replicated from the Production Site to the Disaster Recovery Site, the results of previous drills/tests, application/server loads, user policies, etc., in order to proactively schedule DR drill(s)/test(s) for one application, a set of applications, or the entire site.
The present invention enables IT enterprises to proactively and intelligently schedule their DR drill(s)/test(s) for verifying the Disaster Recovery readiness of an application, a set of applications, or the entire site, so that the time between a change that can impact Disaster Recovery and the next DR drill/test is minimized. Thus, the present invention intends to reduce the chance of failure during a real disaster, since the DR drill(s)/test(s) are scheduled in such a way that the time window between a change that can cause failure and the drill/test that would detect it is significantly reduced.
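As a rough illustration of how such inputs could be combined, the sketch below scores each application from its recorded changes, its unreplicated changes, and its previous drill result, and then picks the least-loaded time slot permitted by a user policy. Every name and number in it (`AppState`, `drill_priority`, `pick_slot`, `schedule_next_drill`, the weights 3 and 5, the 0.5 load limit) is a hypothetical assumption; the text above does not specify a scoring formula.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class AppState:
    name: str
    changes_since_last_drill: int  # changes recorded at the Production Site
    unreplicated_changes: int      # changes not confirmed at the Disaster Recovery Site
    last_drill_passed: bool        # result of the previous drill/test


def drill_priority(app: AppState) -> int:
    """Higher score means a drill is more urgent (weights are assumptions)."""
    score = app.changes_since_last_drill
    score += 3 * app.unreplicated_changes
    if not app.last_drill_passed:
        score += 5
    return score


def pick_slot(slot_loads: Dict[int, float], max_load: float) -> Optional[int]:
    """Return the hour with the lowest expected load not exceeding max_load."""
    allowed = [(load, hour) for hour, load in slot_loads.items() if load <= max_load]
    return min(allowed)[1] if allowed else None


def schedule_next_drill(
    apps: List[AppState], slot_loads: Dict[int, float], max_load: float = 0.5
) -> Optional[Tuple[str, int]]:
    """Pick the most urgent application and the quietest permitted slot."""
    candidates = [a for a in apps if drill_priority(a) > 0]
    if not candidates:
        return None
    slot = pick_slot(slot_loads, max_load)
    if slot is None:
        return None
    most_urgent = max(candidates, key=drill_priority)
    return most_urgent.name, slot


# Hypothetical inputs: three applications and expected load per hour of day.
apps = [
    AppState("billing", 2, 0, True),
    AppState("orders", 4, 1, False),
    AppState("reports", 0, 0, True),
]
hourly_load = {2: 0.15, 9: 0.80, 14: 0.65, 23: 0.30}

plan = schedule_next_drill(apps, hourly_load)
print(plan)
```

In this example the "orders" application is selected because it combines recent changes, an unreplicated change, and a failed previous drill, and the drill is placed in the quietest permitted hour. A real scheduler would likely re-run this selection whenever a new change is recorded, rather than waiting for a fixed interval.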