Computer networks have become the backbone of companies throughout the world. Even if a company does not provide products or services over the internet, computer networks within the company improve employee productivity by providing employees with instantaneous access to millions of bytes of data. In fact, many companies are unable to function when the company's computer network fails. Thus, it is imperative that companies have reliable computer networks with 99.999% uptime.
Conventionally, a computer network may be provided with additional resiliency to failures by having a disaster recovery plan. That is, when a failure in the computer network occurs, a plan is available to quickly bring the computer network back to functional status. Disaster recovery plans may include actions taken by one or more actors. For example, a recovery plan may include switching to backup systems at the location of the failure. More drastic disasters may call for switching to backup systems at a location remote from the site of the failure.
However, computer networks often contain many disparate systems. For example, a company may rely on several applications executing on several different servers for information services. Managing the different applications and different servers often requires different skill sets. Thus, the company may employ several sets of employees to manage the applications.
Further, the different applications are managed by different control interfaces. Because the control interfaces and applications operate unaware of the status of other applications and servers, it is often difficult to determine when a disaster has occurred. Alerts from each of the different servers may be necessary to understand the status of the computer network and determine that a disaster has occurred. After the disaster is identified, controlling each application and server requires different employees to perform different activities throughout the computer network. The lack of an integrated control interface for interacting with different components of a computer network, such as servers and applications, results in long delays at each stage: between the occurrence of a disaster and its detection, between detection and the start of recovery actions, and between those actions and the return to normal operation.