1. Field of the Invention
This invention is related to the field of computer systems and, more particularly, to backup and disaster recovery mechanisms in computer systems.
2. Description of the Related Art
Computer systems, and their components, are subject to various failures which may result in the loss of data. For example, a storage device used in or by the computer system may experience a failure (e.g. mechanical, electrical, or magnetic) which may make any data stored on that storage device unreadable. Erroneous software or hardware operation may corrupt the data, destroying data stored on an otherwise properly functioning storage device. Additionally, any component in the storage chain between (and including) the storage device and the computer system may experience a failure. Such components include the storage device itself, connectors (e.g. cables) between the storage device and other circuitry, and, in some cases, the network between the storage device and the accessing computer system.
To mitigate the risk of losing data, computer system users typically make backup copies of data stored on various storage devices. Typically, backup software is installed on a computer system and the backup may be scheduled to occur periodically and automatically. In many cases, one or more applications may be in use when the backup is to occur. Such an application may have one or more files open, which may prevent the backup software from accessing those files.
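By way of illustration only, the following minimal sketch shows how a simple file-level backup pass may fail to capture files that it cannot read. The function name `backup_tree` and its parameters are hypothetical and are not taken from any particular backup product. The sketch assumes that an inaccessible file (for example, one held open exclusively by an application) surfaces as an `OSError` when a copy is attempted.

```python
import os
import shutil


def backup_tree(source_dir, backup_dir):
    """Copy every file under source_dir into backup_dir (hypothetical helper).

    Returns (copied, skipped): sorted relative paths that were copied, and
    sorted relative paths that could not be read, e.g. because an
    application holds them open with an exclusive lock.
    """
    copied, skipped = [], []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, source_dir)
            dst = os.path.join(backup_dir, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            try:
                # copy2 copies file contents and, where possible, metadata.
                shutil.copy2(src, dst)
                copied.append(rel)
            except OSError:
                # Assumption: an unreadable or locked file raises OSError.
                skipped.append(rel)
    return sorted(copied), sorted(skipped)
```

In this sketch, every path returned in `skipped` is a file that is absent from the backup copy, which corresponds to the open-file problem described above.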
Some backup software may include custom code for each application (referred to as a “backup agent”). The backup agent may attempt to communicate with the application or otherwise cause the application to commit its data to files so that the files can be backed up. Often, such backup agents make use of various undocumented features of the applications to successfully back up files. As the corresponding applications change (e.g. new versions are released), the backup agents may also require changes. Additionally, some files (such as the Windows registry) are always open and thus difficult to back up.
Disaster recovery configurations are used in some cases to provide additional protection against loss of data due to failures, not only in the computer systems themselves but also in the surrounding environment (e.g. loss of electrical power, acts of nature, fire, etc.). In disaster recovery configurations, the state of data may periodically be checkpointed from a first computer system to a second computer system. In some cases, the second computer system may be located at a physical distance from the first computer system. If a problem occurs that causes the first computer system to go down, the checkpointed data remains stored on the second computer system. In some cases, applications previously running on the first computer system may be restarted on the second computer system to allow continued access to the preserved data. The disaster recovery software may experience issues similar to those of the backup software with regard to applications which are running when a checkpoint is attempted and the files that the applications may have open at the time of the checkpoint. Additionally, replicating all of the state needed to restart the application on the second computer system (e.g. the operating system and its configuration settings, the application and its configuration settings, etc.) is complicated.
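As a rough illustration of periodic checkpointing, the sketch below copies the current state of a data directory into successively numbered checkpoint directories and locates the most recent one. The names `checkpoint` and `latest_checkpoint` are hypothetical. A local directory stands in for storage on the second computer system, which is an assumption made only to keep the example self-contained. The sketch copies data files only and does not capture the operating system or application configuration state discussed above.

```python
import os
import shutil


def checkpoint(primary_dir, secondary_root):
    """Copy the current state of primary_dir into a new numbered directory
    under secondary_root and return that directory's path (hypothetical)."""
    os.makedirs(secondary_root, exist_ok=True)
    existing = [int(n) for n in os.listdir(secondary_root) if n.isdigit()]
    next_id = max(existing, default=0) + 1
    target = os.path.join(secondary_root, f"{next_id:06d}")
    # copytree requires that the target does not exist yet; it is always new.
    shutil.copytree(primary_dir, target)
    return target


def latest_checkpoint(secondary_root):
    """Return the path of the most recent checkpoint, or None if none exist."""
    ids = [n for n in os.listdir(secondary_root) if n.isdigit()]
    if not ids:
        return None
    return os.path.join(secondary_root, max(ids, key=int))
```

Under these assumptions, any change made on the first computer system after the most recent call to `checkpoint` would not be present in the directory returned by `latest_checkpoint`.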