The back-up of data plays a key role in the management of modern data processing infrastructures. Generally, the process of backing-up data consists of making one or more additional copies of an original source on different devices; for safety reasons, the back-up copies are preferably stored in a location different from that of the original data. In this way, the data is preserved in case of any failure of the infrastructure (such as a memory crash), other disasters (such as a fire), or accidental deletion or corruption; therefore, the data can be restored (i.e., recovered) whenever its original version is damaged or lost.
The back-up process is a routine activity in large organizations (especially when critical data and/or services are managed). However, the design of a proper back-up process is a very complex task. Indeed, for this purpose a number of aspects must be carefully considered; for example, it is very important to choose the back-up media properly, to ensure the consistency of the copied data, to define an effective back-up procedure, to establish a correct recovery strategy, to verify the integrity and the validity of the back-up copies, and the like.
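By way of illustration only, the step of verifying the integrity of a back-up copy can be sketched as follows. This is a minimal sketch under stated assumptions and not a description of any specific known solution: the function names `sha256_of` and `back_up_and_verify` are hypothetical, the data to be backed-up is assumed to be a single file, and the comparison of SHA-256 digests is merely one possible integrity check.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_and_verify(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` and check the copy against the original.

    Hypothetical helper: raises OSError if the digests differ.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    copy = backup_dir / source.name
    shutil.copy2(source, copy)  # copies the content and, where possible, metadata
    if sha256_of(source) != sha256_of(copy):
        raise OSError(f"back-up copy of {source} failed integrity check")
    return copy


if __name__ == "__main__":
    # Demonstration on temporary directories standing in for two devices.
    with tempfile.TemporaryDirectory() as work:
        original = Path(work) / "original" / "data.txt"
        original.parent.mkdir()
        original.write_text("critical data")
        saved = back_up_and_verify(original, Path(work) / "backup")
        print(saved.read_text())
```

In a real infrastructure the back-up directory would reside on a different device or at a different location; the temporary directories are used here only to keep the sketch self-contained.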
Additional problems arise in distributed infrastructures wherein the data to be backed-up is stored on multiple computers. Particularly, in this context it is very difficult to control the whole back-up process (because of its inherent scattering); for example, the scheduling of the back-up processes on the different computers is decidedly nontrivial. The problem is more acute in heterogeneous environments, wherein the computers implement different hardware and/or software platforms.
It should be noted that in the above-mentioned scenario it is normally untenable to have the back-up process completely enforced from a central site on the different computers; indeed, in most practical situations the definition of the data to be backed-up must remain under local control.
Another drawback of the solutions known in the art is that they are quite invasive. Indeed, the back-up process generally requires the availability of a relatively complex agent on each computer (which agent is in charge of performing all the operations required to manage the back-up process). However, the installation of this agent on every computer of the infrastructure is not always viable; for example, this may be prevented because of security concerns.
In any case, it would be desirable to rely on as little human intervention as possible during the back-up process; indeed, any human action is scarcely repeatable, prone to errors and strongly dependent on personal skills. However, the design of an automated back-up process for infrastructures with distributed architecture is hindered by the above-mentioned problems.