Companies and organizations are generating and collecting data at an ever-increasing rate. This is due, in part, to the rise of enterprise applications, portable and mobile computing devices, wearable computers, social media, analytics, business intelligence, electronic health records, and e-commerce, among other growth areas.
Backing up files and other data helps to protect against accidental loss, data corruption, hardware failures, and natural disasters. A typical backup for a company or organization can be many terabytes in size because of the vast amount of data available to be backed up. The data may include relational databases, mail, file systems, virtual systems, customer data, employee data, documents, video, and much more. Backup data sets are growing increasingly large and complex. Consider, for example, Hyper-V or Exchange environments, where a backup may involve not only large data sets but also the complexity of multiple nodes, which may need to be backed up using multiple proxy nodes.
Even when a backup copy exists, it is desirable to provide further redundancy in case the backup copy itself is compromised. Cloning is a process of creating multiple copies of the backup data. Another approach to creating two or more backup copies is referred to as twinning, in which two targets are written to at the time of the backup.
Both approaches have drawbacks. With twinning, the negative side effect is that the primary time-to-protect is slowed down to match the write time of the slower of the two targets. When one target is local to the system being protected and the second copy (clone) is remote, the remote copy is expected to take longer to write, and it is not desirable to slow the initial backup down to the write time of the remote copy.
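The timing trade-off described above can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which each target has a single fixed write duration, twinning writes both targets concurrently, and a clone is written only after the primary backup completes. The function names and the durations in the usage note are hypothetical and are not taken from any backup product.

```python
def twinning_time_to_protect(local_write_s: float, remote_write_s: float) -> float:
    """Twinning: both targets are written during the backup itself, so the
    backup is not complete until the slower target finishes (assumed model)."""
    return max(local_write_s, remote_write_s)


def cloning_times(local_write_s: float, remote_write_s: float) -> tuple[float, float]:
    """Cloning: the primary backup finishes first, then the clone is written.

    Returns (time_to_protect, time_until_second_copy_exists). The time spent
    reading the primary backup back is ignored here, which is an assumption.
    """
    time_to_protect = local_write_s
    second_copy_ready = local_write_s + remote_write_s
    return time_to_protect, second_copy_ready


if __name__ == "__main__":
    # Hypothetical durations in seconds: a fast local target, a slow remote one.
    local, remote = 60.0, 240.0
    print("twinning time-to-protect:", twinning_time_to_protect(local, remote))
    print("cloning (time-to-protect, second copy ready):", cloning_times(local, remote))
```

Under these assumed numbers, twinning delays the initial backup to the remote write time, while cloning keeps the initial backup fast but takes longer overall before a second copy exists.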
Cloning is a slow process because it is predominantly sequential. Techniques such as striping are not considered true block-level cloning. Cloning can be very time-consuming and resource-intensive because of the potentially large size of the primary backup and other complexities, such as the format of the primary backup. For example, in some cases the primary backup copy may be stored in a virtual hard disk (VHD) or VHDX format. These formats are space-efficient, but generally cannot be accessed using traditional techniques and thus present challenges.
The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions. EMC, Data Domain, Data Domain Restorer, and Data Domain Boost are trademarks of EMC Corporation.