The present application relates generally to an improved data processing apparatus and method and more specifically to mechanisms for caching backed-up data locally until replication of the backed-up data is successful.
Replication in computing involves sharing information so as to ensure consistency between redundant resources, such as software or hardware components, in order to improve reliability, fault tolerance, or accessibility. The two main types of replication are asynchronous replication and synchronous replication. Asynchronous replication is a “store and forward” approach to data backup. Asynchronous replication writes data to a storage array first and then, depending on the implementation approach, commits the data to be replicated to a primary storage site. Asynchronous replication then copies the data, either in real time or at scheduled intervals, to a secondary storage site. One downside to asynchronous replication, however, is the possibility of data loss if the primary site fails before the data has been written to the secondary site. In contrast, synchronous replication writes data to a primary site and a secondary site at the same time so that the data remains current between sites. However, synchronous replication is more expensive than other forms of replication and introduces latency that slows down the primary application.
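By way of illustration only, the difference between the two replication types may be sketched as follows. The sketch is a simplified in-memory model, not an implementation of any particular product; the names `Site`, `SyncReplicator`, `AsyncReplicator`, `flush`, and `at_risk` are hypothetical and are introduced solely for this example.

```python
from collections import deque


class Site:
    """Hypothetical in-memory stand-in for a storage site."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, key, value):
        self.blocks[key] = value


class SyncReplicator:
    """Completes a write only after both sites hold the data."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, key, value):
        self.primary.write(key, value)
        # The caller also waits for this second write, which is the
        # source of the added latency in this simplified model.
        self.secondary.write(key, value)


class AsyncReplicator:
    """Completes a write after the primary holds the data; forwards later."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.pending = deque()  # "store and forward" queue (assumed design)

    def write(self, key, value):
        self.primary.write(key, value)
        self.pending.append((key, value))  # not yet on the secondary

    def flush(self):
        """Copy queued writes to the secondary, e.g. at a scheduled interval."""
        while self.pending:
            key, value = self.pending.popleft()
            self.secondary.write(key, value)

    def at_risk(self):
        """Keys that exist only at the primary at this moment."""
        return [key for key, _ in self.pending]
```

In this model, any key returned by `at_risk()` corresponds to data that would be lost if the primary site failed before the next call to `flush()`, whereas the synchronous variant never has such a window but makes every write wait on both sites.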
In both asynchronous replication and synchronous replication, if there is a failure in a backup or a replication process, the most widely utilized recovery solution resets backups at the secondary site by performing a full backup. In fact, when using a full+incremental or a full+differential backup scheme, a full backup is a requirement. However, performing a full backup may not be acceptable to customers, such as in systems that rely upon, for example, journaling, which is a method of keeping track of file system changes. Such file systems are generally so large that it may not be possible to scan the files. The same is true for databases that may be many terabytes or petabytes in size, for which performing a full backup in the event of a failure is not reasonably feasible.
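As a non-limiting illustration of why a full+incremental scheme requires a full backup, the following sketch restores state by applying incremental backups, in order, on top of a full backup. The function name `restore` and the representation of each backup as a mapping from path to contents (with `None` marking a deletion) are assumptions made for this example only.

```python
def restore(full_backup, incrementals):
    """Rebuild state by applying incrementals, in order, to a full backup.

    Each incremental maps a path to its new contents, or to None
    when the path was deleted since the previous backup.
    """
    if full_backup is None:
        # Incrementals record only changes, so without a baseline
        # there is nothing to apply them to.
        raise ValueError("no full backup to use as a baseline")
    state = dict(full_backup)
    for delta in incrementals:
        for path, contents in delta.items():
            if contents is None:
                state.pop(path, None)
            else:
                state[path] = contents
    return state
```

Because each incremental backup records only changes relative to an earlier backup, a secondary site whose baseline is missing or inconsistent cannot be restored from the incrementals alone, which is why recovery in such schemes conventionally falls back to a new full backup.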