The present disclosure relates to data transfer, and more particularly relates to high-performance transfer of large amounts of data over a network. The present disclosure is applicable to backup and restoration of data, disaster recovery, audio and video transfer, and in general to applications that require network transfer of data.
The ability to transfer large amounts of data via a network is a limiting factor in various data processing operations. Compression algorithms may be used to provide better utilization of network bandwidth or storage resources. Similarly, source-side de-duplication may be used to remove duplicate data prior to transfer. Processes such as these may be applied to reduce the amount of data sent over a network connection, either by reducing the size of the data prior to sending, or by avoiding retransmission of duplicate data.
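By way of illustration only, the two reduction techniques mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular system: the fixed 4096-byte chunk size, the use of SHA-256 digests to identify chunks, and the names `prepare_for_transfer`, `bytes_on_wire`, and `known_hashes` are all hypothetical choices made for the example.

```python
import hashlib
import zlib
from typing import List, Optional, Set, Tuple

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


def prepare_for_transfer(
    data: bytes, known_hashes: Set[str]
) -> List[Tuple[str, Optional[bytes]]]:
    """Split data into chunks and decide what must actually be sent.

    known_hashes is a hypothetical record of chunk digests the destination
    already holds. Each returned pair is (digest, payload); payload is None
    when the chunk is a duplicate and only its digest needs to be sent.
    """
    plan: List[Tuple[str, Optional[bytes]]] = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest in known_hashes:
            # Source-side de-duplication: skip retransmitting this chunk.
            plan.append((digest, None))
        else:
            # Compression: shrink the chunk before it is sent.
            plan.append((digest, zlib.compress(chunk)))
            known_hashes.add(digest)
    return plan


def bytes_on_wire(plan: List[Tuple[str, Optional[bytes]]]) -> int:
    """Total payload bytes that would be sent (digests not counted)."""
    return sum(len(payload) for _, payload in plan if payload is not None)


if __name__ == "__main__":
    # Three identical chunks followed by one different chunk.
    sample = b"A" * (3 * CHUNK_SIZE) + b"B" * CHUNK_SIZE
    transfer_plan = prepare_for_transfer(sample, set())
    print(len(sample), "bytes of input,", bytes_on_wire(transfer_plan), "payload bytes sent")
```

In this sketch the second and third chunks are recognized as duplicates of the first and are never sent, while the remaining two chunks are sent in compressed form.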
However, such methods do not address situations in which no data has yet been transferred to the destination, that is, before an initial copy has been made. Nor do such methods address situations in which the data to be transferred is unique or cannot be efficiently compressed.
Thus, there remains a need for efficient and economical methods and systems for data de-duplication in networked computer operating environments. Such methods and systems are suitable for use in distributed backup systems, where a plurality of local and remote systems must be backed up, synchronized, and mirrored on a routine basis.