A computing device can synchronize its data with another device that maintains a mirror copy of the data. The process of data synchronization establishes consistency between the data on the two devices. One example of data synchronization is file synchronization.
One way to synchronize two files on different devices is to transfer an entire file to the other device so that the files can be compared locally. This method may waste network bandwidth, however, because it transfers portions of the file that are identical to their counterparts in the other file. Another way is to determine which portions of a file differ from the other file and transfer only those portions. A first device can split a first file into fixed-size, non-overlapping chunks and compute a checksum for each chunk. The first device sends the checksums of the first file to a second device. Similarly, the second device can split a second file into fixed-size, non-overlapping chunks and compute a checksum for each chunk of the second file. The second device then compares the received checksums of the first file with the checksums of the second file. If a checksum of the first file does not match its counterpart checksum of the second file, the second device identifies the corresponding data chunk as containing a difference between the first and second files. To synchronize the first and second files, the first device needs to send only the data chunks identified as containing a difference.
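The chunk-and-checksum comparison described above can be illustrated with a minimal Python sketch. It is not drawn from any particular implementation: the function names (`chunk_checksums`, `differing_chunks`, `build_patch`, `apply_patch`), the choice of SHA-256 as the checksum, and the 4-byte chunk size are all assumptions made for illustration, and both "devices" are simulated in a single process.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical chunk size in bytes; unrealistically small, for illustration


def chunk_checksums(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split data into fixed-size, non-overlapping chunks and hash each chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def differing_chunks(first_sums: list[str], second_sums: list[str]) -> list[int]:
    """Second device: indices whose checksums differ or exist on only one side."""
    longest = max(len(first_sums), len(second_sums))
    return [
        i for i in range(longest)
        if i >= len(first_sums)
        or i >= len(second_sums)
        or first_sums[i] != second_sums[i]
    ]


def build_patch(first: bytes, indices: list[int],
                chunk_size: int = CHUNK_SIZE) -> dict[int, bytes]:
    """First device: collect only the chunks identified as containing a difference."""
    patch = {}
    for i in indices:
        chunk = first[i * chunk_size:(i + 1) * chunk_size]
        if chunk:  # skip indices past the end of the first file
            patch[i] = chunk
    return patch


def apply_patch(second: bytes, patch: dict[int, bytes], first_chunk_count: int,
                chunk_size: int = CHUNK_SIZE) -> bytes:
    """Second device: rebuild the first file from local chunks plus received ones."""
    rebuilt = []
    for i in range(first_chunk_count):
        local = second[i * chunk_size:(i + 1) * chunk_size]
        rebuilt.append(patch.get(i, local))
    return b"".join(rebuilt)


first_file = b"aaaabbbbcccc"
second_file = b"aaaaXbbbcccc"

first_sums = chunk_checksums(first_file)    # sent from first device to second
second_sums = chunk_checksums(second_file)  # computed locally on second device
changed = differing_chunks(first_sums, second_sums)
patch = build_patch(first_file, changed)    # only these chunks cross the network
synced = apply_patch(second_file, patch, len(first_sums))
```

In this example only the second chunk differs, so one chunk is transferred instead of the whole file. Note that both `chunk_checksums` calls still read every byte of their input.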
Such a method still requires reading all portions of each file in order to generate the checksums. Generating the checksums can therefore be expensive when the file is large.