There are many scenarios in which data is mirrored, replicated, or synchronized. For example, different web servers may each serve an identical copy of a set of web pages. When the master copy of the set changes, the other copies need to be updated to match it. When a software package is revised, the latest revision may need to be propagated to a number of systems that distribute duplicate copies of the package. A news bulletin that changes frequently may need to be quickly updated on a number of clients, each of which may have a different outdated version of the bulletin. Storage devices may also be synchronized. A network router may need to update other routers with the latest routing table. Any system providing a source or master dataset will be referred to as a sender, and any system receiving difference or update information from a sender will be referred to as a receiver. A dataset can be any arbitrary type of data, such as a file, a file system directory, a set of one or more web pages, a BLOB, a data structure, etc.
In some cases, a receiver with a dataset that needs to be updated may send feedback to a sender indicating the differences between the receiver's dataset and the sender's dataset. The sender can then use that feedback to provide the receiver with individually tailored update information, which the receiver uses to update its version of the dataset to match the sender's master version. However, in some situations it may be impractical or impossible for a receiver to provide a sender with clues or feedback about the particular data that the receiver needs to update its copy of the dataset. For example, if the sender is a server on a data network such as the Internet, the sender may not be able to handle the overhead needed to form individual bi-directional connections with a large number of clients (receivers); one-way broadcasting may be the only means of propagating update information to clients. If a one-way communication medium such as broadcast radio is being used, feedback is not possible at all. Whether or not feedback is possible, and regardless of the application, there is a general need to minimize the amount of information that a receiver or client needs to receive in order to compare or update its version of a corresponding file, dataset, table, data store, etc. There is also a need to minimize the bandwidth used to update multiple receivers. Minimizing the amount of delta or update information can conserve network bandwidth, reduce the active listening time of a wireless device, conserve battery energy, and reduce the time that it takes to bring a receiver's version up to date.