Presently, the storage market and the wide area network (WAN) optimization market use various deduplication techniques to save on storage and bandwidth, respectively. In these techniques, the storage system or WAN optimization device receives a copy of the data once and compares subsequent data against that first copy to determine whether there are similarities. These techniques are effective when multiple users access and store the same or similar data in one central location.
For example, assume ten users each have a copy of a 1 megabyte (MB) PowerPoint presentation. Each of the ten users makes minor changes (0.01 MB) to the presentation and saves the updated presentation to the fileserver. Without deduplication, the fileserver would store ten copies of the original 1 MB presentation plus 0.1 MB of minor changes, or 10.1 MB in total. In contrast, with deduplication, the 1 MB presentation is recognized as having previously been received, so only one copy of the 1 MB presentation and the 0.1 MB of minor changes are stored, or 1.1 MB in total. Accordingly, storage space is optimized by not storing redundant data. Similarly, deduplication saves bandwidth because only the data that differs is transmitted. However, present deduplication techniques are not effective when the data is distributed across different locations with few users accessing the data at each location.
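As a rough illustration of the arithmetic in the example above, the following Python sketch stores ten edited copies of a presentation in a content-addressed chunk store. It is only one simple way to deduplicate (fixed-size chunks keyed by SHA-256 digest) and is not taken from any particular storage or WAN optimization product. The names `ChunkStore`, `pseudo_bytes`, and `CHUNK_SIZE` are hypothetical, 1 MB is taken to be 1,000,000 bytes, and each user's 0.01 MB of changes is assumed to be appended to the original file.

```python
import hashlib

CHUNK_SIZE = 10_000  # assumed chunk size: 0.01 MB, with 1 MB = 1,000,000 bytes


def pseudo_bytes(seed, n):
    """Return n deterministic, seed-dependent bytes standing in for file content."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(f"{seed}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:n])


class ChunkStore:
    """Hypothetical store that keeps each distinct chunk exactly once."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes

    def put(self, data):
        """Store data and return its recipe (ordered list of chunk digests)."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk that was already received is not stored a second time.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild a file from its recipe."""
        return b"".join(self.chunks[d] for d in recipe)

    def stored_bytes(self):
        """Total bytes physically held by the store."""
        return sum(len(c) for c in self.chunks.values())


# The shared 1 MB presentation and ten users' 0.01 MB changes (assumed appended).
original = pseudo_bytes("presentation", 1_000_000)
files = [original + pseudo_bytes(f"user-{u}", 10_000) for u in range(10)]

store = ChunkStore()
recipes = [store.put(f) for f in files]

raw_total = sum(len(f) for f in files)  # bytes stored without deduplication
dedup_total = store.stored_bytes()      # bytes stored with deduplication

print(f"without deduplication: {raw_total / 1_000_000} MB")   # 10.1 MB
print(f"with deduplication:    {dedup_total / 1_000_000} MB")  # 1.1 MB
```

Under these assumptions the sketch reproduces the figures in the example: 10.1 MB without deduplication and 1.1 MB with it. Fixed-size chunking is chosen only for brevity; it stops matching earlier chunks when bytes are inserted in the middle of a file, so the savings shown depend on the changes being appended.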
Similarly, when the data is distributed across multiple locations and shared by multiple people, it is difficult to ensure that only one person can write to the data at a given time and that all users can read the latest copy of the data. Microsoft DFS, for instance, suffers from these problems. In an attempt to overcome these problems, current solutions move all of the data to centralized locations. However, the main disadvantages of these solutions are slower access performance and an inability to deduplicate data that is accessed from multiple locations.