In many data systems, broadly viewed, a sender (a data source) uploads data to a receiver (a data processor) via a communications channel. An example of such a system is a data storage system; however, these data systems may include any system in which a receiver somehow processes data uploaded from a sender. The uploaded and processed data may include, but is not limited to, any type of textual, graphical, or image data, audio data (e.g., music and voice data), video data, compressed and/or encrypted data, and so on. In many such systems, large amounts of data may need to be uploaded from the sender to the receiver via the communications channel. However, communications channels generally have bandwidth constraints, while a goal of such data systems is to get as much usable data across the communications channel to the receiver as possible.
Data deduplication refers to techniques for reducing or eliminating redundant data in such systems, for example to improve storage utilization in a data storage system (commonly referred to simply as data deduplication) and/or to reduce bandwidth usage on the communications channel (referred to as network data deduplication, or simply network deduplication). As an example, in at least some data deduplication techniques applied to data storage systems, the storage of duplicate data to a data store may be prevented. To achieve this, units of data that already reside in the data store, and/or units of data that do not reside in the data store, may be identified, and only the units that do not reside in the data store are stored or updated in the data store. Data deduplication in this application may thus reduce required storage capacity, since fewer copies, or only one copy, of a particular unit of data are retained.
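As a non-limiting illustration of the storage case described above, the following minimal Python sketch stores only those units of data that do not already reside in a data store. Several details are assumptions made for the example and are not taken from the description: data is split into fixed-size units, each unit is identified by its SHA-256 digest, and the names `DedupStore`, `upload`, and `restore` are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy in-memory data store that keeps one copy of each distinct unit.

    Assumptions for illustration only: fixed-size units and SHA-256
    digests as unit identifiers.
    """

    def __init__(self, unit_size=4):
        self.unit_size = unit_size
        self.units = {}  # digest -> unit bytes (one copy per distinct unit)

    def upload(self, data):
        """Store data; return (manifest, number of newly stored units)."""
        manifest = []
        new_units = 0
        for start in range(0, len(data), self.unit_size):
            unit = data[start:start + self.unit_size]
            digest = hashlib.sha256(unit).hexdigest()
            # Only units that do not already reside in the store are stored.
            if digest not in self.units:
                self.units[digest] = unit
                new_units += 1
            manifest.append(digest)
        return manifest, new_units

    def restore(self, manifest):
        """Rebuild the original data from its ordered list of digests."""
        return b"".join(self.units[digest] for digest in manifest)


store = DedupStore(unit_size=4)
payload = b"AAAABBBBAAAACCCC"
manifest, stored = store.upload(payload)
restored = store.restore(manifest)
_, stored_again = store.upload(payload)
```

In this sketch the 16-byte payload contains four units but only three distinct ones, so `stored` is 3, and uploading the same payload a second time stores no new units. The manifest of digests is what allows the original data to be reconstructed even though duplicate units are retained only once.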
While embodiments are described herein by way of example for several embodiments and illustrative drawings, those skilled in the art will recognize that embodiments are not limited to the embodiments or drawings described. It should be understood that the drawings and detailed description thereto are not intended to limit embodiments to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope as defined by the appended claims. The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include”, “including”, and “includes” mean including, but not limited to.