Enterprise storage architecture has undergone a fundamental shift: centralized storage services have given way to distributed storage clusters. As businesses seek to increase storage efficiency, clusters built from commodity computers can deliver high performance, availability, and scalability for new data-intensive applications at a fraction of the cost of monolithic disk arrays. To unlock the full potential of storage clusters, data is replicated across multiple geographical locations, which increases availability and reduces the network distance to clients.
Data de-duplication identifies duplicate objects and removes them, reducing the storage space required. As a result, it is becoming increasingly important to the storage industry, driven by the needs of large-scale systems that can contain many duplicates.
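To make the idea concrete, the following is a minimal sketch of object-level de-duplication, assuming duplicates are detected by comparing SHA-256 digests of whole objects. It is an illustration only, not the scheme developed in this paper; the class `DedupStore` and its methods are hypothetical names introduced for the example.

```python
import hashlib


class DedupStore:
    """Toy in-memory store that keeps one copy of each distinct object (hypothetical)."""

    def __init__(self):
        self._blobs = {}  # content digest -> object bytes (stored once)
        self._names = {}  # object name -> content digest

    def put(self, name: str, data: bytes) -> bool:
        """Store data under name; return True only if new bytes were written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self._blobs
        if is_new:
            self._blobs[digest] = data
        # A duplicate costs only a name-to-digest reference, not a second copy.
        self._names[name] = digest
        return is_new

    def get(self, name: str) -> bytes:
        """Return the bytes previously stored under name."""
        return self._blobs[self._names[name]]

    def stored_bytes(self) -> int:
        """Total payload bytes actually held after de-duplication."""
        return sum(len(blob) for blob in self._blobs.values())


store = DedupStore()
store.put("a.txt", b"hello world")
store.put("b.txt", b"hello world")  # duplicate content, no new bytes stored
store.put("c.txt", b"other")
print(store.stored_bytes())  # 16 (11 + 5), instead of 27 without de-duplication
```

The sketch treats each object as a single unit and runs on one machine; it does not address how duplicates would be found across the nodes of a storage cluster.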