Data storage systems are arrangements of hardware and software that include storage processors coupled to arrays of non-volatile storage devices, such as magnetic disk drives, electronic flash drives, and/or optical drives. The storage processors service storage requests arriving from host machines (“hosts”), which specify files or other data elements to be written, read, created, deleted, and so forth. Software running on the storage processors manages incoming storage requests and performs various data processing tasks to organize and secure the data elements stored on the non-volatile storage devices.
Data storage systems commonly store data in blocks, where a “block” is a unit of storage allocation of typically uniform size, such as 8 KB. A data storage system may arrange blocks into larger structures, such as LUNs (Logical UNits), file systems, and the like.
Some data storage systems employ deduplication. In a typical arrangement, a program searches a storage system for data blocks having identical values and then replaces the duplicate blocks with pointers to a single retained copy. Deduplication can save considerable storage space in systems that would otherwise store multiple copies of the same data. Consider, for example, an email server at a company where all employees receive the same message and attachments. Deduplication enables a data storage system to store such messages and attachments once, with each additional recipient's copy consuming only the space needed for pointers to the retained blocks.
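The deduplication described above can be illustrated with a minimal Python sketch. It is not taken from any particular storage system: the function names `split_into_blocks` and `deduplicate`, the in-memory dictionary keyed on block contents, the zero-padding of the final block, and the use of list indices as "pointers" are all assumptions made for illustration. The 8 KB block size is the example size mentioned above.

```python
from typing import Dict, List, Tuple

BLOCK_SIZE = 8 * 1024  # 8 KB, the example block size used in the text


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Split data into uniform blocks, zero-padding the final block (assumption)."""
    blocks = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        blocks.append(block.ljust(block_size, b"\0"))
    return blocks


def deduplicate(blocks: List[bytes]) -> Tuple[List[bytes], List[int]]:
    """Keep one copy of each distinct block and replace the rest with pointers.

    Returns (unique_blocks, pointers), where pointers[i] is the index in
    unique_blocks of the single retained copy of blocks[i].
    """
    unique_blocks: List[bytes] = []
    index_by_content: Dict[bytes, int] = {}  # hypothetical in-memory index
    pointers: List[int] = []
    for block in blocks:
        idx = index_by_content.get(block)
        if idx is None:
            # First time this value is seen: retain it as the single copy.
            idx = len(unique_blocks)
            unique_blocks.append(block)
            index_by_content[block] = idx
        # Duplicates (and the first copy) are represented only by a pointer.
        pointers.append(idx)
    return unique_blocks, pointers
```

In the email scenario, passing the blocks of the same message for every recipient to `deduplicate` would leave `unique_blocks` no larger than it is for a single recipient, while `pointers` grows by one small entry per block per recipient. The original sequence can be rebuilt as `[unique_blocks[p] for p in pointers]`.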