As the value and use of information continue to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes, thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
Data de-duplication is a process by which a footprint of data on a storage system may be reduced by identifying and eliminating redundant copies of similar data within storage resources of the storage system. Traditionally, in order to identify duplicate data, items of data (e.g., files, portions of files, etc.) are fingerprinted (e.g., by applying a hash function, cryptographic function, or other function), and such fingerprints are stored in a structure, sometimes referred to as a dictionary, that allows for quick lookup, and for insertion in the event an item of data has not been encountered before. When duplicate data is identified, redundant copies may be eliminated, and other structures are updated to ensure that consistency of the data is maintained through additions and deletions. A monolithic dictionary is often suitable in the case of a storage system that does not provide scalability, but may cause problems with performance and scalability in clustered scale-out storage systems and other storage systems.
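The traditional fingerprint-and-dictionary approach described above can be illustrated with a minimal sketch. It is not drawn from any particular storage system: the class name `DedupStore`, its methods, the use of SHA-256 as the fingerprint function, and the reference count used to keep data consistent through additions and deletions are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Hypothetical single-node store backed by one monolithic dictionary."""

    def __init__(self):
        # The "dictionary": fingerprint -> [item data, reference count].
        self.dictionary = {}

    def put(self, item: bytes) -> str:
        """Store an item and return its fingerprint, keeping one copy per fingerprint."""
        # Assumption: SHA-256 stands in for the fingerprint function.
        fingerprint = hashlib.sha256(item).hexdigest()
        entry = self.dictionary.get(fingerprint)
        if entry is None:
            # Item not encountered before: insert it into the dictionary.
            self.dictionary[fingerprint] = [item, 1]
        else:
            # Duplicate: do not store a second copy, only count the reference.
            entry[1] += 1
        return fingerprint

    def get(self, fingerprint: str) -> bytes:
        """Return the single stored copy for a fingerprint."""
        return self.dictionary[fingerprint][0]

    def delete(self, fingerprint: str) -> None:
        """Drop one reference; free the data only when no references remain."""
        entry = self.dictionary[fingerprint]
        entry[1] -= 1
        if entry[1] == 0:
            del self.dictionary[fingerprint]

    def stored_bytes(self) -> int:
        """Footprint of the unique data actually held."""
        return sum(len(data) for data, _ in self.dictionary.values())


if __name__ == "__main__":
    store = DedupStore()
    first = store.put(b"hello")
    second = store.put(b"hello")   # duplicate of the first item
    store.put(b"world")
    print(first == second, store.stored_bytes())
```

In this sketch every lookup and insertion goes through the one `dictionary` object, which is the monolithic arrangement the passage describes; in a clustered scale-out system every node would have to consult that single structure.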