The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.
As the value and use of information continue to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes, thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use, such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
Present information handling systems often take advantage of various data storage technologies, such as a redundant array of independent disks (RAID), which is a storage technology combining multiple disk or other drives into a logical storage unit. The use of RAID technology can improve data redundancy and performance. Data may be distributed across the drives in several ways, referred to as RAID levels. The RAID level utilized may depend on the specific level of redundancy and performance required. Each level provides a different balance between reliability, availability, performance, and capacity of the information handling system.
An increasing problem with such information handling systems, and particularly those employing more complex storage technologies, is the wasted storage space taken up by duplicate data. Accordingly, procedures for data deduplication (also referred to herein simply as “deduplication”) have become increasingly desirable and/or important. Data deduplication is a technique where files, or other units of stored data, with identical contents are first identified, and then only one copy of the identical contents, the single-instance copy, is kept in the physical storage while the storage space for the remaining identical content can be reclaimed and reused. Thus, deduplication achieves what is called single-instance storage, where only the single-instance copy is stored in the physical storage, along with one or more references to the unique single-instance copy, resulting in more efficient use of the physical storage space.
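By way of illustration only, the single-instance storage described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular implementation: the class name `SingleInstanceStore`, its methods, and the use of a SHA-256 digest to identify identical contents are all hypothetical choices made for the example. A practical system might additionally compare contents byte for byte when two digests match.

```python
import hashlib


class SingleInstanceStore:
    """Illustrative (hypothetical) store that keeps one copy of identical contents."""

    def __init__(self):
        self.blocks = {}     # fingerprint -> the single-instance copy
        self.refcount = {}   # fingerprint -> number of references to that copy
        self.names = {}      # logical name -> fingerprint (the reference)

    def put(self, name, data):
        """Store data under name, reusing an existing copy when contents match."""
        if name in self.names:
            self.delete(name)  # overwriting: release the old reference first
        # Assumption: equal SHA-256 digests are treated as identical contents.
        fingerprint = hashlib.sha256(data).hexdigest()
        if fingerprint not in self.blocks:
            self.blocks[fingerprint] = data
            self.refcount[fingerprint] = 0
        self.refcount[fingerprint] += 1
        self.names[name] = fingerprint
        return fingerprint

    def get(self, name):
        """Follow the reference back to the single-instance copy."""
        return self.blocks[self.names[name]]

    def delete(self, name):
        """Drop a reference; reclaim the space once no references remain."""
        fingerprint = self.names.pop(name)
        self.refcount[fingerprint] -= 1
        if self.refcount[fingerprint] == 0:
            del self.blocks[fingerprint]
            del self.refcount[fingerprint]

    def physical_bytes(self):
        """Bytes actually held in (simulated) physical storage."""
        return sum(len(block) for block in self.blocks.values())


store = SingleInstanceStore()
store.put("a.txt", b"hello")
store.put("b.txt", b"hello")  # duplicate contents: only a reference is added
store.put("c.txt", b"world")
```

In this sketch, three logical files are stored but only two copies occupy physical storage, and deleting "a.txt" leaves "b.txt" readable because the shared copy is reclaimed only when its last reference is removed.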
As may be appreciated, therefore, deduplication may reduce the required storage capacity, since less duplicate data is stored. Moreover, deduplication can lead to a “domino effect” of efficiency, reducing, for example, capital, administrative, and facility costs, as well as energy use, cooling needs, and the overall carbon footprint of the system. Also, less hardware may need to be purchased, recycled, and/or replaced, further lowering costs.
On the other hand, deduplication is conventionally a random access memory (RAM)-limited feature and requires CPU time that could otherwise be used for other processing tasks, such as input/output operations. Inefficient deduplication procedures could therefore, for example, decrease the input/output operations per second (IOPS). Accordingly, there remains a need for further improvement, and incorporation of additional efficiencies, to deduplication procedures for an information handling system.
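As a rough illustration of the RAM limitation, the following sketch estimates the memory needed to hold a lookup index of content fingerprints entirely in RAM. Everything in it is an assumption made for the example and is not taken from the disclosure: that such an index is the dominant memory consumer, the hypothetical function name `index_ram_bytes`, the 10 TiB capacity, the 4 KiB block size, and the 40-byte index entry (a 32-byte digest plus an 8-byte location). The result is an upper bound that assumes every block is unique.

```python
def index_ram_bytes(capacity_bytes, block_size, entry_bytes):
    """Upper-bound size of an in-memory fingerprint index (hypothetical model).

    Assumes one index entry per block and that every block is unique.
    """
    num_blocks = -(-capacity_bytes // block_size)  # ceiling division
    return num_blocks * entry_bytes


KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4

# Hypothetical figures: 10 TiB of storage, 4 KiB blocks, 40-byte index entries.
estimate = index_ram_bytes(10 * TIB, 4 * KIB, 40)
print(estimate / GIB)  # 100.0 (GiB)
```

Under these assumed figures the index alone would approach 100 GiB, which suggests why the size of available RAM can bound how much data a conventional deduplication procedure can cover.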