1. Technical Field
The present invention relates to systems and methods for deduplicating data in electronic systems.
2. Related Art
Disks provide an easy, fast, and convenient way to back up datacenters. As additional backups are made, including full, incremental, and differential backups, additional disks and disk space are required. However, disks add costs to any backup solution, including the costs of the disks themselves, costs associated with powering and cooling the disks, and costs associated with physically storing the disks in the datacenter.
Thus, it becomes desirable to maximize the usage of disk storage available on each disk. One method of maximizing storage on a disk is to use some form of data compression. Software-based compression can be slow and processor-intensive; therefore, hardware-accelerated compression came to be used. However, data compression achieves a nominal compression ratio of only about 2:1, which merely slows the need to add additional disk storage.
Data deduplication provides another method of capacity optimization, one that can reduce the storage capacity required for a given amount of data by storing only a single copy of data that occurs more than once. This in turn can reduce acquisition, power, heating, and cooling costs. Additionally, management costs can be reduced by reducing the number of physical disks required for data backup.
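By way of illustration only, the general idea of deduplication can be sketched as follows. The sketch assumes fixed-size chunks identified by a SHA-256 digest; the chunk size, the class name DedupStore, and its methods are hypothetical and are not taken from the present disclosure.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes (an assumption)


class DedupStore:
    """Illustrative chunk store that keeps one copy of each distinct chunk."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes

    def write(self, data):
        """Store data and return a recipe (list of digests) to rebuild it."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk already present is not stored a second time.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its recipe."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def stored_bytes(self):
        """Physical bytes held, counting each distinct chunk once."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupStore()
backup = b"A" * CHUNK_SIZE + b"B" * CHUNK_SIZE
first = store.write(backup)
second = store.write(backup)  # a repeated backup adds no new chunks
```

In this sketch, writing the same backup twice leaves the physical storage unchanged, while each recipe still allows the full backup to be reconstructed.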
Data deduplication can be performed in-line or in post-processing. In-line data deduplication is performed in real time, as the data is being written. Post-processing deduplication occurs after data has been written to a non-deduplicating disk but before the data is committed to a permanent medium. Post-processing requires the full backup to be stored temporarily, so storage sized for the full backup must still be provided, which defeats much of the storage benefit of deduplication.
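The difference in temporary storage between the two approaches can be illustrated with a short calculation. The function names and the figures used below are hypothetical and serve only to make the comparison concrete.

```python
def inline_peak(stored, unique):
    """Peak space when data is deduplicated as it is written.

    Only the new unique data from the incoming backup reaches disk.
    """
    return stored + unique


def post_process_peak(stored, logical):
    """Peak space when the full backup is staged before deduplication.

    The peak occurs while the whole backup sits on the staging disk,
    before the deduplication pass has reduced it.
    """
    return stored + logical


# Hypothetical figures, in gigabytes: 100 already stored, a 50 GB backup
# of which only 5 GB is data not seen before.
stored_gb, logical_gb, unique_gb = 100, 50, 5
peak_inline = inline_peak(stored_gb, unique_gb)
peak_post = post_process_peak(stored_gb, logical_gb)
```

Under these assumed figures the in-line approach peaks at 105 GB, whereas the post-processing approach peaks at 150 GB across the staging and deduplicated storage, even though both settle at the same size once deduplication completes.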