According to one estimate, the size of the digital universe in 2007 was 281 billion gigabytes. The same estimate notes that the digital universe had a compound annual growth rate of almost 60 percent. With so much information being generated, the need to back up information efficiently is increasing.
A traditional approach to improving a backup system's performance is data compression. Data compression is the process of encoding raw data such that the encoded data consumes less storage capacity than the raw data.
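As a minimal illustration of this definition, the sketch below encodes a block of raw data and compares the sizes. It assumes Python's standard zlib module as the encoder, and the helper names compress_block and restore_block are hypothetical. The text above does not name any particular compression algorithm.

```python
import zlib


def compress_block(raw: bytes) -> bytes:
    """Encode raw data with DEFLATE (an assumed choice of algorithm)."""
    return zlib.compress(raw, 6)


def restore_block(encoded: bytes) -> bytes:
    """Decode the data back to its original raw form."""
    return zlib.decompress(encoded)


if __name__ == "__main__":
    # Highly repetitive sample data, chosen so the size reduction is obvious.
    raw = b"backup block " * 1000
    encoded = compress_block(raw)
    print(len(raw), "raw bytes ->", len(encoded), "encoded bytes")
    assert restore_block(encoded) == raw
```

The reduction depends on the data: repetitive input such as the sample above shrinks substantially, while data that is already compressed or random may not shrink at all.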
Another approach to improving a backup system's performance is deduplication. Deduplication removes redundancy that is commonly found in stored data. One example of such redundancy is multiple copies of the same file in a storage device. By storing only a single instance of the file and using pointers to reference that single instance, deduplication helps to reduce the amount of storage capacity consumed by data.
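The single-instance idea can be sketched as follows. This is an illustrative file-level example only, not a method described in the text: the class DedupStore and its methods are hypothetical, and the use of a SHA-256 digest to decide that two files are identical is an assumption.

```python
import hashlib


class DedupStore:
    """Toy file-level store: one stored instance per distinct content."""

    def __init__(self) -> None:
        self.blobs = {}     # digest -> the single stored instance
        self.pointers = {}  # file name -> digest (the "pointer")

    def put(self, name: str, content: bytes) -> None:
        # Assumption: equal SHA-256 digests are treated as equal content.
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # store only if unseen
        self.pointers[name] = digest

    def get(self, name: str) -> bytes:
        # Follow the pointer to the single stored instance.
        return self.blobs[self.pointers[name]]

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.blobs.values())


if __name__ == "__main__":
    store = DedupStore()
    store.put("report.doc", b"x" * 100)
    store.put("report_copy.doc", b"x" * 100)  # duplicate content
    store.put("notes.txt", b"y" * 50)
    print(len(store.pointers), "files,", store.stored_bytes(), "bytes stored")
```

In this sketch, three files totaling 250 bytes occupy 150 bytes of storage, because the duplicate is recorded as a pointer rather than a second copy.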
The above approaches, among others, are useful in reducing the consumption of resources, such as hard disk space or network bandwidth. However, typical backup applications are not equipped to handle compressed data. For a backup application to interpret compressed data, the data must first be decompressed. The resulting decompressed data may consume significantly more network bandwidth and storage capacity than the compressed data did. There is a need, therefore, for an improved method, article of manufacture, and apparatus for backing up information.