Field of the Invention
This invention relates to the field of computer storage systems, and more particularly, to a deduplication storage system with efficient reference updating and space reclamation.
Description of the Related Art
The amount of data used by computer systems is growing at an accelerating rate. As a result, it is necessary to find ways to reduce the amount of storage space required to store the data. One way to do this is through deduplication. Many files, or portions of files, are duplicate copies of each other. Instead of storing multiple copies of the same data segment, a deduplication storage system can store a single copy of the data segment and maintain metadata specifying which files use it. Thus, a single instance of a given data segment can be referenced by multiple files.
Eventually, some of the data segments may no longer be needed, e.g., because all the files that use those data segments have been deleted from the storage system. When this happens, it is desirable to reclaim the storage space taken by those data segments, e.g., so that the space can be re-allocated for new data segments added to the system. Thus, the deduplication storage system may need to maintain reference information to keep track of which data segments are used by which files. For large systems that store many data segments, it can be difficult both to maintain the reference information efficiently and to reclaim the storage space efficiently when segments are no longer needed.
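For illustration only, the relationship between single-instance segments, per-segment reference information, and space reclamation can be sketched as follows. This is a minimal in-memory sketch and not the system described herein: the class name `DedupStore`, its methods, the fixed 4-byte segment size, and the use of SHA-256 fingerprints to identify duplicate segments are all assumptions made for the example.

```python
import hashlib


class DedupStore:
    """Toy in-memory deduplication store (hypothetical, illustrative only)."""

    def __init__(self):
        self.segments = {}  # fingerprint -> segment bytes, stored once
        self.refs = {}      # fingerprint -> set of file names using the segment
        self.files = {}     # file name -> ordered list of fingerprints

    def add_file(self, name, data, segment_size=4):
        """Split data into fixed-size segments and store each unique one once."""
        if name in self.files:
            self.delete_file(name)  # replace any earlier version of the file
        fingerprints = []
        for i in range(0, len(data), segment_size):
            segment = data[i:i + segment_size]
            fp = hashlib.sha256(segment).hexdigest()
            self.segments.setdefault(fp, segment)      # keep a single copy
            self.refs.setdefault(fp, set()).add(name)  # record the reference
            fingerprints.append(fp)
        self.files[name] = fingerprints

    def read_file(self, name):
        """Reassemble a file from its referenced segments."""
        return b"".join(self.segments[fp] for fp in self.files[name])

    def delete_file(self, name):
        """Remove a file; reclaim segments that no other file references.

        Returns the number of segment bytes reclaimed.
        """
        reclaimed = 0
        for fp in set(self.files.pop(name)):
            users = self.refs[fp]
            users.discard(name)
            if not users:  # last reference is gone, so the space can be reused
                reclaimed += len(self.segments.pop(fp))
                del self.refs[fp]
        return reclaimed


store = DedupStore()
store.add_file("a.txt", b"AAAABBBB")
store.add_file("b.txt", b"AAAACCCC")  # shares the "AAAA" segment with a.txt
freed = store.delete_file("a.txt")    # only the "BBBB" segment is reclaimed
```

In this sketch every file deletion touches the reference set of each segment the file used. That per-segment bookkeeping is cheap for a toy store but is the kind of cost that becomes difficult to manage in a large system holding many segments.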