Deduplication storage systems are generally used to reduce the amount of storage space required to store files by identifying redundant data patterns within similar files. For example, a deduplication storage system may divide multiple files into file segments and then identify at least one file segment obtained from one file that is identical to at least one file segment obtained from another file. Rather than storing multiple instances of a particular file segment, the deduplication storage system may store a single instance of the file segment and allow multiple files to simply reference that instance of the file segment to reduce the amount of storage space required to store the files. As such, deduplication storage systems typically only store file segments that are unique (i.e., non-redundant).
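By way of illustration only, the single-instance storage described above may be sketched as follows. The sketch assumes fixed-size segmentation and SHA-256 fingerprints, neither of which is specified in this disclosure, and the names `DedupStore`, `put`, and `get` are hypothetical.

```python
import hashlib


class DedupStore:
    """Illustrative single-instance segment store (hypothetical, not a disclosed design)."""

    def __init__(self, segment_size=4):
        # A tiny segment size is assumed here purely to keep the example readable.
        self.segment_size = segment_size
        self.segments = {}  # fingerprint -> segment bytes (one instance per unique segment)
        self.files = {}     # file name -> ordered list of segment fingerprints

    def put(self, name, data):
        """Divide data into segments and store only segments not already present."""
        recipe = []
        for offset in range(0, len(data), self.segment_size):
            segment = data[offset:offset + self.segment_size]
            fingerprint = hashlib.sha256(segment).hexdigest()
            # setdefault keeps the existing instance if the segment is already stored.
            self.segments.setdefault(fingerprint, segment)
            recipe.append(fingerprint)
        self.files[name] = recipe

    def get(self, name):
        """Reassemble a file from the segment instances it references."""
        return b"".join(self.segments[fp] for fp in self.files[name])


store = DedupStore(segment_size=4)
store.put("file_a", b"AAAABBBBCCCC")
store.put("file_b", b"AAAABBBBDDDD")
# Six segments were referenced in total, but only four unique segments are stored.
unique_segment_count = len(store.segments)
```

In this sketch, both files reference the same stored instances of the segments `AAAA` and `BBBB`, so each of those segments occupies storage space only once.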
In order to prevent stored file segments from being prematurely or erroneously removed, a deduplication storage system may maintain multiple reference objects (such as reference lists and/or reference counts) that each indicate whether one or more backed-up files currently reference a particular file segment. If a reference object indicates that no files are currently referencing a particular file segment, the deduplication storage system may remove that file segment and reclaim the storage space occupied by the same.
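By way of illustration only, the use of reference counts to guard against premature removal may be sketched as follows. The sketch assumes one in-memory count per segment fingerprint, fixed-size segmentation, and SHA-256 fingerprints; these choices, and the names `RefCountedStore`, `put`, and `delete`, are hypothetical and are not taken from this disclosure.

```python
import hashlib


class RefCountedStore:
    """Illustrative segment store with per-segment reference counts (hypothetical)."""

    def __init__(self, segment_size=4):
        self.segment_size = segment_size
        self.segments = {}    # fingerprint -> segment bytes
        self.refcounts = {}   # fingerprint -> number of references held by stored files
        self.files = {}       # file name -> ordered list of segment fingerprints

    def put(self, name, data):
        """Store a file, incrementing the count of every segment it references."""
        if name in self.files:
            # Release the references held by the previous version of this file.
            self.delete(name)
        recipe = []
        for offset in range(0, len(data), self.segment_size):
            segment = data[offset:offset + self.segment_size]
            fingerprint = hashlib.sha256(segment).hexdigest()
            self.segments.setdefault(fingerprint, segment)
            self.refcounts[fingerprint] = self.refcounts.get(fingerprint, 0) + 1
            recipe.append(fingerprint)
        self.files[name] = recipe

    def delete(self, name):
        """Remove a file and reclaim any segment that no file references anymore."""
        for fingerprint in self.files.pop(name):
            self.refcounts[fingerprint] -= 1
            if self.refcounts[fingerprint] == 0:
                # No remaining references: the segment may be removed safely.
                del self.refcounts[fingerprint]
                del self.segments[fingerprint]


store = RefCountedStore(segment_size=4)
store.put("file_a", b"AAAABBBB")
store.put("file_b", b"AAAACCCC")
store.delete("file_a")  # BBBB is reclaimed; AAAA survives because file_b references it
```

In this sketch, a segment is removed only when its count reaches zero, so deleting one file never removes a segment that another stored file still references.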
Unfortunately, such reference objects are typically stored within a single database that may, over time, become very large and cumbersome. Moreover, in order to update a reference object to account for the files that are currently referencing a particular file segment, a traditional deduplication storage system may need to perform an update operation on the entire database, potentially resulting in unwanted processing delays and excessive consumption of limited computing resources. As such, the instant disclosure identifies a need for systems and methods for providing increased scalability in deduplication storage systems.