In modern computer systems, a file system stores and organizes computer files to enable a user to efficiently locate and access requested files. A file system can use a storage device such as a hard disk drive to provide local access, or it can provide access to data stored on a remote file server. A file system can also be characterized as a set of abstract data types that are implemented for the storage, hierarchical organization, manipulation, navigation, access, and retrieval of data. The file system software is responsible for organizing files and directories.
Many companies and individuals with large amounts of stored data employ a backup file system. These backup file systems can be located locally, near the data to be backed up, or at a remote site. The backup file systems can be managed by the entity controlling the primary data storage devices or by a data storage service company. Data can be added to the storage system at any frequency and in any amount.
A data storage system can implement data deduplication techniques to improve data compression in a backup file system. Data deduplication is an approach to data compression that reduces the amount of duplicate data maintained within a file system. To realize this data compression, unique sections of data (e.g., byte patterns or bit patterns) are identified before being stored in the file system so that only the unique data sections are stored. A duplicate data section can be replaced with a pointer to the existing unique data section so that the duplicate data section is not stored in the file system. Accordingly, the volume of data stored or processed in a file system can be reduced.
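
The deduplication approach described above can be illustrated with a minimal sketch. It is not taken from any particular storage system: it assumes fixed-size data sections, SHA-256 fingerprints to identify unique sections, and an in-memory dictionary standing in for the file system. The class name DedupStore and its methods are hypothetical and chosen only for illustration.

```python
import hashlib


class DedupStore:
    """Toy in-memory store that keeps each unique data section once.

    Assumptions for illustration only: fixed-size sections and SHA-256
    fingerprints. Real systems may use variable-size sections and
    on-disk indexes.
    """

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> unique data section
        self.files = {}   # file name -> ordered list of fingerprints (pointers)

    def write(self, name, data):
        pointers = []
        for start in range(0, len(data), self.chunk_size):
            section = data[start:start + self.chunk_size]
            fingerprint = hashlib.sha256(section).hexdigest()
            # Store the section only if it has not been seen before;
            # a duplicate section is represented by its pointer alone.
            self.chunks.setdefault(fingerprint, section)
            pointers.append(fingerprint)
        self.files[name] = pointers

    def read(self, name):
        # Rebuild the file by following its pointers in order.
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def stored_bytes(self):
        # Total size of the unique sections actually kept.
        return sum(len(section) for section in self.chunks.values())


store = DedupStore(chunk_size=4)
store.write("a.bin", b"ABCDABCDEFGH")  # sections: ABCD, ABCD, EFGH
store.write("b.bin", b"EFGHABCD")      # sections: EFGH, ABCD
print(store.stored_bytes())            # 8 bytes kept for 20 bytes written
```

In this sketch, the two files together contain 20 bytes, but only the two unique sections (8 bytes) are kept; every repeated section is recorded as a pointer to the section already stored, and reading a file follows its pointers to reconstruct the original data.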