In modern computer systems, a file system stores and organizes computer files to enable a user to efficiently locate and access requested files. File systems can utilize a storage device such as a hard disk drive to provide local access or provide access to data stored on a remote file server. A file system can also be characterized as a set of abstract data types that are implemented for the storage, hierarchical organization, manipulation, navigation, access, and retrieval of data. The file system software is responsible for organizing files and directories.
Many companies and individuals with large amounts of stored data employ a file system as a data storage system. These data storage systems can be located local to the data to be backed up or at a remote site. They can be managed by the entity controlling the primary data storage devices or by a data storage service company. Data can be added to the storage system at any frequency and in any amount.
Data in a data storage system can be arranged hierarchically, which is particularly necessary when the amount of data exceeds the available main memory. In that case, auxiliary memory can be employed to accommodate large amounts of data. Auxiliary memory is not directly accessible by a computer's central processing unit (CPU), but its contents can be read into main memory in portions so that the data can be manipulated. Auxiliary memory can extend to storage that must be mounted (either automatically or manually) before it can be read into main memory.
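As an illustration of reading auxiliary storage into main memory in portions, the following minimal Python sketch iterates over a file in fixed-size pieces. The function name `read_in_portions` and the 1 MiB default portion size are assumptions made for this example and are not taken from any particular system.

```python
from typing import Iterator


def read_in_portions(path: str, portion_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield the contents of the file at `path` one portion at a time.

    Only one portion (at most `portion_size` bytes) is held in main
    memory at once, so the file may be larger than available memory.
    """
    with open(path, "rb") as f:
        while True:
            portion = f.read(portion_size)
            if not portion:  # an empty read signals end of file
                break
            yield portion
```

A caller can process each portion as it arrives, for example `for portion in read_in_portions("backup.img"): ...`, without loading the whole file.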
Data is represented in a data storage system by a series of bits. This bit representation is frequently expensive in terms of disk space and transmission bandwidth, so it is beneficial to encode the data using fewer bits than the original representation would use. One data compression scheme is delta encoding, which stores some portion of data as the relative difference to another portion of data. Delta encoding can be implemented in several different ways, but a typical issue is how to select which portion of data should be delta encoded and relative to which other portion it should be stored. Delta encoding also results in a partitioned data file that must be reassembled as a whole when accessed, including reconstruction of those portions stored as the relative difference to another portion. To enable this process, a file has a recipe for reconstruction, which typically consists of a list of fingerprints and related information corresponding to unique data chunks (i.e., fractional components of the data as a whole) stored in the data storage system.
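The relationship between chunks, fingerprints, deltas, and the recipe can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a description of any particular storage system. Fixed-size chunking, SHA-256 fingerprints, the copy/insert delta format, and the policy of delta encoding each new chunk against the first chunk of the file are all choices made for this example. The base-selection policy in particular is a naive placeholder for the selection problem described above.

```python
import hashlib
from difflib import SequenceMatcher

CHUNK_SIZE = 16  # assumption: tiny fixed-size chunks, for illustration only


def fingerprint(chunk: bytes) -> str:
    """Identify a chunk by its SHA-256 digest (an assumed fingerprint)."""
    return hashlib.sha256(chunk).hexdigest()


def make_delta(base: bytes, target: bytes) -> list:
    """Encode `target` as copy/insert operations relative to `base`."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))           # reuse bytes from base
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))  # literal new bytes
        # "delete" needs no operation: those base bytes are simply skipped
    return ops


def apply_delta(base: bytes, ops: list) -> bytes:
    """Rebuild the target chunk from `base` and its delta operations."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += base[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def load_chunk(fp: str, store: dict) -> bytes:
    """Return a chunk's bytes, resolving a delta against its base if needed."""
    entry = store[fp]
    if entry[0] == "raw":
        return entry[1]
    _, base_fp, ops = entry
    return apply_delta(load_chunk(base_fp, store), ops)


def store_file(data: bytes, store: dict) -> list:
    """Chunk `data`, store each unique chunk once, and return the recipe."""
    recipe = []
    base_fp = None  # naive policy: the file's first chunk is the delta base
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fp = fingerprint(chunk)
        if fp not in store:
            if base_fp is None:
                store[fp] = ("raw", chunk)
            else:
                base = load_chunk(base_fp, store)
                store[fp] = ("delta", base_fp, make_delta(base, chunk))
        if base_fp is None:
            base_fp = fp
        recipe.append(fp)
    return recipe


def reconstruct(recipe: list, store: dict) -> bytes:
    """Reassemble a file by following its recipe of fingerprints."""
    return b"".join(load_chunk(fp, store) for fp in recipe)
```

Here the recipe is simply an ordered list of fingerprints. Reading a file back means looking up each fingerprint, reconstructing any delta-encoded chunk from its base, and concatenating the results.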