Data storage utilization is continually increasing, causing the proliferation of storage systems in data centers. To reduce the storage space consumed by a storage system, deduplication techniques are utilized, in which data objects or files are segmented into chunks and only the unique chunks are stored in the storage system.
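The chunk-level deduplication described above can be illustrated with a minimal sketch. It assumes fixed-size chunks and SHA-256 fingerprints, neither of which is specified in the text. The names `deduplicate`, `restore`, `store`, and `recipe` are hypothetical and chosen only for illustration.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4):
    """Split data into fixed-size chunks and keep one copy of each unique chunk.

    Fixed-size chunking and SHA-256 are assumptions made for this sketch.
    """
    store = {}   # fingerprint -> chunk bytes; each unique chunk is kept once
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)  # store only if not already seen
        recipe.append(fingerprint)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


store, recipe = deduplicate(b"ABCDABCDABCDWXYZ")
assert restore(store, recipe) == b"ABCDABCDABCDWXYZ"
```

In this example the input contains four chunks but only two distinct ones, so the store holds two chunks while the recipe records all four positions.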
Current techniques and systems for storing data do not allow for the efficient identification and analysis of files and other objects in a data stream associated with a stored file system. In particular, current file storage formats do not facilitate the efficient insertion of markers into data streams to assist processing based on deduplication heuristics.
Additionally, as changes are made to elements of a data stream, the locations of files and objects within the data stream change over time. Accordingly, accessing a particular object in the data stream, whether for a read or a write operation, requires traversing the data stream, in the worst case in its entirety, to locate the desired object. The storage system is therefore managed inefficiently, because read and write operations depend on an exhaustive traversal of the data stream.
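The cost of such a traversal can be seen in a small sketch. The stream layout used here (a 2-byte name length, the name, a 4-byte payload length, then the payload) is purely an assumption for illustration and is not a format described in the text. The helpers `pack_stream` and `find_object` are likewise hypothetical.

```python
import struct


def pack_stream(objects):
    """Concatenate (name, payload) pairs into one stream (hypothetical layout)."""
    out = bytearray()
    for name, payload in objects:
        encoded = name.encode("utf-8")
        out += struct.pack(">H", len(encoded)) + encoded  # name length + name
        out += struct.pack(">I", len(payload)) + payload  # payload length + payload
    return bytes(out)


def find_object(stream: bytes, wanted: str):
    """Scan records from the start of the stream until the wanted name is found.

    Returns (payload, records_visited); payload is None if the name is absent.
    """
    offset = 0
    visited = 0
    while offset < len(stream):
        (name_len,) = struct.unpack_from(">H", stream, offset)
        offset += 2
        name = stream[offset:offset + name_len].decode("utf-8")
        offset += name_len
        (payload_len,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        visited += 1
        if name == wanted:
            return stream[offset:offset + payload_len], visited
        offset += payload_len  # skip this payload and continue scanning
    return None, visited


stream = pack_stream([("a.txt", b"one"), ("b.txt", b"two"), ("c.txt", b"three")])
payload, visited = find_object(stream, "c.txt")
```

Because the position of an object is not known in advance, every preceding record header must be read before the last object is reached, and a lookup for an absent object visits every record.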