Version chains are linear (serial) data structures that can hold successive versions of the same or similar data over time. For example, a file that is created and then modified four times can be represented as a linear reverse delta version chain, in which the most recent version is stored in its whole form and each earlier version is stored as a difference (delta) file relative to its neighbor, with the deltas connected linearly into a version chain. While a linear arrangement of delta versions can be one of the simplest data structures for version chains, some processing operations on the delta files that make up a version chain can make the linear arrangement inefficient and cumbersome. For example, when an end user requests restoration of an early version of a file from version chains that contain backup data, conventional methods of restoring one or more files can be slow, serial processes whose processing time can be directly proportional to the “distance” between the version to be restored and the base file of the version chain (i.e., the least or most recent version). The farther the version is from the base file, the longer it takes to restore, because each intervening pair of files must be delta-decompressed in turn, working backwards from the most recent version. For chains that have thousands of delta version files, restore times can easily extend to many minutes or hours. Such restoration can strain both computing and disk input/output (“I/O”) resources. It can also frustrate the user who requested the restoration, as the user may need the file restored very quickly in order to make a timely decision or meet an important deadline.
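The restore cost described above can be illustrated with a minimal sketch. It is not taken from any particular backup product: the `ReverseDeltaChain` class, the `make_delta` and `apply_delta` helpers, and the copy/insert delta format are all hypothetical, and Python's standard `difflib` stands in for a real delta compressor. The sketch only shows that the number of deltas applied grows with the distance from the most recent version.

```python
from difflib import SequenceMatcher


def make_delta(source, target):
    """Encode `target` as copy/insert instructions against `source` (toy format)."""
    ops = []
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))           # reuse source[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))  # literal text not in source
        # "delete" emits nothing: that part of source is simply skipped
    return ops


def apply_delta(source, ops):
    """Rebuild the target text from `source` and a list of instructions."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(source[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


class ReverseDeltaChain:
    """Hypothetical linear reverse delta chain: only the newest version is whole."""

    def __init__(self, first_version):
        self.head = first_version  # most recent version, stored whole
        self.deltas = []           # deltas[k] rebuilds version k from version k+1

    def append(self, new_version):
        # The old head is demoted to a delta against the new head.
        self.deltas.append(make_delta(new_version, self.head))
        self.head = new_version

    def restore(self, index):
        """Return (content, deltas_applied) for version `index` (0 = oldest)."""
        data = self.head
        steps = 0
        # Walk backwards from the head, one delta at a time (serial process).
        for k in range(len(self.deltas) - 1, index - 1, -1):
            data = apply_delta(data, self.deltas[k])
            steps += 1
        return data, steps


# A file created once and then modified four times (five versions in total).
versions = [
    "alpha\n",
    "alpha\nbeta\n",
    "alpha\nbeta\ngamma\n",
    "alpha\nBETA\ngamma\n",
    "alpha\nBETA\ngamma\ndelta\n",
]
chain = ReverseDeltaChain(versions[0])
for text in versions[1:]:
    chain.append(text)

for i in range(len(versions)):
    content, steps = chain.restore(i)
    print(f"version {i}: {steps} delta(s) applied")
```

In this sketch, restoring the most recent version applies no deltas, while restoring the original version applies all four, so a chain with thousands of versions would require thousands of serial delta applications to reach its oldest member.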
Use of a conventional linear arrangement of version chains also presents a problem for data backup operations. Similar to the end-user requested backup file recovery process, when a collection of version chains represents delta-compressed historical versions of successive nightly and weekend backups of primary storage systems, a conventional procedure uses the delta-compressed version chains as a source for making magnetic tape backups that can be sent offsite in order to recover from a local site disaster. Backup administrators who employ a local delta-compressed backup system may need to make one or more magnetic tapes from versions that are not the most recent version in the version chain. This can increase the amount of data storage that is needed to back up all of that data.
Additionally, removal of one or more delta version files from a linear version chain can present an issue, as it can require delta-decompression of all of the more recent versions in order to remove the desired version, as well as reconnection of the two delta version neighbors that are adjacent to the removed version. Day-to-day management of delta version files within version chains can involve many (e.g., tens, hundreds, thousands, millions, etc.) purge operations, which can cause significant processing delays, consume large amounts of computing and/or disk I/O resources, and prevent user access to data for a long time, among many other issues.
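The purge cost described above can be sketched in the same way. The following is a naive illustration under stated assumptions, not a description of any actual system: the `purge`, `build_chain`, and `restore` functions and the copy/insert delta format are hypothetical, and Python's standard `difflib` stands in for a real delta compressor. To remove an interior version, the sketch walks back from the most recent version, rebuilds the two neighbors of the purged version, and computes a new delta that reconnects them.

```python
from difflib import SequenceMatcher


def make_delta(source, target):
    """Encode `target` as copy/insert instructions against `source` (toy format)."""
    ops = []
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))
    return ops


def apply_delta(source, ops):
    """Rebuild the target text from `source` and a list of instructions."""
    return "".join(source[op[1]:op[2]] if op[0] == "copy" else op[1] for op in ops)


def build_chain(versions):
    """Return (head, deltas); deltas[k] rebuilds version k from version k+1."""
    head, deltas = versions[0], []
    for new in versions[1:]:
        deltas.append(make_delta(new, head))
        head = new
    return head, deltas


def restore(head, deltas, index):
    """Rebuild version `index` (0 = oldest) by walking back from the head."""
    data = head
    for k in range(len(deltas) - 1, index - 1, -1):
        data = apply_delta(data, deltas[k])
    return data


def purge(head, deltas, index):
    """Remove version `index`; return (new_deltas, deltas_applied)."""
    n = len(deltas)
    if not 0 <= index < n:
        raise ValueError("this sketch only purges versions older than the head")
    if index == 0:
        return deltas[1:], 0  # oldest version: drop its delta, nothing to reconnect
    # Delta-decompress every more recent version down to the newer neighbor.
    newer, applied = head, 0
    for k in range(n - 1, index, -1):
        newer = apply_delta(newer, deltas[k])
        applied += 1
    purged = apply_delta(newer, deltas[index])      # the version being removed
    older = apply_delta(purged, deltas[index - 1])  # its older neighbor
    applied += 2
    bridge = make_delta(newer, older)               # reconnect the two neighbors
    return deltas[:index - 1] + [bridge] + deltas[index + 1:], applied


versions = ["a\n", "a\nb\n", "a\nb\nc\n", "a\nB\nc\n", "a\nB\nc\nd\n"]
head, deltas = build_chain(versions)
new_deltas, applied = purge(head, deltas, 1)
remaining = versions[:1] + versions[2:]
print(f"purged version 1 after applying {applied} delta(s)")
```

Even in this five-version example, purging the second-oldest version touches every delta in the chain, which suggests why large numbers of purge operations against long linear chains can consume significant computing and disk I/O resources.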