Embodiments of the invention relate generally to archiving data on data storage media and, more particularly, to maximizing the capacity of archival data tapes using a graph representation of the data.
Data are typically stored on magnetic storage tapes sequentially, i.e., one file after another. If the files to be stored are in a de-duplicated format, in which duplicate portions of the files have been removed, then the de-duplicated files may need to be restored to their original, duplicated format before they are written to the tapes. Storing the duplicate file portions requires additional space on the data storage tapes and additional processing time.
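The space cost described above can be illustrated with a minimal sketch. It assumes fixed-size chunking and SHA-256 chunk digests, neither of which is specified in the text; the file names, the chunk size, and the helper functions `chunk`, `deduplicate`, and `rehydrate` are hypothetical and chosen only for illustration. The sketch compares the bytes held by a de-duplicated store with the bytes that would be written if every file were first restored to its original form.

```python
import hashlib


def chunk(data: bytes, size: int) -> list[bytes]:
    """Split data into fixed-size pieces (assumed chunking scheme)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def deduplicate(files: dict[str, bytes], chunk_size: int = 4):
    """Return a store of unique chunks and a per-file list of chunk digests."""
    store: dict[str, bytes] = {}          # digest -> chunk bytes, stored once
    recipes: dict[str, list[str]] = {}    # file name -> ordered chunk digests
    for name, data in files.items():
        digests = []
        for piece in chunk(data, chunk_size):
            digest = hashlib.sha256(piece).hexdigest()
            store.setdefault(digest, piece)  # keep only the first copy
            digests.append(digest)
        recipes[name] = digests
    return store, recipes


def rehydrate(store: dict[str, bytes], recipes: dict[str, list[str]], name: str) -> bytes:
    """Restore one file to its original, duplicated form."""
    return b"".join(store[d] for d in recipes[name])


# Two hypothetical files that share their first two chunks.
files = {
    "a.txt": b"AAAABBBBCCCC",
    "b.txt": b"AAAABBBBDDDD",
}

store, recipes = deduplicate(files)

# Bytes needed when each unique chunk is stored once.
dedup_bytes = sum(len(c) for c in store.values())

# Bytes written when every file is restored before being stored sequentially.
rehydrated_bytes = sum(len(rehydrate(store, recipes, n)) for n in recipes)

print(dedup_bytes, rehydrated_bytes)
```

In this toy case the restored files occupy more space than the de-duplicated store because the shared chunks are written once per file, and the restoration pass itself is extra work before anything reaches the tape.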