Digital preservation (“preservation”) refers to the ability to sustain the understandability or usability of digital content for future use regardless of changes in the applied technology or the initial intended use.
A data storage system implemented for preservation (i.e., a “preservation system”) typically generates preservation objects associated with digital content. A preservation object comprises content data and one or more items of metadata. The content data is the actual data to be preserved, and each item of metadata includes information used for understanding or utilizing the content data or other metadata in the preservation object.
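The structure of a preservation object described above can be sketched as a small data model. This is a minimal illustration only, not the layout of any particular preservation system: the names `Metadata`, `PreservationObject`, `describes`, and `metadata_for`, as well as the sample values, are hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Metadata:
    """Hypothetical metadata item: information for understanding or using a target."""
    name: str
    description: str
    # Name of another metadata item this one helps interpret (assumed field).
    # None means the item describes the content data itself.
    describes: Optional[str] = None


@dataclass
class PreservationObject:
    """Hypothetical container pairing content data with its metadata."""
    content_data: bytes
    metadata: List[Metadata] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A preservation object has one or more metadata items, so reject an empty list.
        if not self.metadata:
            raise ValueError("a preservation object needs at least one metadata item")

    def metadata_for(self, target: Optional[str] = None) -> List[Metadata]:
        """Return items describing the content data (target=None) or a named metadata item."""
        return [m for m in self.metadata if m.describes == target]


# Illustrative values only: one item describes the content data,
# and a second item describes the first.
obj = PreservationObject(
    content_data=b"%PDF-1.4 ...",
    metadata=[
        Metadata("format", "Content data is a PDF 1.4 document"),
        Metadata("format-spec", "Where to find the PDF 1.4 specification",
                 describes="format"),
    ],
)
```

In this sketch, `obj.metadata_for()` returns the items that describe the content data directly, while `obj.metadata_for("format")` returns the items that describe the `format` metadata, mirroring the point that metadata may explain either the content data or other metadata.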
In existing preservation systems, a user or a process external to the preservation system is responsible for generating metadata (e.g., manually). Such preservation systems rely on ad-hoc methods to search for metadata relevant to a particular preservation target. As a result, the quality and amount of the metadata depend on a person's ability to search for the metadata, as well as on the resources available to invest in implementing a robust preservation scheme.