In recent years, a number of computerized applications for storing, retrieving, and displaying electronic content have come into wide use. With the advent of object-oriented and object-based paradigms throughout the software industry, the electronic content processed by such programs has been increasingly conceived in terms of a collection of separate objects, each object representing a particular content element, and the various objects structured to form the overall electronic content by being interconnected through a graph of linking data.
One example of electronic content conceived of as being composed of content elements encapsulated or described by objects is the ADOBE® Portable Document Format (PDF). In a PDF document, various content elements, such as pages, fonts, color descriptors, graphical elements, and so forth, are represented by discrete objects within the document.
Typically, there will be a close relationship between the structure of a file containing the electronic content and the internal, in-memory representation of the electronic content used by an application for rendering or editing the content. For example, a portion of a file describing a particular content element typically corresponds to an object in the internal data structure representing the content, and references within the portion of the file to other portions of the file describing other content elements will typically be mirrored by references from the object in the internal data structure to corresponding other objects in the internal data structure.
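The correspondence described above can be illustrated with a minimal sketch. It assumes a hypothetical `ContentObject` class and a JSON-based record layout chosen purely for illustration; neither is PDF syntax nor any particular application's data model. Each in-memory object maps to one record in the file, and each in-memory reference maps to an identifier reference in that record.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ContentObject:
    """Hypothetical in-memory content element (e.g. a page or a font)."""
    obj_id: int
    kind: str
    # Direct references to other ContentObject instances.
    refs: list = field(default_factory=list)


def serialize_native(objects):
    """Write one record per object; object references become ID references."""
    records = [
        {"id": o.obj_id, "kind": o.kind, "refs": [r.obj_id for r in o.refs]}
        for o in objects
    ]
    return json.dumps(records)


def deserialize_native(text):
    """Rebuild the object graph: create objects first, then resolve ID references."""
    records = json.loads(text)
    by_id = {r["id"]: ContentObject(r["id"], r["kind"]) for r in records}
    for r in records:
        by_id[r["id"]].refs = [by_id[i] for i in r["refs"]]
    return [by_id[r["id"]] for r in records]


# A page object that references a font object.
font = ContentObject(2, "font")
page = ContentObject(1, "page", refs=[font])

text = serialize_native([page, font])
restored = deserialize_native(text)
```

Because the record layout mirrors the object graph one-to-one, both directions reduce to a simple loop over objects or records.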
Because of this close correspondence, the program code needed to read a file into memory and/or write (or serialize) an in-memory representation of electronic content to a file is typically fairly straightforward when the file is formatted according to this “native file format.” It may, however, be desirable to provide a second file structure for storing the content, which might be termed a “non-native file format.” Adding program code to the application to support the non-native file format is often a large and error-prone programming chore, particularly if the second file structure differs substantially from the in-memory representation of the content or from its representation in the native format.
This difficulty is exacerbated by the frequent need for separately maintained software modules within the application, one dedicated to reading in (de-serializing) electronic content from a file structured according to the non-native format and another to writing out (serializing) electronic content to such a file.
Similar difficulties arise when it is desired to translate, more or less directly, a file formatted according to the native file format and containing electronic content into a version of the same electronic content in a non-native file format, or vice versa. The difficulty is greatest when the source file includes data that is not part of the source file format as specified by the format's designers but was instead added by other users of the file structure, and that data nonetheless ought to be retained in the translated file.
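The retention problem can be made concrete with a small sketch. The record layouts, the `SPEC_KEYS` set, and the `translate_record` function are all hypothetical and stand in for whatever the two formats actually define. The point is only that a translator must explicitly carry over data it does not recognize instead of silently dropping it.

```python
# Keys defined by the (hypothetical) native format specification.
SPEC_KEYS = {"id", "kind", "refs"}


def translate_record(native_record):
    """Translate one native record into a hypothetical non-native layout.

    Keys outside SPEC_KEYS are treated as additions made by other users
    of the format and are carried over unchanged under "extensions".
    """
    translated = {
        "element": native_record["kind"],
        "identifier": native_record["id"],
        "links": list(native_record.get("refs", [])),
    }
    extras = {k: v for k, v in native_record.items() if k not in SPEC_KEYS}
    if extras:
        translated["extensions"] = extras
    return translated


# A native record carrying one key that the specification does not define.
record = {"id": 1, "kind": "page", "refs": [2], "x-vendor-note": "draft"}
result = translate_record(record)
```

A translator written only against the specified keys would lose `x-vendor-note`; handling such data correctly in every translation path is part of what makes this code costly to write and maintain.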
Finally, the need to maintain separate pieces of source code for translating from a native file format into a non-native file format, and for serializing an in-memory data structure to a file formatted according to the non-native file format (and perhaps for performing these operations in reverse), can lead to errors and other maintenance issues.