The present invention relates to computer file formats and methods and apparatus for describing documents and expressing document structure.
Computer content authoring application programs, such as word processing applications and spreadsheet applications, produce content having varying levels of structure. Furthermore, different authoring applications store content and structural information using different file formats. Typically, the file format used by one authoring application is not understood by other authoring applications. As a result, the content and structural information stored in a file are typically accessible only to the authoring application that produced the file and to other authoring applications that have been specifically designed to understand the format of the file. Authoring applications that have not been so designed may be able to retrieve the content stored in the file, but typically will not be able to retrieve the structural information stored in the file.
In networked environments where there is a large amount of communication among applications and among computers, requiring each application to understand the file formats used by other applications is becoming increasingly unwieldy and inefficient. One method that has been used to address this problem is a “universal” file format that attempts to encapsulate all possible content and structural information that can be generated by any application program. Files stored in such a format, however, tend to be large, and the format cannot be guaranteed to encapsulate information generated by future versions of application programs.