The present invention generally relates to the storage, retrieval and annotation of digital documents, and more particularly to a portable object-oriented data structure and method for storing a static digital document so that it can be rendered by a printer or viewer in a guaranteed layout, and so that it can store user annotations, navigational information, and the like.
In a computing system having a graphical operating environment, an operating program interface typically includes a graphics display system having a library of graphics elements that provide the programmatic capability to draw to a surface. For example, in the WINDOWS® operating system, the graphics device interface (GDI) is used to draw to surfaces such as a display or printer. The graphics library incorporated in the operating system provides the capability to display text, circles, lines, squares, and many other graphics elements.
Conventionally, a series of graphical commands or resources from a graphics library can be recorded for later replay. For example, if an application calls a DrawCircle primitive command followed by a DrawRectangle primitive command, those two directives and their parameters can be recorded to a file, then later rendered. The format for storing GDI graphical commands in the WINDOWS® operating system is the extended metafile format (EMF). This known method and data structure permits the rendering of a page to be captured and stored electronically in graphics format, rather than in image format, such as a bitmap format or tagged image file format (TIFF). The captured commands can then be rendered to a printer, or displayed on a screen.
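The record-then-replay idea described above can be illustrated with a minimal sketch. It does not use GDI or the EMF format; the names CommandRecorder, draw_circle, draw_rectangle and replay are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class CommandRecorder:
    """Hypothetical recorder: stores each drawing call instead of drawing."""
    commands: List[Tuple[str, tuple]] = field(default_factory=list)

    def draw_circle(self, cx: int, cy: int, r: int) -> None:
        # Record the directive name and its parameters, in call order.
        self.commands.append(("circle", (cx, cy, r)))

    def draw_rectangle(self, x: int, y: int, w: int, h: int) -> None:
        self.commands.append(("rectangle", (x, y, w, h)))


def replay(commands: List[Tuple[str, tuple]],
           surface: Dict[str, Callable[..., None]]) -> None:
    """Replay recorded commands against a surface (screen, printer, ...).

    The surface is assumed to be a mapping from directive name to a
    callable that performs the actual drawing.
    """
    for name, args in commands:
        surface[name](*args)
```

Because only the commands and their parameters are kept, the same recording can be replayed against any surface that supplies a handler for each directive.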
Although this known method and data structure provides a useful means of capturing a digital page for later replay, numerous limitations exist which restrict the use of the format. For example, because a single EMF file only holds one page, there is no simple means for capturing and storing an entire document or group of documents in a single EMF file. In addition, the format does not provide any means for associating the captured information with user annotations or navigational information such as hyperlinks.
In addition to the graphics display system employed in the operating system of a conventional graphical operating environment, known applications have been specifically developed for drawing graphics information to surfaces such as a display or printer. The graphics libraries incorporated in these applications differ from the graphical commands and resources native to the operating system, but seek to obtain the same goal of capturing and storing text, circles, lines, squares, and other graphics elements in a portable format for later display or printing.
An example of a known application used for rendering static graphical information is Adobe Systems' ACROBAT program, which converts a fully formatted document into a portable document format (PDF) file that can be viewed on several different platforms. The ACROBAT program uses a graphics library based on the POSTSCRIPT graphics language from Adobe Systems, and a viewer program is provided to view PDF files. Additional tools such as the DISTILLER, EXCHANGE, and PDF WRITER programs, all from Adobe Systems, are available for creating PDF files and for adding hyperlinks, annotations and other information to such files.
Although these known applications provide methods and data structures for capturing digital documents, certain drawbacks exist that prevent such systems from having universal application in various environments. For example, because the known systems incorporate special graphics libraries separate from the graphical commands and resources native to the operating system, special programs must be obtained to make, view and annotate documents to be rendered. In addition, if a portable format file is to be printed by a device that does not support the special graphics commands or resources, the operating system converts the digital document to include its native graphical commands, such as GDI graphical commands, for printing. Thus, even though the document has already been converted once into the portable file format, it must be converted again for printing. This can result in variations in the layout of the printed document.
Another problem encountered in the use of known document rendering systems arises when an attempt is made to render documents transmitted over the Internet or other network. Specifically, when a relatively large file is transmitted, the time required to receive the document can be significant. Because it may be necessary for the entire contents of the file to be received before the document can be displayed or printed, e.g., where random access of document data is not supported, a user must wait while the entire file is received. An attempted solution to this problem is to segment the file into two parts, one including the information necessary to render the first page or two of the document, and the rest including the remainder of the document. Thus, once the first part of the file is received, the first page or two can be displayed or printed. However, the user must still wait in order to view everything following the first page or two, which can be significant for long documents.
In accordance with various objects of the invention evident to a person of ordinary skill in the relevant art from the following description of the invention, a method is provided for storing a digital document. The method includes the steps of representing each page with at least one graphics object, creating a page object for each page that includes a reference to the at least one graphics object for that page, and creating a page list object including a list of references to the page objects for the document. The method further includes the step of creating a document root object that includes a reference to the page list object.
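The steps of the method can be sketched as follows. This is an illustrative sketch only: the ObjectStore class, the dictionary layout of each object, and the use of integer identifiers as references are assumptions made for the example and are not taken from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ObjectStore:
    """Hypothetical store that hands out an integer reference per object."""
    objects: Dict[int, dict] = field(default_factory=dict)
    next_id: int = 1

    def add(self, obj: dict) -> int:
        object_id = self.next_id
        self.objects[object_id] = obj
        self.next_id += 1
        return object_id


def store_document(store: ObjectStore, pages: List[List[bytes]]) -> int:
    """Store a document given, per page, its list of graphics payloads.

    Returns the reference to the document root object.
    """
    page_ids = []
    for graphics in pages:
        # Represent each page with at least one graphics object.
        graphics_ids = [store.add({"type": "graphics", "data": g})
                        for g in graphics]
        # Create a page object that references its graphics objects.
        page_ids.append(store.add({"type": "page", "graphics": graphics_ids}))
    # Create a page list object that references every page object.
    page_list_id = store.add({"type": "page_list", "pages": page_ids})
    # Create a document root object that references the page list object.
    return store.add({"type": "document_root", "page_list": page_list_id})
```

Starting from the returned root reference, a reader can reach the page list, then each page, then each page's graphics objects, without scanning the store.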
By providing a method in accordance with the present invention, several advantages are realized. For example, by practicing the inventive method, it is possible to capture and store a multiple page document, or a multiple document job, electronically in graphics format, enabling the document to be rendered to a screen or printer in a predictable and consistent layout that is device independent.
In accordance with one aspect of the invention, a computer-readable medium has stored thereon a data structure for storing a digital document having at least one page. The data structure includes at least one graphics object that represents all or a part of a page of the document. The data structure further includes a page object for each page of the document, wherein each page object includes a reference to the at least one graphics object for that page. A document root object is provided that includes a list of the page objects for the document, and a job object includes a list of all the document root objects for the documents to be rendered by the data structure. The data structure also includes an index object that identifies the location of each of the other objects in the data structure.
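As a rough illustration of how these objects relate, the sketch below walks from a job object through its document root objects to the page objects. The dictionary named objects stands in for the index object (it maps an object identifier to the object rather than to a byte location), and the field names documents and pages are hypothetical.

```python
from typing import Dict, List


def pages_in_render_order(objects: Dict[int, dict], job_id: int) -> List[int]:
    """Return page object references for a job, document by document.

    `objects` plays the role of the index object in this sketch: it is
    the single place where any object can be looked up by identifier.
    """
    ordered: List[int] = []
    for root_id in objects[job_id]["documents"]:
        # Each document root lists the page objects of its document.
        ordered.extend(objects[root_id]["pages"])
    return ordered


# A two-document job: document 10 has pages 1 and 2, document 11 has page 3.
example_objects = {
    1: {"type": "page", "graphics": [101]},
    2: {"type": "page", "graphics": [102]},
    3: {"type": "page", "graphics": [103, 104]},
    10: {"type": "document_root", "pages": [1, 2]},
    11: {"type": "document_root", "pages": [3]},
    20: {"type": "job", "documents": [10, 11]},
}
```

Calling pages_in_render_order(example_objects, 20) yields the page references of the first document followed by those of the second.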
By creating each of these objects in the data structure for a document to be stored for subsequent rendering, numerous advantages result. For example, by providing an object-oriented data structure in which one of the objects is an index object referencing all the other objects, it is possible to store any object in the data structure in multiple, discontiguous segments. By allowing segmentable objects, a file can be streamed more effectively since unimportant bytes can be placed at the end of the file. Further, streaming is easier since any object can be appended to by appending the new bytes at the end of the file, rather than rearranging all the bytes of the file in order to keep the new and old bytes of the modified object contiguous.
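The benefit of an index that permits discontiguous segments can be seen in a small sketch. The SegmentedFile class and its (offset, length) index layout are assumptions made for illustration; the description does not specify the index format.

```python
from typing import Dict, List, Tuple


class SegmentedFile:
    """Hypothetical file in which any object may span several segments."""

    def __init__(self) -> None:
        self.data = bytearray()
        # Index: object identifier -> list of (offset, length) segments.
        self.index: Dict[int, List[Tuple[int, int]]] = {}

    def append(self, object_id: int, payload: bytes) -> None:
        """Append bytes for an object at the end of the file.

        Existing bytes are never moved; only the index gains a segment.
        """
        offset = len(self.data)
        self.data.extend(payload)
        self.index.setdefault(object_id, []).append((offset, len(payload)))

    def read(self, object_id: int) -> bytes:
        """Reassemble an object by concatenating its segments in order."""
        return b"".join(bytes(self.data[offset:offset + length])
                        for offset, length in self.index[object_id])
```

In this sketch, extending an object that was written early in the file adds a new segment at the end and one index entry, so the bytes of other objects stay where they are.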