1. Field of the Invention
This invention relates in general to printing systems, and more particularly to a method and apparatus for managing complex presentation objects using globally-unique identifiers.
2. Description of Related Art
Print systems include presentation architectures, which are provided for representing documents in a data format that is independent of the methods utilized to capture or create those documents. One example of a presentation system, which will be described herein, is the AFP™ (Advanced Function Presentation) system developed by International Business Machines Corporation. However, those skilled in the art will recognize that the present invention is not meant to be limited to the AFP™ system; rather, the AFP™ system is presented herein as merely one example of a presentation system to which the principles of the present invention are applicable.
According to the AFP™ system, documents may contain combinations of text, image, graphics, and/or bar code objects in device-independent and resolution-independent formats. Documents may also contain and/or reference fonts, overlays, and other resource objects, which are required at presentation time to present the data properly. Additionally, documents may contain resource objects, such as a document index and tagging elements supporting the search and navigation of document data for a variety of application purposes. In general, a presentation architecture for presenting documents in printed format employs a presentation data stream. To increase flexibility, this architecture can be further divided into a device-independent application data stream and a device-dependent printer data stream.
A data stream is a continuous ordered stream of data elements and objects that conform to a given formal definition. Application programs can generate data streams destined for a presentation device, archive library, or another application program. The Mixed Object Document Content Architecture (MO:DCA)™ developed by International Business Machines Corporation of Armonk, N.Y. defines a data stream, which may be utilized by applications to describe documents and object envelopes for document interchange and document exchange with other applications and application services. Interchange is the predictable interpretation of shared information in an environment where the characteristics of each process need not be known to all other processes. Exchange is the predictable interpretation of shared information by a family of system processes in an environment where the characteristics of each process must be known to all other processes.
A mixed object document is a collection of data objects that comprise the document's content and the resources and formatting specifications that dictate the processing functions to be performed on that content. The term “Mixed” in the Mixed Object Document Content Architecture (MO:DCA) refers to both the mixture of data objects and the mixture of document constructs that comprise the document's components. A Mixed Object Document Content Architecture (MO:DCA) document can contain a mixture of presentation object types, each of which has unique processing requirements. The Mixed Object Document Content Architecture (MO:DCA) is designed to integrate the different data object types into documents that can be interchanged as a single data stream and provides the data stream structures needed to carry the data objects. The MO:DCA data stream also provides syntactic and semantic rules governing the use of objects to ensure that different applications process objects in a consistent manner.
In its most complex form, a Mixed Object Document Content Architecture (MO:DCA) document contains data and resource objects along with data structures that define the document's layout and composition features. This form is called a Mixed Object Document Content Architecture (MO:DCA) presentation document. Within such a data stream, the Mixed Object Document Content Architecture (MO:DCA) components are defined with a syntax that consists of self-describing structures called structured fields. Structured fields are the main Mixed Object Document Content Architecture (MO:DCA) structures and are utilized to encode Mixed Object Document Content Architecture (MO:DCA) commands. A structured field starts with an introducer that uniquely identifies the command, provides a total length for the command, and specifies additional control information such as whether padding bytes are present. The introducer is then followed by data bytes. Data may be encoded within the structured field utilizing fixed parameters, repeating groups, keywords, and triplets. Fixed parameters have a meaning only in the context of the structure that includes them. Repeating groups are utilized to specify groupings of parameters that can appear multiple times. Keywords are self-identifying parameters that consist of a one-byte unique keyword identifier followed by a one-byte keyword value. Triplets are self-identifying parameters that contain a length field, a unique triplet identifier, and data bytes. Keywords and triplets have the same semantics wherever they are utilized. Together these structures define a syntax for Mixed Object Document Content Architecture (MO:DCA) data streams that provides for orderly parsing and flexible extensibility.
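The triplet structure described above (a length field, a triplet identifier, then data bytes) lends itself to a simple parsing loop. The following Python sketch is illustrative only. It assumes a one-byte length that counts the whole triplet, including the length and identifier bytes themselves, and a one-byte identifier. The function name parse_triplets and the sample bytes are hypothetical and are not taken from the MO:DCA specification.

```python
from typing import List, Tuple


def parse_triplets(data: bytes) -> List[Tuple[int, bytes]]:
    """Split a byte string into (identifier, payload) pairs.

    Assumed layout per triplet (an assumption for this sketch):
    1-byte length covering the entire triplet, 1-byte identifier,
    then (length - 2) data bytes.
    """
    triplets = []
    pos = 0
    while pos < len(data):
        length = data[pos]
        # A triplet needs at least its length and identifier bytes,
        # and must not run past the end of the buffer.
        if length < 2 or pos + length > len(data):
            raise ValueError(f"malformed triplet at offset {pos}")
        ident = data[pos + 1]
        payload = data[pos + 2 : pos + length]
        triplets.append((ident, payload))
        pos += length  # the length field tells the parser where the next triplet starts
    return triplets


# Two made-up triplets: identifier 0x01 with two data bytes,
# and identifier 0x02 with no data bytes.
sample = bytes([0x04, 0x01, 0xAA, 0xBB, 0x02, 0x02])
for ident, payload in parse_triplets(sample):
    print(hex(ident), payload.hex())
```

Because each triplet carries its own length, a parser of this kind can step over identifiers it does not recognize, which is one way the self-identifying syntax supports orderly parsing and later extension.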
The document is the highest level within the Mixed Object Document Content Architecture (MO:DCA) data stream document component hierarchy. Documents may be constructed of pages, and the pages, which are at the intermediate level, may be made up of data objects. Data objects are at the lowest level and can be bar code objects, graphics objects, image objects and presentation text.
Multiple documents may be collected into a print file. A print file may optionally contain, at its beginning, an “inline” resource group that contains resource objects required for printing. Alternatively, the resource objects may be stored in a resource library that is accessible to the print server, or they may be resident in the printer.
A Mixed Object Document Content Architecture (MO:DCA) document in its presentation form is a document that has been formatted and is intended for presentation, usually on a printer or a display device. A data stream containing a presentation document should produce the same document content in the same format on different printers or display devices, subject to the capabilities of each of the printers or display devices. A presentation document can reference resources that are to be included as part of the document to be presented but that are not present within the document as transmitted within the MO:DCA data stream.
Pages within the Mixed Object Document Content Architecture (MO:DCA) are the level within the document component hierarchy which is utilized to print or display a document's content. Each page has associated environment information that specifies page size and that identifies resources required by the page. This information is carried in a MO:DCA structure called an Active Environment Group (AEG). Data objects contained within each page envelope in the data stream are presented when the page is presented. Each data object has associated environment information that directs the placement and orientation of the data on the page, and that identifies resources required by the object. This information is carried in a MO:DCA structure called an Object Environment Group (OEG).
Objects in the data stream are bounded by delimiters that identify the object type, such as graphics, image, or text. In general, data objects consist of data to be presented and the directives required to present it. The content of each type of data object is defined by an object architecture that specifies presentation functions, which may be utilized within its coordinate space. All data objects function as equals within the Mixed Object Document Content Architecture (MO:DCA) data stream environment. Data objects are carried as separate entities in the Mixed Object Document Content Architecture (MO:DCA) data stream.
Resource objects are named objects or named collections of objects that can be referenced from within the document. In general, referenced resources can reside in an inline resource group that precedes the document in the MO:DCA data stream or in an external resource library and can be referenced multiple times. Resource objects may need to be utilized in numerous places within a document or within several documents.
An object container within the Mixed Object Document Content Architecture (MO:DCA) is an envelope for object data that is not necessarily defined by an International Business Machines Corporation presentation architecture and that might not define all required presentation parameters. The container consists of a mandatory Begin/End structured field pair, an optional Object Environment Group (OEG) and mandatory Object Container Data (OCD) structured fields. If an object is to be carried in Mixed Object Document Content Architecture (MO:DCA) resource groups and interchanged, it must, at a minimum, be enveloped by a Begin/End pair. The Object Classification triplet on the Begin structured field must specify the registered object identifier (OID) for the object data format, and the data must be partitioned into OCD structured fields.
A printer data stream within a presentation architecture is a device-dependent continuous ordered stream of data elements and objects conforming to a given format, which are destined for a presentation device. The Intelligent Printer Data Stream (IPDS)™ architecture developed by International Business Machines Corporation and disclosed within U.S. Pat. No. 4,651,278, which is incorporated herein by reference, defines the data stream utilized by print server programs and device drivers to manage all-points-addressable page printing on a full spectrum of devices from low-end workstation and local area network-attached printers to high-speed, high-volume page printers for production jobs, Print On Demand environments, shared printing, and mailroom applications. The same object content architectures carried in a MO:DCA data stream are carried in an IPDS data stream to be interpreted and presented by microcode executing in printer hardware. The IPDS architecture defines bi-directional command protocols for query, resource management, and error recovery. The IPDS architecture also provides interfaces for document finishing operations provided by pre-processing and post-processing devices attached to IPDS printers.
The IPDS architecture incorporates several important features. As noted above, since the IPDS architecture supports the same objects as those carried by the MO:DCA data stream, the IPDS architecture enables the output of multiple diverse applications to be merged at print time so that an integrated mixed-data page, including text, images, graphics, and bar code objects, results. The IPDS architecture transfers all data and commands through self-identifying structured fields that describe the presentation of the page and provide for dynamic management of resources, such as overlays, page segments and fonts as well as the comprehensive handling of exception conditions. Furthermore, the IPDS architecture provides an extensive acknowledgement protocol at the data stream level, which enables page synchronization of the host (e.g., print server) and printer processes, the exchange of query-reply information, and the return to the host of detailed exception information.
One of the major hurdles to overcome in high-speed color printing, e.g., around 100 pages per minute (ppm), is the large time overhead associated with downloading and processing large color images. For example, an 8×10 inch CMYK (Cyan, Magenta, Yellow and BlacK) color image, at 600 dots per inch (dpi), JPEG compressed with a compression ratio of 10:1, still contains about 10 MB (megabytes) of data.
If the typical attachment bandwidth is 2.5 MB/sec between the printing system and the server containing the image, 4 seconds are required just to download the image from the server to the printing system. While page and resource buffering in the printer can save some of this time, a download time of this magnitude is clearly incompatible with a print window of 0.5 seconds/page (for a 120 ppm printer).
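The figures in the preceding example can be checked with a short calculation. The Python sketch below is a rough estimate only: it assumes that the image measures 8 by 10 inches, that each pixel carries one byte per CMYK channel, and that 1 MB is 10^6 bytes; none of these assumptions is stated explicitly in the text, and all variable names are illustrative.

```python
# Figures taken from the text, plus the assumptions noted above.
WIDTH_IN, HEIGHT_IN = 8, 10      # image size in inches (assumed unit)
DPI = 600                        # dots per inch
BYTES_PER_PIXEL = 4              # C, M, Y, K at 8 bits each (assumed)
COMPRESSION_RATIO = 10           # 10:1 JPEG compression
BANDWIDTH_MB_PER_S = 2.5         # attachment bandwidth
PRINTER_PPM = 120                # pages per minute

pixels = (WIDTH_IN * DPI) * (HEIGHT_IN * DPI)
raw_mb = pixels * BYTES_PER_PIXEL / 1e6          # uncompressed size
compressed_mb = raw_mb / COMPRESSION_RATIO       # size after compression
download_s = compressed_mb / BANDWIDTH_MB_PER_S  # time to transfer one image
window_s = 60 / PRINTER_PPM                      # time available per page

print(f"uncompressed: {raw_mb:.1f} MB")
print(f"compressed:   {compressed_mb:.1f} MB")
print(f"download:     {download_s:.1f} s")
print(f"print window: {window_s:.1f} s/page")
print(f"download takes {download_s / window_s:.1f}x the print window")
```

Under these assumptions the compressed image comes to roughly 11.5 MB and the download to roughly 4.6 seconds, consistent with the rounded values of about 10 MB and 4 seconds used above. In either case the download takes several times longer than the 0.5 second print window.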
Resource objects such as overlays may be used to overcome some of this problem in certain circumstances. Overlays may be downloaded, cached, and reused each time the overlay is referenced for printing. However, cached resources are only available in the printer for the duration of the job and are normally deleted under control of the print server or if the printer is powered down or re-started.
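The limitation of job-scoped caching can be illustrated with a toy model. The Python sketch below is not a description of any actual printer or print server implementation: the class JobScopedCache, the function fetch_from_server, and the resource name "LOGO" are all hypothetical and serve only to show that a resource reused within one job is downloaded once, but must be downloaded again once the job ends and the cache is cleared.

```python
class JobScopedCache:
    """Toy model of a printer cache that is emptied when a job ends."""

    def __init__(self):
        self._store = {}
        self.downloads = 0  # number of times a resource had to be fetched

    def get(self, name, fetch):
        # Download only on a cache miss; otherwise reuse the cached copy.
        if name not in self._store:
            self._store[name] = fetch(name)
            self.downloads += 1
        return self._store[name]

    def end_job(self):
        # Cached resources do not survive the end of the job.
        self._store.clear()


def fetch_from_server(name):
    # Stand-in for a slow download; returns placeholder bytes.
    return f"<data for {name}>".encode()


cache = JobScopedCache()
for _ in range(3):                     # job 1 references the overlay three times
    cache.get("LOGO", fetch_from_server)
cache.end_job()
cache.get("LOGO", fetch_from_server)   # job 2 must download it again
print("downloads:", cache.downloads)
```

In this model, three references within the first job cost one download, while the single reference in the second job costs another, so the download overhead recurs with every job that uses the resource.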
Another solution is to download and raster image process (RIP) the complete print file into disk storage, and then print out of the disk storage. However, this method is not suitable for large files because it requires massive amounts of disk storage and incurs a huge download and RIP time prior to printing.
It can be seen that there is a need for a method and apparatus that enables downloaded objects to be reused multiple times by multiple documents and print servers without additional download time overhead.
It can also be seen that there is a need for a method and apparatus for uniquely identifying all downloaded objects to maintain object integrity across print jobs, print servers, etc.