High-speed digital printing places unique demands on data-processing equipment. For example, to operate a printing apparatus designed to output over 100 page-size images per minute, making the desired image data of a particular page available to the printing hardware requires very close tolerances in the management of the "overhead" incurred when data is transferred from memory and applied to the printing hardware. A typical letter-sized page image at 600 dpi resolution, in a format suitable for submission to printing hardware, is about 4 MB for monochrome and 16 MB for color; when printing hardware demands the image data to print a particular page image, this image data must be accessed from memory within a time frame of approximately 300 milliseconds.
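The figures above can be checked with simple arithmetic. The following Python sketch is illustrative only and rests on assumptions not stated in the text: a letter page of 8.5 x 11 inches, a hardware-ready format of 1 bit per pixel per color plane, and four planes for color. The names `page_bytes` and `required_bandwidth` are hypothetical helpers.

```python
# Back-of-the-envelope check of the page-size and access-time figures.
# Assumptions (not from the text): 8.5 x 11 inch page, 1 bit per pixel
# per colour plane, and four planes for colour.

def page_bytes(width_in=8.5, height_in=11.0, dpi=600, bits_per_pixel=1, planes=1):
    """Uncompressed size of one page image in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel * planes // 8

def required_bandwidth(n_bytes, window_s=0.300):
    """Bytes per second needed to read n_bytes within the access window."""
    return n_bytes / window_s

mono = page_bytes()
color = page_bytes(planes=4)

print(f"monochrome: {mono / 1e6:.1f} MB, {required_bandwidth(mono) / 1e6:.1f} MB/s")
print(f"color:      {color / 1e6:.1f} MB, {required_bandwidth(color) / 1e6:.1f} MB/s")
print(f"page period at 100 pages/min: {60 / 100 * 1000:.0f} ms")
```

Under these assumptions the sizes come out near the 4 MB and 16 MB quoted above, and the 300 millisecond window is about half of the 600 millisecond page period at 100 pages per minute.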
As is known in the art of digital printing, these large quantities of data must be processed in numerous sophisticated ways. For example, image data in a page description language (PDL), such as HP-PCL or PostScript™, must be decomposed into raw digital data, and this raw digital data may often have to be compressed and decompressed at least once before the data reaches the printing hardware. In addition, in a high-volume situation where hundreds of different pages are being printed in various jobs, the particular set of image data corresponding to a page to be printed at a given time-window must be carefully managed.
In digital printing, and particularly in network or color printing, the specific tasks which must be performed by software and hardware in a system lead to a number of design trade-offs. One essential trade-off occurs between maximizing the throughput of the printing hardware (that is, keeping the printer hardware supplied with enough hardware-ready image data to run at maximum physical speed for the longest period of time) and minimizing the time until the first page emerges from the printer, which has been shown to be a major customer requirement. This trade-off arises because the step of "interpreting" original image data from a PDL into "raw" binary data which directly operates the printing hardware not only takes a significant amount of time, but the time required may also vary widely depending on the complexity of the particular page image being printed. If a system simply sent binary image data to the printing hardware as it became available from an interpreter, the printing hardware would very often find itself waiting for the next page image to be output by the interpreter, thus wasting significant amounts of time.
At the same time, many printer architectures impose special requirements on incoming page image data. Some printing apparatus output pages in an "n to 1" order, meaning that the last page must be printed first so that the ultimate stack of output pages is in the correct order; in this case, an entire set of page images forming a document must be completely processed by the interpreter before any pages can be submitted to the printing hardware, which of course causes an enormous delay in first-page-out time. However, even if the printing apparatus is designed to output pages in "1 to n" order, it would still be desirable to have a number of page images immediately ready for submission to the printing hardware, to minimize the time in which the printing hardware has to wait for the interpreter.
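The trade-off between first-page-out time and printer stalls, and the effect of output order, can be illustrated with a toy timing model. The Python sketch below is not taken from any described implementation; it assumes a single interpreter that finishes pages one after another, a fixed print time per page, and integer millisecond timings. The function `simulate` and its parameters are hypothetical.

```python
from itertools import accumulate

def simulate(interp_ms, print_ms, prebuffer=1, reverse=False):
    """Return (first_page_out_ms, printer_stall_ms) for one document.

    interp_ms -- per-page interpretation times, in page order
    print_ms  -- fixed time the hardware needs per page
    prebuffer -- pages that must be ready before printing starts
    reverse   -- True for "n to 1" output order
    """
    ready = list(accumulate(interp_ms))   # time at which each page is hardware-ready
    gate = ready[prebuffer - 1]           # printing may not start before this time
    order = reversed(range(len(ready))) if reverse else range(len(ready))

    first_start = first_out = None
    finish = 0
    for page in order:
        start = max(ready[page], finish, gate)
        finish = start + print_ms
        if first_start is None:
            first_start, first_out = start, finish

    busy = print_ms * len(ready)          # time the hardware spends printing
    return first_out, (finish - first_start) - busy

times = [200, 1500, 300, 300]             # page 2 is a complex page
print(simulate(times, 600))               # (800, 900): early first page, printer stalls
print(simulate(times, 600, prebuffer=4))  # (2900, 0): no stalls, late first page
print(simulate(times, 600, reverse=True)) # (2900, 0): "n to 1" waits for the last page
```

In this model, streaming pages as they become ready gives the earliest first page but leaves the hardware idle while the complex page is interpreted, whereas buffering the whole document removes the stall at the cost of a later first page.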
Because there is a distinction between the printing of images directly from image data (the hardware interface) and the means of getting the image data to the printer (the print controller), digital printers are not predisposed to operating on a job, document or page boundary. They primarily work on a page-by-page basis when printing. This implies that data need not be grouped according to job or document boundaries when the print controller sends the image data to be printed by the digital printing hardware. Feasibly, a set of page images could be printed across job or document boundaries.
"Middle-ware" refers to systems architecture and software which interfaces producers and consumers of data. One often recurring and unique method of organizing data is to describe the data within a hierarchical format. The middle-ware presented herein enables data producers to interface with data consumers for the specific case where those producers and consumers operate on the same set of hierarchically organized data, but do not necessarily work on the same level of data (i.e. job, document, or page boundaries) within the hierarchy. The middle-ware acts as an ideal consumer for the data producer by accepting the data at the level the producer most readily supplies it, and, at the same time, the middle-ware acts as an ideal producer to the data consumer supplying the consumer with data at the level the consumer most readily accepts it. Data supplied by the producer is kept in memory until the consumer needs it. Maximum data throughput is achieved because the middle-ware is always ready to accept data from the data producer, and the middle-ware is always ready to supply the data consumer with data as soon as the consumer requests it (both acceptance of data from the producer and supplying data to the consumer are contingent on whether or not there is data, of course).
The result is a middle-ware for interfacing data producers and consumers of hierarchically organized data within the digital printing framework, while managing such data via a method which maximizes throughput of the data. Data throughput is maximized by interfacing to the data producer as an ideal data consumer (from the producer's perspective) and, at the same time, interfacing to the data consumer as an ideal data producer (from the consumer's perspective).
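A minimal sketch of this idea, in Python, is a buffer that accepts data at the job level and hands it out at the page level. It is an illustration under assumptions, not the described implementation: the class `PageBuffer`, its methods, and the dictionary layout of a job are all hypothetical, and the sketch is single-threaded with no memory limit.

```python
from collections import deque

class PageBuffer:
    """Accepts whole jobs from a producer and hands out single pages."""

    def __init__(self):
        self._pages = deque()  # pages held in memory until the consumer asks

    def put_job(self, job):
        """Producer side: job is {"name": str, "documents": [[page, ...], ...]}."""
        for doc_index, document in enumerate(job["documents"]):
            for page_index, image in enumerate(document):
                self._pages.append((job["name"], doc_index, page_index, image))

    def get_page(self):
        """Consumer side: return the oldest buffered page, or None if empty."""
        return self._pages.popleft() if self._pages else None

buf = PageBuffer()
buf.put_job({"name": "A", "documents": [["a1", "a2"], ["a3"]]})
buf.put_job({"name": "B", "documents": [["b1"]]})

page = buf.get_page()
while page is not None:
    print(page)   # pages arrive one at a time, regardless of job boundaries
    page = buf.get_page()
```

The producer never needs to know that the consumer works page by page, and the consumer never sees job or document structure except as labels attached to each page.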
The data, as it is prepared for the consumer within this middle-ware, is grouped into what has been termed a burst. A burst consists of a set of data grouped according to the data consumption constraints of the data consumer, not according to the constraints of delivery of the data by the data producer. Such a middle-ware would be widely reusable, could serve as an interface between any combination of data producer(s) and consumer(s) (one-to-one, one-to-many, many-to-one, and many-to-many relationships are possible), and would provide an optimal balance between data producer output and consumer data consumption.
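Burst grouping can be sketched as follows. This Python example assumes, purely for illustration, that the consumer's constraint is a fixed number of pages per burst; a real constraint might instead be a memory budget or a hardware timing window. The generator `bursts` and the page labels are hypothetical.

```python
from itertools import islice

def bursts(pages, burst_size):
    """Yield lists of up to burst_size pages, ignoring job boundaries."""
    it = iter(pages)
    while True:
        burst = list(islice(it, burst_size))
        if not burst:
            return
        yield burst

# Pages as delivered by the producer: job A (three pages), then job B (two pages).
stream = ["A1", "A2", "A3", "B1", "B2"]

for burst in bursts(stream, 2):
    print(burst)
# ['A1', 'A2']
# ['A3', 'B1']
# ['B2']
```

The second burst spans the boundary between job A and job B, which shows a grouping driven by what the consumer can accept and not by how the producer delivered the data.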