The present invention relates to digital image printing. It finds particular application in conjunction with improving productivity in production printing of variable data documents and will be described with particular reference thereto. It will be appreciated, however, that the invention is also amenable to other like applications.
In a variable data printing application, every printed document may be unique. However, some elements are typically common to more than one (1) of the pages. An example of a variable data application is a PowerPoint presentation, which includes at least one (1) complex graphic that appears on more than one (1) of the slides (pages). The variable data for each of the respective slides in the presentation may include, for example, the slide number and the non-repeating content (e.g., a complex graphic that only appears on one (1) of the slides). Elements that are repeated within the presentation (e.g., “master” content) may include a corporate logo and/or other background information common to all of the slides. Caching the repeated elements (i.e., the “master” content) offers efficiency within a printing system, especially if the master content includes complex graphics or scanned images (which are relatively more expensive to construct during a raster image process (“RIP”) or at final assembly time).
A conventional printing apparatus receives input data describing elements within a visual image on, for example, a page. The elements are rasterized according to a RIP for creating a printed output. If the page includes multiple graphical elements, the amount of data that must be rasterized tends to be very large. Therefore, a memory device (e.g., a cache) within a printing device is allocated as an intermediate buffer for temporarily storing received input data.
In most current RIP systems, a bottleneck is encountered when rendering (processing) and, in particular, scaling and/or rotating, images. Color correction may also play a significant role in slowing down the processing of images. The time for rasterizing pages including simple text and graphics is dominated by fonts and/or complex graphics not already in the cache. A font is unique if the combination of the font name, style, and transformation is unique. An image is unique if the combination of the file location (assuming file contents remain fixed) and the scale and rotation portions of the transformation is unique. A piece of complex graphics is uniquely identified by a corresponding sequence of PostScript instructions (except when the set of instructions contains conditions that cannot be resolved in early binding). Repeatedly occurring complex graphics are uncommon and, unless explicitly identified, hard to recognize.
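By way of illustration only, the uniqueness criteria described above for fonts and images may be sketched as hashable cache keys. The names `FontKey`, `ImageKey`, and `lookup_or_rasterize`, as well as the representation of the transformation as a tuple, are assumptions introduced for this sketch and do not appear in any particular RIP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontKey:
    """Hypothetical key: a font is unique per (name, style, transformation)."""
    name: str
    style: str
    transform: tuple  # assumed form: flattened 2x2 matrix (a, b, c, d)


@dataclass(frozen=True)
class ImageKey:
    """Hypothetical key: an image is unique per (file location, scale, rotation).

    Assumes the file contents at file_location remain fixed.
    """
    file_location: str
    scale: tuple          # assumed form: (sx, sy)
    rotation_deg: float


def lookup_or_rasterize(cache, key, rasterize):
    """Return the cached raster for key, rasterizing only on a cache miss."""
    if key not in cache:
        cache[key] = rasterize(key)
    return cache[key]
```

Because the keys are frozen dataclasses, two requests for the same image at the same scale and rotation compare equal and share one cache entry, whereas the same image at a different rotation produces a distinct entry and a new rasterization.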
Standard cache management strategies (e.g., Least Recently Used (LRU)) are based on heuristic means of predicting, on average, which cache objects are least likely to be needed or, alternatively, if they are needed, which cache objects will be needed last. Heuristics are needed because, in applications unlike the present variable data printing application, a computer program's resource needs (typically its needs for specific pages of memory) cannot be predicted without essentially executing the program. Some small amount of look-ahead may be performed, especially in straight-line code. However, in practice, very little information about future requests is available. Importantly, a significant amount of information about future requests is available in variable data applications. However, conventional cache management strategies are not capable of benefiting from this look-ahead data.
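The difference between a heuristic strategy and one that uses look-ahead data may be illustrated with a minimal sketch. The look-ahead policy shown is the well-known "evict the object whose next use is farthest in the future" rule, used here purely as an example of what known future requests make possible; it is not asserted to be the method of any particular system. The function names and the sample request sequence are assumptions.

```python
from collections import OrderedDict


def lru_misses(requests, capacity):
    """Count cache misses under Least Recently Used eviction."""
    cache = OrderedDict()
    misses = 0
    for obj in requests:
        if obj in cache:
            cache.move_to_end(obj)         # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[obj] = True
    return misses


def lookahead_misses(requests, capacity):
    """Count misses when the full request sequence is known in advance.

    On eviction, discard the object whose next use is farthest away
    (or that is never used again).
    """
    cache = set()
    misses = 0
    for i, obj in enumerate(requests):
        if obj in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            future = requests[i + 1:]

            def next_use(candidate):
                if candidate in future:
                    return future.index(candidate)
                return float("inf")        # never needed again

            cache.remove(max(cache, key=next_use))
        cache.add(obj)
    return misses


# Hypothetical job: a logo repeated between one-off page elements.
requests = ["logo", "a", "b", "logo", "c", "d", "logo"]
print(lru_misses(requests, 2))        # 7
print(lookahead_misses(requests, 2))  # 5
```

In this sample sequence with a two-entry cache, LRU repeatedly evicts the logo just before it is needed again, whereas the look-ahead policy retains it and evicts the single-use elements instead.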
Conventionally, caches used within variable data systems rely on heuristics (probability of future need) or user-supplied information for deciding whether to cache information. No conventional cache implements a system in which information that is constant (repeated) throughout the presentation (e.g., a corporate logo and other slide background information) is identified and rasterized in advance of its first use. Therefore, conventional cache managers do not pre-fetch resources (e.g., fonts, transformed images, etc.) into the cache, nor do they have a good mechanism for determining what resources to pre-fetch. Consequently, the time for processing pages requiring new resources is not optimized, and the efficiency and throughput of a corresponding printing system are reduced.
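The capability that conventional caches are described as lacking, namely identifying repeated content and rasterizing it before its first use, may be sketched as follows. This is an illustrative sketch only: the representation of a job as a list of pages, each a list of element identifiers, and the names `find_repeated` and `prefetch` are assumptions made for this example.

```python
from collections import Counter


def find_repeated(pages):
    """Return elements appearing on more than one page, in first-seen order."""
    counts = Counter()
    first_seen = []
    for page in pages:
        for element in dict.fromkeys(page):  # de-duplicate within a page
            if element not in counts:
                first_seen.append(element)
            counts[element] += 1
    return [e for e in first_seen if counts[e] > 1]


def prefetch(pages, rasterize):
    """Rasterize every repeated element ahead of page processing."""
    return {element: rasterize(element) for element in find_repeated(pages)}


# Hypothetical three-slide job sharing a logo and a background.
pages = [
    ["logo", "background", "title-1"],
    ["logo", "background", "chart"],
    ["logo", "title-3"],
]
print(find_repeated(pages))  # ['logo', 'background']
```

Elements that occur on only one page are left out of the pre-fetch set, so cache space is spent only on content that will be reused.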
Furthermore, although the concept of a cache has been used for speeding up serial processing of document data, parallel processing has not been utilized by cache managers within a variable data system.
The present invention provides a new and improved apparatus and method which overcome the above-referenced problems and others.