The image processing pipeline of a printer performs a number of operations upon print data in preparation for printing. These operations include, for example: print data compression, print data decompression, color space conversion, expansion, halftoning, clipping, scaling, rotation and the like. The type of operation performed and the specific order in which the operations will be performed can vary depending upon the type of print data which enters the pipeline, the capabilities of the print engine, and the memory available in the printer. The types of print data which may enter the pipeline include: text, line art, images, and graphics.
In conventional pipeline implementations, the various printing operations are performed by a microprocessor under the control of firmware. Depending upon the type of print data entering the pipeline and the operations necessary to process the print data, a number of possible firmware routines are executed to complete the print data processing operations. Alternatively, some operations may be embedded in an Application Specific Integrated Circuit (ASIC). In any case, the image processing pipeline is often referred to as the "image processor" for the printer, whether it is designed as a large "single" function call entity or as multiple functional entities.
As printers increase in density of dot placement (dots per inch), add gray scale capability (using a set of bits per pixel to define a gray scale level), and include color printing capability (requiring additional bits per pixel over monochrome printing), the time required for the pipeline to process the print data becomes substantial. For example, in color printing the memory required to store the data used to print a page can reach thirty-two times or more the memory required for a monochrome printer of the same resolution. To fully utilize the printing speed of the print engine, the pipeline must process print data fast enough to supply a continuous stream of print data to the print engine, thereby allowing the print engine to print continuously throughout the print job.
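The thirty-two-fold figure can be reproduced with simple arithmetic. The sketch below is illustrative only and rests on assumptions not stated above: a monochrome page at one bit per pixel, a color page at four color planes of eight bits each (32 bits per pixel), and a hypothetical 8.5 x 11 inch page at 600 dots per inch. The function name `page_bytes` is likewise hypothetical.

```python
def page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Bytes needed to hold one uncompressed page raster."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8

# Assumed page geometry and pixel depths (not taken from the text above).
mono = page_bytes(8.5, 11, 600, bits_per_pixel=1)        # 1 bit per pixel
color = page_bytes(8.5, 11, 600, bits_per_pixel=4 * 8)   # 4 planes x 8 bits

ratio = color // mono  # 32 under these assumptions
```

Under these assumptions the ratio depends only on the bits per pixel, so it holds at any resolution; deeper pixels or additional planes would push it above thirty-two.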
As previously mentioned, conventional data pipelines have been implemented using general purpose microprocessors. Although microprocessors have the versatility to be programmed to perform the operations of the data pipeline, the speed at which they perform these operations generally depends directly on the amount of cache memory associated with the microprocessor. In other words, the more cache available, the higher the potential throughput. However, microprocessors with more cache are typically more costly than those with less cache. Therefore, in an effort to cut costs, a low cost printer is often forced to use a microprocessor with a smaller cache, although, generally, some cache is always better than none.
The microprocessor's cache is a specific area of memory generally used for data that must be accessed quickly. A typical use is to store a scan line of print data for image processing operations. A scan line of print data is a one dimensional array of pixel data, and may include as much pixel data as spans a sheet of paper, depending upon the object to be imaged in the print data. For example, a "longrule" is an object which extends across an entire page width, typically about 4500 pixels (or about 1100 words, assuming four pixels per word and eight-bit pixels). Unfortunately, a smaller cache does not always hold an entire scan line of print data, depending upon the cache size and the imaging operations being performed. This is undesirable because most staged image processors perform multiple passes on a scan line. The problem is that if an object's scan line is too large to fit in the cache, then as each stage of the image processor moves along the scan line (executing operations on the print data objects), the cache must throw out the data least recently used (i.e., earlier data from the same scan line) to enable loading of more current data. Consequently, when the next stage begins at the start of the scan line, the data it needs has already been discarded and must be reloaded. Each additional stage (image processing operation) repeats the same pattern until all stages have completed their processing of the full scan line of data. As such, when a long scan line of data is conventionally image processed in a smaller cache memory, the overall efficiency and speed of the image processor is detrimentally affected by the inherent cache thrashing (continuous reloading of data).
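The thrashing behavior described above can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions that are not part of the text: the cache is modeled as fully associative with least-recently-used replacement and one word per entry, and every stage sweeps the whole scan line from start to end. The function name `count_misses`, the cache sizes, and the stage count are hypothetical; only the longrule width of 4500 pixels at four pixels per word comes from the description.

```python
from collections import OrderedDict

def count_misses(scanline_words, cache_words, stages):
    """Count cache misses when each stage sweeps the full scan line in turn."""
    cache = OrderedDict()  # word address -> True, ordered oldest to newest
    misses = 0
    for _ in range(stages):
        for addr in range(scanline_words):
            if addr in cache:
                cache.move_to_end(addr)        # hit: mark as most recently used
            else:
                misses += 1                    # miss: word must be (re)loaded
                cache[addr] = True
                if len(cache) > cache_words:
                    cache.popitem(last=False)  # evict the least recently used word
    return misses

longrule_words = 4500 // 4  # 1125 words, at four eight-bit pixels per word

fits = count_misses(longrule_words, cache_words=2048, stages=3)
thrashes = count_misses(longrule_words, cache_words=1024, stages=3)
```

With the assumed 2048-word cache the scan line is loaded once (1125 misses) and reused by the later stages. With the assumed 1024-word cache every access in every stage misses (3375 misses), because each word is evicted before the next stage returns to it.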
Accordingly, an object of the present invention is to provide an improved image processing mechanism and method, especially for a limited cache memory environment.