This invention generally relates to printing devices with memory. More particularly, this invention relates to printing devices that compress rendered data of a to-be-printed page and decompress it before printing.
Page printers typically capture an entire page before any image is placed on paper. A typical page printer is a laser printer. A page printed by a laser printer may include one or more of the following elements: "text", "graphics", "halftone" images, or "natural" images.
Data Types
"Text" typically consists of letters, numbers, words, sentences, and paragraphs. Text normally has a font associated with it. The symbols in text are typically represented by codes that direct the printer to generate a rendered version over a given area on a page.
"Graphics" are typically computer-generated images. They usually have sharp edges and sharp color transitions. Graphics may have subtle shadings, but typically include blocks of solid colors. Line-art is a type of graphic consisting entirely of lines, without any shading.
A "halftone" image is typically a printed reproduction of a photograph, using evenly spaced spots of varying diameter to produce apparent shades of gray or color. The darker the shade at a particular point in the image, the larger the corresponding spot (i.e., cell) in the halftone. Newspapers typically print photographs using halftones. With a desktop laser printer, each halftone spot is represented by an area containing a collection of dots.
"Natural" images are digitized images typically captured from the real world. These images are typically captured by a scanner, a digital camera, or frame-grabs of a video signal. They are often digitized representations of photographs. They may be a digitized representation of a document. Unlike halftone images with variable cell sizes, natural images use actual shades of gray and shades of color.
Rendering an Entire Page
In laser printers, either a host computer or the printer itself formats pages containing text, graphics, halftone images, and/or natural images. Since a laser printer's print engine operates at a constant speed, new rendered (i.e., "rasterized") data must be supplied to the print engine at a rate that keeps up with the engine's operating speed.
Typically, a laser printer buffers a full raster bitmap of an entire page so that the print engine always has rendered data awaiting action. Alternatively, a laser printer may store only portions of a page and print them. While the printer is printing portions of a page, it is rendering the next portions of the page.
Mopying
It is desirable for a laser printer to store the rendered data of an entire page before actually printing. One advantage of storing the fully rendered page is the ability to efficiently print multiple original copies ("mopying").
A printer mopies by storing one copy of a page, but printing multiple copies of it. The printer does this without receiving additional copies of the page from the host computer. Dataquest, a San Jose, Calif.-based research group, estimates that forty-three percent of laser-printer users are already producing multiple original prints with their printers. Printer manufacturers prefer to produce printers with mopying capability because it provides a cost-effective, efficient, and timesaving alternative to copying.
Color Conversion
Another advantage of storing the fully rendered page is efficient color conversion. The color standard for incoming data is RGB (Red-Green-Blue). RGB color data is typically represented by twenty-four bits consisting of three color components with eight bits devoted to each. Since each color component is represented by one byte, each component may have 256 discrete levels. Mixing levels of RGB results in 16.7 million possible color combinations for each "dot".
A "dot" is the smallest addressable and printable element. Herein, the terms "dot" and "pixel" are used interchangeably. Each dot has a value that is associated with a color. The size of that value, in bits, determines the number of possible colors of the dot. This is called "color depth." Examples of color depth include:
monochrome (1 bit of information per dot)
grayscale (8 bits of information per dot)
color (RGB) (8 or 16 bits of information per dot)
true color (RGB) (24 or 32 bits of information per dot)
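The relationship between color depth and the number of representable values per dot can be sketched as follows. This is only an illustration of the arithmetic; the function name `color_count` is hypothetical and does not come from any printer software.

```python
def color_count(bits_per_dot: int) -> int:
    """Number of distinct values a dot can take at the given color depth."""
    return 2 ** bits_per_dot


# A few of the color depths listed above.
for label, bits in [("monochrome", 1), ("grayscale", 8), ("true color (RGB)", 24)]:
    print(f"{label}: {bits} bits per dot -> {color_count(bits):,} possible values")
```

At 24 bits per dot this gives 16,777,216 values, which is the "16.7 million" figure cited for RGB data.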
The standard for physically printing actual color on paper is not RGB. Rather, it is CMYK (Cyan-Magenta-Yellow-Black). CMYK is a color model in which all colors are described as a mixture of these four process colors. CMYK is the standard color model used in offset printing for full-color documents and in typical desktop color laser printers. Because such printing uses pigments (such as ink or toner) of these four basic colors, it is often called four-color printing.
Most desktop color laser printers produce their best output when receiving RGB data, rather than CMYK data. The color laser printers include mature technology that automatically, effectively, and accurately converts RGB data to CMYK print results.
The rendered page for one of the color components is called a "color plane." For RGB data, it is desirable for a color laser printer to store all three color planes of each page. For example, a page where only the red component of the RGB data is rendered is the red color plane of that page.
Unlike color ink-jet printers, the print engine of a color laser printer cannot "pause" and wait for data while the printer is processing it. If the engine of a laser printer runs out of data, a "laser underrun" error is generated. To ensure that the color laser printer has fully processed to-be-printed color data, the printer converts the three color planes of the RGB data to CMYK data before sending the data to the print engine. By concurrently storing the rendered data of all three RGB color planes of a page, the printer quickly and efficiently converts RGB data into CMYK to send to the print engine for printing without pauses.
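To see why all three RGB planes must be available at once, consider a textbook RGB-to-CMYK conversion. The sketch below uses the naive formula with full black replacement. It is an assumption chosen for illustration and is not the conversion used by any particular printer, which would normally rely on calibrated color tables. The function name `rgb_to_cmyk` is hypothetical.

```python
def rgb_to_cmyk(r: int, g: int, b: int) -> tuple[float, float, float, float]:
    """Naive conversion of one dot from RGB (0-255) to CMYK (0.0-1.0)."""
    rf, gf, bf = r / 255, g / 255, b / 255
    k = 1 - max(rf, gf, bf)          # black is set by the brightest component
    if k == 1.0:                     # pure black: avoid dividing by zero
        return 0.0, 0.0, 0.0, 1.0
    c = (1 - rf - k) / (1 - k)
    m = (1 - gf - k) / (1 - k)
    y = (1 - bf - k) / (1 - k)
    return c, m, y, k


print(rgb_to_cmyk(255, 0, 0))  # pure red
```

Every output component depends on the red, green, and blue values of the same dot, so the converter needs the corresponding dot from each of the three color planes at the same time.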
Printer Memory Requirements
Memory requirements of a laser printer increase as the dots-per-inch (dpi) resolution and the color depth increase. Black-and-white (b&w) laser printers typically have a one-bit color depth. Such printers from a few years ago had a resolution of three hundred (300) dpi. These printers needed approximately one megabyte (MB) of raster memory for each letter-sized (8.5″×11″) page. With a 600 dpi b&w printer having one-bit color depth, approximately 4 MB of memory is required. At one extreme, a color laser printer having a 1200 dpi resolution and a thirty-two (32) bit color depth requires approximately 540 MB of raster memory to store one entire letter-sized page.
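These figures follow from multiplying the number of dots on the page by the color depth. A minimal sketch, assuming the full sheet is rasterized with no margins; the function name `raster_bytes` and its defaults are illustrative only:

```python
def raster_bytes(dpi: int, bits_per_dot: int,
                 width_in: float = 8.5, height_in: float = 11.0) -> int:
    """Bytes needed to hold one uncompressed raster page."""
    dots = round(width_in * dpi) * round(height_in * dpi)
    return dots * bits_per_dot // 8


print(raster_bytes(300, 1))    # 1051875   (about 1 MB)
print(raster_bytes(600, 1))    # 4207500   (about 4 MB)
print(raster_bytes(1200, 32))  # 538560000 (about 540 MB)
```

Doubling the resolution quadruples the dot count, which is why memory grows so quickly from one printer generation to the next.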
It seems that each successive generation of color laser printers produces sharper and more colorful output. In large part, this is a result of greater resolution and greater color depth. Therefore, there is an apparent need to have a large raster memory in a color laser printer.
In addition to the above reasons, speed is another reason for more memory in a printer. A printer needs additional memory to print a series of pages as fast as possible. To avoid print engine idle time and to run the print engine at its rated speed, printers need additional raster memory to rasterize and store successive pages. Without additional memory, composition of a subsequent page cannot begin until the present page has been printed.
Despite the technological reasons to maximize the raster memory on a color printer, manufacturers prefer to minimize the memory to remain cost competitive. Therefore, substantial effort is directed to reducing the amount of required memory in a laser printer. To reduce the amount of required memory in a laser printer, many conventional printers employ general-purpose data compression techniques.
Data Compression to Minimize Memory Requirements
Generally, data compression techniques encode a stream of digital data signals into compressed digital code signals and decode the compressed digital code signals back into the original data. Data compression refers to any process that attempts to convert data in a given format into an alternative format requiring less space than the original.
Generally, in order for data to be compressible, the data must contain redundancy. Compression effectiveness is determined by how successfully the compression procedure uses the inherent redundancy in the original data to compress the data. In data containing text, redundancy occurs both in the non-uniform usage of individual symbols (e.g., digits, bytes, and characters) and in frequent reoccurrence of symbol sequences (e.g., commonly used words, blanks, and white space). In data containing graphics, redundancy occurs in blocks of colors, blocks of color patterns, and white spaces.
The objective of data compression techniques is to effect a savings in the amount of storage required to hold a given body of digital information. Data compression techniques are divided into two general types: "lossless" and "lossy."
Lossless Data Compression
Using lossless techniques, compressed data may be re-expanded back into its original form without any loss of information. The decoded and original data must be identical and indistinguishable with respect to each other.
Lossless data compression techniques are often used with text because the loss or modification of any data is generally considered unacceptable. Otherwise, a letter "A" may turn into a letter "j" or the number "2" may change into a punctuation mark. Blank space may fill with apparently random characters.
Likewise, lossless data compression techniques are often used with graphics for similar reasons. However, instead of letters and numbers changing if there were data loss during a conversion, graphics would have changes in their colors, lines, and dots. If a lossy technique were used, a straight line might become crooked and a solid color block might get speckles of other colors. Generally, for both text and graphics, lossless compression is used because humans will notice any difference between the original and the decompressed versions of the same data. The differences are so apparent that people may notice "mistakes" without a side-by-side comparison of the original and decompressed versions.
Examples of lossless data compression procedures are 1) the Huffman method, 2) the Tunstall method, and 3) the Lempel-Ziv (LZ) method. Each of these procedures effectively compresses/decompresses text-based data with no loss of data. In addition, each effectively compresses/decompresses graphics-based data with no loss of data. These procedures are well known by those of ordinary skill in the art.
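The defining property of a lossless procedure, that the decoded data is identical to the original, can be checked directly. The sketch below uses Python's standard `zlib` module, whose DEFLATE format combines an LZ77-style dictionary stage with Huffman coding. It is chosen here only as a readily available stand-in for the methods named above, and the sample text is made up for the example.

```python
import zlib

# Repetitive text-like data: the kind of redundancy lossless coders exploit.
text = b"the quick brown fox jumps over the lazy dog. " * 50

packed = zlib.compress(text)
restored = zlib.decompress(packed)

# Lossless: the round trip reproduces every byte of the original.
assert restored == text
print(f"{len(text)} bytes -> {len(packed)} bytes")
```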
Lossy Data Compression
However, other types of data do not need a compression technique that ensures that the decompressed and the original data are identical. Because the human eye is relatively insensitive to small amounts of noise, some alteration or loss of information during the compression/decompression process is acceptable. This loss of information gives the "lossy" data compression techniques their name.
Lossy data compression techniques are often used with natural images because the human eye does not notice the small differences between the original and the decompressed versions of the data. People may have difficulty noticing any difference even when comparing the original and decompressed versions side-by-side. However, some implementations of lossy compression techniques can generate artifacts. Artifacts are unwanted effects such as false color and blockiness. Herein, it will be assumed that when data is lossily compressed, such compression is done in a manner to minimize artifacts.
Lossy compression techniques have one significant advantage over lossless techniques. With specific types of data, lossy techniques typically compress data much better than lossless techniques. The effectiveness of a data compression technique is measured by its "compression ratio." The compression ratio is the size of the data in uncompressed form divided by its size in compressed form.
An example of a lossy data compression procedure is the JPEG (Joint Photographic Experts Group) method. JPEG can reduce file sizes to about five percent of their original size. JPEG is particularly effective in compressing and decompressing natural-image data. JPEG is well known by those of ordinary skill in the art.
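The compression ratio defined above is a simple quotient. As a small worked example, a reduction to five percent of the original size corresponds to a ratio of 20 to 1. The function name `compression_ratio` is illustrative only.

```python
def compression_ratio(uncompressed_size: int, compressed_size: int) -> float:
    """Uncompressed size divided by compressed size; higher is better."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return uncompressed_size / compressed_size


# Data reduced to 5% of its original size: 100 units become 5 units.
print(compression_ratio(100, 5))  # 20.0
```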
Halftone Images and Lossless Data Compression
Halftone images represent a special case. Since a halftone image is typically a photograph and includes apparent shades of gray or color, it seems logical to compress it in the same manner as a natural image rather than as text or graphics. Therefore, such a halftone image seems to be best compressed by a lossy data-compression engine. However, that is not so. Halftone images are best compressed using a lossless engine.
Typically, halftone images include repetitive groupings of pixels (i.e., halftone cells). Since lossless engines work best on data that includes redundancies, the repetitive nature of halftone images makes them better suited to a lossless engine than to a lossy engine. In addition, a loss of data in a decompressed version of a halftone image may produce noticeable blockiness.
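The effect of repetition on a lossless engine can be demonstrated with synthetic data. In the sketch below, a short byte pattern tiled many times stands in for repeating halftone cells, and random bytes stand in for data with little redundancy. Both are assumptions made for illustration and are not real page data, and `zlib` is used only as a convenient lossless compressor.

```python
import random
import zlib

random.seed(0)  # make the example repeatable

# Hypothetical 8-byte "halftone cell", repeated to fill a strip.
cell = bytes([0, 0, 255, 255, 0, 255, 255, 0])
tiled = cell * 4096

# Same amount of data, but with no repeating structure.
noise = bytes(random.getrandbits(8) for _ in range(len(tiled)))

tiled_ratio = len(tiled) / len(zlib.compress(tiled))
noise_ratio = len(noise) / len(zlib.compress(noise))

print(f"tiled pattern ratio: {tiled_ratio:.1f}")
print(f"random data ratio:   {noise_ratio:.2f}")
```

The tiled data compresses by a large factor, while the random data does not shrink at all, which mirrors why lossless engines suit halftones and struggle with less redundant data.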
Selection of the Best Data Compression Technique
Data compression allows a laser printer to have a reduced memory requirement. How much that requirement is reduced depends upon the data compression technique or techniques used.
Some conventional laser printers use only one data compression technique. Nearly universally, this is a lossless compression technique because it works on all data types without any fear of data loss. However, lossless techniques generally do a poor job compressing natural images. In other words, lossless techniques have a low compression ratio for natural images.
Some conventional printers compromise and use both lossless and lossy compression techniques. This compromise offers the possibility of maximum data compression for a page, thereby minimizing the memory requirements. However, such a compromise introduces difficulties in determining when and how to select a particular compression engine.
Two-Pass Page Compression
One conventional solution is a "two-pass" page compression. A first pass over a page is made using a lossless compressor. If the resulting compressed data does not meet some predetermined size expectation, then a new pass is made over the uncompressed data using a lossy compressor.
This method may cause the same data to be compressed twice (once losslessly and once lossily), which can make it processor intensive. It may also miss potential memory savings: if the losslessly compressed size falls under the expected size, no lossy pass is attempted, even where a lossy pass would have produced a smaller result.
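The two-pass control flow can be sketched as follows. This is a minimal illustration under stated assumptions: `zlib` stands in for the lossless compressor, and the lossy compressor is a crude stand-in that discards the low-order bits of each byte before deflating (a real printer would more likely use something like JPEG). The names `lossy_compress`, `two_pass_compress`, and `size_limit` are hypothetical.

```python
import zlib


def lossy_compress(data: bytes, keep_bits: int = 4) -> bytes:
    """Stand-in lossy compressor: drop low-order bits, then deflate."""
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    return zlib.compress(bytes(b & mask for b in data))


def two_pass_compress(page: bytes, size_limit: int) -> tuple[str, bytes]:
    """Try lossless first; fall back to lossy if the result is too large."""
    first = zlib.compress(page)            # pass 1: lossless
    if len(first) <= size_limit:
        return "lossless", first
    # Pass 2 starts over from the uncompressed page, so the work of
    # pass 1 is thrown away.
    return "lossy", lossy_compress(page)


page = bytes(range(256)) * 16
print(two_pass_compress(page, size_limit=10**6)[0])  # lossless
print(two_pass_compress(page, size_limit=0)[0])      # lossy
```

The sketch makes both drawbacks visible: the fallback path compresses the page twice, and the first branch returns as soon as the limit is met without checking whether a lossy pass would have been smaller.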
Page Segmentation
Another conventional solution is to decompose a page into sections (i.e., segments). Each section contains primarily either 1) text, graphics or halftone images; or 2) natural images. Once segmented, each section is compressed using the technique that is best suited for its data type. Sections are identified by an analysis of each pixel on a page.
The segments of a page may be blocks or areas anywhere on a page. In some implementations, the segments may resemble a jigsaw puzzle. In other implementations, the segments may resemble rectangular blocks of various sizes.
Although this page segmentation technique can be an accurate way to select which compressor to use, it has significant problems. Specifically, the problems include 1) improper segment boundary identification; 2) improper segment data-type identification; 3) lack of speed; and 4) improper piecing back together of the "jigsaw" puzzle of segments.
It is difficult to correctly identify segments of a page. The boundaries between differing sections are often incorrectly drawn. Also, the data type of a section is often misidentified. Therefore, the resulting segmentation often produces less-than-ideal results.
The page decomposition and composition process is slow, memory intensive, and processor intensive. Typically, the printer's central processor examines each dot on a page. Based upon each dot's surrounding dots, it estimates that a dot is either part of text, graphics, halftone image, or a natural image. This means that multiple lines of dots must be buffered so that this "windowing" operation can be performed for each dot. Comparing each dot to all of its surrounding dots is a slow process because it requires eight comparisons per dot. For a letter-sized page at 600 dpi, that requires approximately two hundred sixty million (260,000,000) comparisons. After the comparisons, additional calculations are performed before each dot can be classified.
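The scale of the windowing cost can be estimated with simple arithmetic. The sketch below assumes that every dot on the full 8.5 by 11 inch sheet is compared with eight neighbors, ignoring the fewer neighbors at the page edges. The text does not state the exact page area behind its approximate figure, so the result is only expected to be of the same order. The function name `neighbor_comparisons` is hypothetical.

```python
def neighbor_comparisons(dpi: int, width_in: float = 8.5,
                         height_in: float = 11.0, neighbors: int = 8) -> int:
    """Rough count of dot-to-neighbor comparisons for one page."""
    dots = round(width_in * dpi) * round(height_in * dpi)
    return dots * neighbors


print(f"{neighbor_comparisons(600):,}")  # 269,280,000
```

Under these assumptions the count is roughly 270 million, in line with the approximate figure quoted above, and this is before any of the additional per-dot classification work.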
After a page is segmented and each segment is compressed according to its identified data type, the segments are decompressed and pieced back together just before printing the page. During this reconstruction of segments, the "jigsaw pieces" are sometimes improperly put back together. Like the segmentation, this process is also slow, memory intensive, and processor intensive.
Page segmentation cannot be done concurrently with data compression and data decompression. With page segmentation, it is not possible to compress a portion of a page while that portion is being analyzed. With page segmentation, it is not possible to send decompressed data to the print engine immediately upon decompression. Page segmentation requires additional processing to piece decompressed segments back together.
To reduce the memory requirements of a printer, it is desirable to compress rendered data of a to-be-printed page. Specific types of data of a to-be-printed page are best compressed using either lossless or lossy data-compression engines. "Text," "graphics," and "halftone" images are typically best compressed using a lossless engine. "Natural" images (such as photographs) are typically best compressed using a lossy engine.
To select the best compression engine, the fast page analyzer divides the page into multiple horizontal strips of multiple horizontal lines. Each line contains a series of pixels. The fast page analyzer concurrently compresses and analyzes each strip. The analysis of one strip determines which compression engine is used to compress the next strip. In effect, each strip predicts the proper engine to compress the next strip.
When printing a page that has been compressed in that manner, the printer will decompress each strip using its associated decompressor. The pixels and lines are delivered to a print engine as they are decompressed and in the order that the print engine prints the image onto the paper.
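The strip-by-strip scheme described above can be sketched in a few lines. Everything specific in this sketch is an assumption made for illustration: the analysis heuristic (counting distinct gray levels), the threshold, the choice of the lossless engine for the first strip, the stand-in lossy engine, and all function names. The description above does not specify any of these. The sketch also runs compression and analysis one after the other inside a loop, whereas the text describes them as concurrent.

```python
import zlib
from typing import Iterator, List, Tuple


def lossless_compress(strip: bytes) -> bytes:
    return zlib.compress(strip)


def lossy_compress(strip: bytes) -> bytes:
    # Stand-in lossy engine: discard the low 4 bits of each dot, then deflate.
    return zlib.compress(bytes(b & 0xF0 for b in strip))


# Each engine tag maps to its decompressor (both stand-ins use deflate).
DECOMPRESSORS = {"lossless": zlib.decompress, "lossy": zlib.decompress}


def looks_natural(strip: bytes, threshold: int = 64) -> bool:
    """Hypothetical analysis: many distinct levels suggests a natural image."""
    return len(set(strip)) > threshold


def compress_page(lines: List[bytes], lines_per_strip: int = 4) -> List[Tuple[str, bytes]]:
    out = []
    engine = "lossless"  # assumed default for the first strip
    for i in range(0, len(lines), lines_per_strip):
        strip = b"".join(lines[i:i + lines_per_strip])
        data = lossy_compress(strip) if engine == "lossy" else lossless_compress(strip)
        out.append((engine, data))
        # The analysis of this strip selects the engine for the next strip.
        engine = "lossy" if looks_natural(strip) else "lossless"
    return out


def decompress_page(strips: List[Tuple[str, bytes]]) -> Iterator[bytes]:
    # Strips are yielded top to bottom, in the order the engine prints them.
    for engine, data in strips:
        yield DECOMPRESSORS[engine](data)


text_line = bytes([0, 255] * 128)   # two levels, like text or line art
photo_line = bytes(range(256))      # many levels, like a natural image
page = [text_line] * 4 + [photo_line] * 8
print([engine for engine, _ in compress_page(page)])
```

In this example the second strip is the first to contain image-like data, but it is still compressed losslessly because its engine was chosen from the strip before it. Only the third strip switches to the lossy engine, which is the prediction behavior the description implies.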