The present invention relates to data compression and, more particularly, to a method and apparatus for compressing raster data.
As monochrome laser printer technologies improve, printers can create pages with higher image resolution. The amount of raster data needed to be sent to a printer to print an image also increases, adversely affecting printer performance. When an image is scaled and dithered before it is sent to a printer, an 8 by 10 inch raster image requires 0.9 Mbytes of data for a 300 DPI device, 3.6 Mbytes for a 600 DPI device, and 14.4 Mbytes of data for a 1200 DPI device.
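The figures above follow directly from page size, resolution, and one bit per dithered pixel. A minimal sketch of the arithmetic, where the function name `raster_bytes` is illustrative and not part of the disclosure:

```python
def raster_bytes(width_in, height_in, dpi, bits_per_pixel=1):
    """Uncompressed raster size in bytes for a width_in x height_in inch image."""
    pixels = (width_in * dpi) * (height_in * dpi)
    return pixels * bits_per_pixel // 8

# 8 by 10 inch dithered monochrome image at each device resolution
for dpi in (300, 600, 1200):
    print(dpi, "DPI:", raster_bytes(8, 10, dpi) / 1_000_000, "Mbytes")
```

Each doubling of resolution quadruples the data, which is why the 1200 DPI case is sixteen times the 300 DPI case.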
Some printers and drivers, such as monochrome Hewlett-Packard LaserJet® printers, are capable of accepting either color raster data or dithered monochrome raster data to print an image. If color data is sent, the printer performs scaling and dithering and produces the same image that would have been produced if the driver had scaled and dithered. The choice to send color data versus dithered monochrome data is based on reducing the amount of I/O needed to print the image. Color images have more data per pixel, but scaled and dithered monochrome images are usually higher resolution, in this case 600 or 1200 DPI. If a source color image is of low resolution, or is significantly scaled up, then there is an I/O advantage in sending a color or grayscale version of the source image and letting the printer perform scaling and dithering. In practice, the source color image is usually converted to an 8 bit per pixel grayscale image by the driver. If the source color image is of high resolution or if it is not significantly scaled, then there will be less I/O if the printer driver performs dithering and scaling, and sends the printer a monochrome image that is ready to print. Some laser printers do not have the capability to accept color or grayscale data, so scaling and dithering are always performed on the host by the driver.
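The trade-off described above can be illustrated by comparing uncompressed sizes. This is only a sketch under stated assumptions: the function names are hypothetical, compression is ignored, and the source is taken to be 8 bit per pixel grayscale.

```python
def grayscale_bytes(src_w_px, src_h_px):
    """Size of the source image sent as 8 bit per pixel grayscale."""
    return src_w_px * src_h_px

def dithered_bytes(out_w_in, out_h_in, dpi):
    """Size of the host-dithered image at 1 bit per device pixel."""
    return (out_w_in * dpi) * (out_h_in * dpi) // 8

def cheaper_to_send(src_w_px, src_h_px, out_w_in, out_h_in, dpi):
    """Return which form needs less I/O, ignoring any compression."""
    gray = grayscale_bytes(src_w_px, src_h_px)
    mono = dithered_bytes(out_w_in, out_h_in, dpi)
    return "grayscale" if gray < mono else "dithered"

# Low resolution source (72 pixels per inch, 5 x 5 inches) on a 1200 DPI device
print(cheaper_to_send(360, 360, 5, 5, 1200))
# High resolution source (600 pixels per inch, 5 x 5 inches) on a 600 DPI device
print(cheaper_to_send(3000, 3000, 5, 5, 600))
```

In the first case the grayscale source is 129,600 bytes against 4,500,000 bytes of dithered data; in the second the grayscale source is 9,000,000 bytes against 1,125,000 bytes of dithered data.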
There are several compression options available to compress source images or dithered monochrome images. Most are variants of tagged image file format (TIFF revision 4) and delta row compression. The available versions of delta row compression are not effective for dithered monochrome images because each row is made different from the previous row by virtue of the dither matrix. The run length encoded TIFF compression methods are the best alternative currently available, and they work best when the dithered image is significantly scaled up or when the resulting image has a lot of area that is of the same color, such as business graphics and charts.
Scaling an image affects its compressibility because one source pixel produces many similar output pixels, and the compressibility using a TIFF scheme relies on the similarity of adjacent pixels. It should also be noted that scaling is not done just to change the physical dimensions of an image; scaling is also used to convert image resolution. For example, if a source image has a resolution of 72 pixels per inch and has a dimension of 5 inches by 5 inches, it will contain 129,600 pixels. When the image is scaled and dithered for a 1200 DPI monochrome printer, and printed as a 5 inch by 5 inch image, it will be scaled to 36,000,000 pixels.
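The effect of scaling on run-length compressibility can be seen on a single undithered row. The sketch below assumes scaling by simple pixel replication and uses a generic run-length count rather than the actual TIFF encoding; `replicate` and `run_lengths` are illustrative names.

```python
from itertools import groupby

def replicate(row, factor):
    """Scale a row by repeating each source pixel `factor` times."""
    return [value for value in row for _ in range(factor)]

def run_lengths(row):
    """Collapse a row into (value, run length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(row)]

src = [0, 1, 1, 0, 1]
scaled = replicate(src, 4)

print(len(src), "pixels in", len(run_lengths(src)), "runs")
print(len(scaled), "pixels in", len(run_lengths(scaled)), "runs")

# Pixel counts from the example above: 72 pixels per inch versus 1200 DPI at 5 x 5 inches
print((72 * 5) ** 2, (1200 * 5) ** 2)
```

The number of runs stays the same while the number of pixels grows, so each run covers more pixels after scaling.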
The most problematic raster images printed on monochrome laser printers are high resolution images similar to scanned photographs. What makes these images difficult to compress is their high frequency components. Experimentation has shown that low to moderate resolution scanned photos, business graphics, line art, and small high resolution images do not pose much of a performance challenge to LaserJet® printers. Large, high resolution images that resemble photographs offer a considerable challenge due to the amount of I/O sent to the printer. Sources of high resolution raster images are becoming more prevalent with the increased use of the internet and the availability of inexpensive scanners, digital cameras, video capture computer accessories, and photo scanning services. The present invention addresses the performance challenge posed by high resolution, photograph-like raster images.
Microsoft Windows® (Windows is a registered trademark of Microsoft Corporation) defines a set of raster operations (ROPs) that are used to combine raster images. Hewlett-Packard LaserJet® printers are capable of performing ROPs even if the image is transformed to monochrome data by scaling and dithering on the host computer prior to being sent to the printer. To be ROP compatible, all transformation operations from a source DIB to a destination monochrome bitmap must be deterministic on a pixel by pixel basis so that exact results are obtained any time a source pixel is transformed. Consider an example where an application program uses an XOR (exclusive or) ROP to create a white window in the center of a raster image. First, the application program sends the entire color image to the driver where the color data is scaled and dithered into monochrome data. This dithered image is sent to the printer and no record of it or the source color image remains available to the driver. Next, the application selects a subset of the same image and sends it to the printer. The driver dithers and scales this subset image. For the XOR ROP to be successful, each pixel of dithered data from the image subset must exactly match the pixels of the original image within the bounds of the image subset, so each source image pixel must undergo the exact same scaling and dithering to produce a dithered raster segment where each destination dot of the image subset exactly overlays the destination dots of the original image. "Neighborhood" operations where the value of a destination pixel may depend on the value of a neighboring pixel will not be ROP compatible if the neighboring pixels are not available to be used in the subsequent ROP.
Examples of other transformations that use neighborhood operations are error diffusion dithering, blue noise dithering, and bi-linear scaling. Lossy compression techniques must also be excluded if we wish to retain ROP compatibility. Although ROPs are defined for raster data, they are seldom used by application programs. In the above example, the white rectangle could have been accomplished by sending a rectangle of white raster with an opaque drawing mode.
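To illustrate why a neighborhood operation breaks this requirement, the sketch below uses a deliberately simplified one-dimensional error diffusion in which each pixel's quantization error is carried to the next pixel. It is an illustration only, not a published error diffusion kernel, and the function name and threshold are assumptions.

```python
def error_diffuse_row(row, threshold=128):
    """Simplified 1-D error diffusion over 0..255 gray values (1 = output dot on)."""
    out, error = [], 0
    for value in row:
        value += error                    # result depends on earlier neighbors
        bit = 1 if value >= threshold else 0
        error = value - 255 * bit         # carry the quantization error forward
        out.append(bit)
    return out

row = [100] * 8
full = error_diffuse_row(row)             # whole row dithered in one call
part = error_diffuse_row(row[3:6])        # the same pixels dithered as a subset

print(full[3:6], part)
```

The subset starts with no accumulated error, so its dots differ from the corresponding dots of the full row, and an XOR of the two would not produce a clean window.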
Historically, ROPs have provided Windows® with fast, easy ways to manipulate raster graphics on a display screen. The Windows® GUI is successful, in part, because it provides a means to write to any device using the same set of graphical commands (GDI). Every Windows® compatible raster device supports ROPs either directly or by letting GDI provide the support. Regardless of the frequency of use, since ROPs are defined and available for application programs to use, a functionally complete printer driver must support ROPs.
Another example that illustrates the need to avoid neighborhood operations and remain deterministic is the fact that some application programs send image data in segments. An image may be split horizontally or vertically into segments as small as one pixel. One reason that application programs may fragment raster images is that Windows® does not yet support arbitrary raster rotation, so applications rotate images prior to sending them to the driver. Some applications process the rotation in segments, and send each segment to the driver as it is completed, letting the printing process assemble the final image from the segments. Single pixel calls are sometimes found at the tip and the base of a rotated image. These image segments may be scaled and dithered by the printer driver before being sent to the printer, and each image segment must seamlessly join adjacent segments to make the final printed image. Neighborhood operations may cause undesirable artifacts to appear at the borders of the image segments.
The present invention uses clustered-dot ordered dithering so that it is deterministic and supports ROPs as well as seamless image construction. See Ulichney, R., Digital Halftoning, MIT Press, Cambridge, Mass., 1987. This determinism refers to the fact that a given source pixel will create a specific group of destination pixels as a function of scaling and dithering, and those destination pixel values are solely determined by the value of the single source pixel.
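A minimal sketch of this determinism follows. It assumes scaling by pixel replication, an illustrative 4 by 4 clustered-dot threshold matrix (not the matrix of the invention), and a convention in which 1 denotes a black dot. The names `CLUSTER` and `dither_region` are hypothetical. The sub-rectangle is addressed by its absolute destination coordinates, so that its dither phase matches the full image.

```python
# Illustrative 4x4 clustered-dot threshold ranks (an assumption, not the patent's matrix).
CLUSTER = [
    [12,  5,  6, 13],
    [ 4,  0,  1,  7],
    [11,  3,  2,  8],
    [15, 10,  9, 14],
]

def dither_region(src, scale, x0, y0, x1, y1):
    """Scale by replication and dither destination columns x0..x1-1, rows y0..y1-1.

    Each destination dot depends only on one source pixel and on its own
    absolute destination coordinates, never on neighboring pixels.
    """
    out = []
    for y in range(y0, y1):
        row = []
        for x in range(x0, x1):
            gray = src[y // scale][x // scale]               # the single source pixel
            threshold = (CLUSTER[y % 4][x % 4] + 0.5) * 16   # rank mapped onto 0..255
            row.append(1 if gray < threshold else 0)         # 1 = black dot (assumed)
        out.append(row)
    return out

src = [[30, 200, 90], [160, 60, 240]]    # tiny 3 x 2 grayscale image, 0 = black
scale = 3
full = dither_region(src, scale, 0, 0, 9, 6)

# Dither a sub-rectangle in a separate call, as a later driver call would.
x0, y0, x1, y1 = 2, 1, 7, 5
sub = dither_region(src, scale, x0, y0, x1, y1)
assert sub == [row[x0:x1] for row in full[y0:y1]]

# XOR the subset onto the page: identical dots cancel, leaving a white window.
page = [row[:] for row in full]
for dy, row in enumerate(sub):
    for dx, dot in enumerate(row):
        page[y0 + dy][x0 + dx] ^= dot
assert all(dot == 0 for row in page[y0:y1] for dot in row[x0:x1])
```

Because the segment reproduces the full image's dots exactly, the same property lets separately dithered segments join without visible seams.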
In order to accomplish the present invention there is provided a method for compressing raster images. The method is implemented by collating sub-pieces of the image into at least one sub-string. Each sub-string is indexed to encode any predicted runs and literal runs. The resulting index sub-strings are compressed using a lossless compression method.
The step of collating may also determine, for each line of the raster image, if the raster image represents text or a halftoned image. If the raster image represents text, then the sub-pieces are sequentially placed into one sub-string. Alternatively, if the raster image represents the halftoned image, then the sub-pieces are interleaved into multiple sub-strings.
There is also provided a method of decompressing compressed data into a raster image. The compressed data is decompressed using a lossless decompression method, into a plurality of index sub-strings. The plurality of index sub-strings are decoded by using an index table to convert predicted runs and literal runs into data in a sub-string. The sub-strings are interleaved to recreate the raster data.
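The collating and re-interleaving steps can be sketched as a round trip. Several details here are assumptions made only for illustration: a sub-piece is taken to be one byte, the number of sub-strings is chosen by the caller, `zlib` stands in for the unspecified lossless compression method, and the indexing of predicted and literal runs is omitted entirely. The function names are hypothetical.

```python
import zlib

def collate(row, n):
    """Deal the bytes of a raster row round-robin into n sub-strings.

    With n == 1 the bytes are placed sequentially into a single sub-string.
    """
    return [row[i::n] for i in range(n)]

def uncollate(subs):
    """Interleave the sub-strings back into the original row."""
    n = len(subs)
    out = bytearray(sum(len(s) for s in subs))
    for i, s in enumerate(subs):
        out[i::n] = s
    return bytes(out)

def compress_row(row, n):
    # zlib is a stand-in for the lossless method; the run indexing step is skipped.
    return [zlib.compress(s) for s in collate(row, n)]

def decompress_row(blobs):
    return uncollate([zlib.decompress(b) for b in blobs])

# A row whose bytes repeat with a period of 4, loosely imitating a halftone pattern
row = bytes([0x0F, 0xF0, 0x3C, 0xC3] * 50)
assert collate(row, 4)[0] == bytes([0x0F]) * 50   # each sub-string becomes one long run
assert decompress_row(compress_row(row, 4)) == row
```

When the stride matches the period of the pattern, each sub-string collapses into a long run of identical bytes, which is the kind of regularity a run-based encoding can exploit.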