Presently, digital halftoning is performed by general purpose digital computers or by special purpose digital hardware. Digital halftoning is used, for example, in typesetting devices to prepare plates for printing presses, in xerographic and inkjet desktop printers, in computer CRT displays, and in other imaging devices. Some terms relevant to digital halftoning will be defined as follows:
output device--The printer, image setter, display, or other device on which halftone output is produced.
picture element--The small, colored or monochrome dot which is the atomic unit of visual output of the output device.
output frame--The digital representation of the collection of pixels which the output device images onto the display or onto an output medium such as paper or film. The output frame is not necessarily paginated (i.e., it could be a paper scroll or an internal digital representation of some sort).
pixel--The digital representation of the picture element.
memory--The storage in a computer or other digital device, such as RAM, ROM, disk, or other kinds of volatile or non-volatile storage.
frame buffer--A buffer of memory which holds an output frame.
deliver frame--The process by which the output frame is sent to the output device.
image--A set of digital values representing a picture. The digital values will be called `samples` to distinguish them from the `pixels` in an output frame. Each sample identifies the color or gray scale for a small area of the image. The samples are usually in a raster (though other arrangements such as the Peano scan are possible).
raster--The arrangement of the pixels in an output frame or the samples in an image into a rectangular array of scan lines.
The range of colors for each picture element is often limited by the output device technology to a number of levels far smaller than the range of the human visual system. On many laser printers, inkjet printers, and typesetters, for example, a picture element can have only two colors, that of the toner, ink, or other colorant and that of the underlying paper or other surface. Furthermore, when colorant is produced in a picture element, its size is limited to a single size or to a small number of sizes. In the examples that follow, it will be assumed for definiteness that the colorant is black, and that the underlying surface is white. Therefore, the intermediate colors will be labeled `grays`. The principles described apply to other color combinations.
When picture elements are small, the visual effect of gray can be achieved by controlling the size, number, thickness, and/or transparency (to name several of the possible mechanisms) of the picture elements. If the pattern of picture elements is fine enough, the human viewer will perceive a uniform gray rather than a pattern of individual dots. The shade of gray may be controlled by selecting the percentage of black and white in the dot pattern. The simulation of various shades of gray by aggregating a number of colored dots is called halftoning.
For a bi-valued output device, such as most laser and inkjet printers, the color perceived by a person viewing the output is determined by the fraction of the picture elements in which the colorant is produced. Multi-valued output devices can produce picture elements of different sizes, thicknesses, densities, or transparencies. For these devices, the perceived color is determined by the number of picture elements which have colorant and by the amount of colorant produced in each picture element. However, unless the amount of colorant in each individual picture element can be controlled sufficiently finely to approximate, within the limits of the output device, the full color range of a human viewer, halftoning techniques are still needed to achieve the appearance of intermediate colors. An advantage of a multi-valued output device is that it can achieve the desired gradations of grays in a smaller spatial area than a bi-valued output device.
In addition, some output devices can use two or more different colorants. For example, if one colorant is black and another is gray, then light grays can be produced by halftoning between the background white and the gray colorant, and dark colors can be produced by halftoning between the gray and black colorants.
Halftoning techniques can also be applied on an output device with two or more colorants to produce multi-colored or multi-toned output. Most issues for multi-colored output devices can be explained for a monochrome, bi-valued output device. Some additional issues, such as Moire patterns, must be managed with additional considerations for multi-colored output, but the halftoning techniques used for multi-colored output are essentially identical to those for monochrome output devices. The techniques for halftoning in monochrome devices discussed herein may be extended to multi-valued picture elements, to multi-toned colorants, to multi-colored output devices, and to other kinds of output devices according to practices known in the art.
In a high-level description of how the picture should look, three kinds of content that often occur are graphics (e.g., fills and strokes), glyphs (e.g., text), and images. To convert the high-level description of a graphic or glyph into pixels in the output frame, a mathematical outline (or closed curve) and a gray level are determined for the graphic or glyph. Then the halftone pattern corresponding to the gray level is copied into the pixels which fall within the outline. To convert an image into pixels in the output frame, the sample set of the image is clipped in such a way that a reference image sample value, or `reference gray level`, can be associated with each pixel in the output frame. The final step in this process is to apply a halftoning technique to determine the value of each pixel as a function of its reference gray level and its position in the output frame.
There are a variety of techniques to select a reference gray level for the halftoning process. When a pixel occupies an area in the output frame which has more than one image sample over it, it is usually faster to choose one of these overlying image samples as the reference gray level. This technique is called `point sampling`. Another technique that is often visually superior is to blend or average the different overlying image samples. Two examples of this alternative technique are `linear interpolation` and `bi-cubic interpolation`. A variety of other techniques to determine a reference gray level are known in the art.
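The difference between point sampling and interpolation can be illustrated with a minimal Python sketch. The function names, the list-of-rows image layout, and the choice of bilinear (two-dimensional linear) interpolation are assumptions made for illustration; they are not taken from any particular implementation.

```python
import math

def point_sample(samples, x, y):
    """Pick the single image sample nearest to (x, y), given in sample coordinates."""
    row = min(max(math.floor(y + 0.5), 0), len(samples) - 1)
    col = min(max(math.floor(x + 0.5), 0), len(samples[0]) - 1)
    return samples[row][col]

def bilinear_sample(samples, x, y):
    """Blend the four samples surrounding (x, y) by linear interpolation in x and y."""
    h, w = len(samples), len(samples[0])
    x = min(max(x, 0.0), w - 1.0)          # clamp to the image extent
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = samples[y0][x0] * (1 - fx) + samples[y0][x1] * fx
    bottom = samples[y1][x0] * (1 - fx) + samples[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

samples = [[0, 100],
           [200, 255]]
nearest = point_sample(samples, 0.4, 0.4)      # one sample, no arithmetic on values
blended = bilinear_sample(samples, 0.5, 0.5)   # weighted average of four samples
```

Point sampling performs one memory fetch per pixel, while the interpolating version performs four fetches and several multiplications, which is the speed and quality trade-off described above.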
If different objects in an output frame are positioned next to each other, the visual result will often be superior if a uniform halftoning method is used in all of the objects. Otherwise, the differences in the halftoning process may generate visible defects at the boundaries between the objects. To avoid these defects, particularly if the touching objects are both images or are both graphic objects, the same halftoning method may be used for any object which might be placed at a particular position of the output frame. Some experts choose to differentiate the halftoning method used for images from the halftoning method used for graphic objects because the objectives may be different. For example, the most important objective for an image may be accurate representation of the gray level, whereas the most important objective for a graphic object might be sharply delineating edges.
One digital halftoning technique is threshold halftoning. In threshold halftoning, a reference gray level from an image is determined at each pixel of the output frame. The reference gray level is compared to a reference threshold for that pixel, and a value is assigned to the pixel according to the results of the comparison. The thresholds for the pixel are usually specified by either a spot function or a threshold array, although other means of specification are also possible. With a spot function, the threshold may be specified as a mathematical function of the device position. With a threshold array, the threshold is obtained from an array. A threshold array typically represents a rectangular shape, but a parallelogram or other shape can also be used. In principle, the threshold array can be as large as the output frame itself. However, in most output devices the cost of the memory to hold such a large array would be prohibitive. Therefore, a smaller array is usually treated as a `tile` and is replicated to cover the output frame so that each pixel corresponds to one element of the threshold array.
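One way a spot function might be turned into per-position thresholds is sketched below in Python. The round-dot formula, the ranking step, and all names are hypothetical choices for illustration; other spot functions and other ways of deriving thresholds from them are possible.

```python
def spot_value(u, v):
    """Hypothetical round-dot spot function on the square [-1, 1] x [-1, 1]."""
    return 1.0 - (u * u + v * v)

def thresholds_from_spot(cell, levels=256):
    """Rank the positions of a cell x cell tile by spot value and assign thresholds.

    Positions with the highest spot value receive the lowest thresholds, so they
    are the first to turn black as the reference gray level rises.
    """
    coords = [(x, y) for y in range(cell) for x in range(cell)]

    def key(position):
        x, y = position
        u = (2 * x + 1) / cell - 1     # pixel center mapped into [-1, 1]
        v = (2 * y + 1) / cell - 1
        return spot_value(u, v)

    ordered = sorted(coords, key=key, reverse=True)
    table = [[0] * cell for _ in range(cell)]
    for rank, (x, y) in enumerate(ordered):
        table[y][x] = (rank + 1) * (levels - 1) // len(ordered)
    return table

tile = thresholds_from_spot(4)
```

In this sketch the thresholds near the center of the tile are lower than those at the corners, so a dot grows outward from the center as the gray level increases.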
Although there are different techniques to specify thresholds, for the purpose of definiteness the discussion herein assumes a threshold array is used to determine the thresholds. The following is an example of a threshold array:
108 044 036 100 140 172 204 156
052 004 012 092 220 252 244 188
060 020 028 084 180 228 236 212
116 068 076 124 148 196 164 132
144 176 208 160 112 048 040 104
224 255 248 192 056 008 016 096
184 232 240 216 064 024 032 088
152 200 168 136 120 072 080 128
In this example threshold array, the rows of numbers correspond to a sequence of pixels in the X-direction of the two-dimensional raster, and the columns of numbers correspond to a sequence of scan lines in the Y-direction of the raster. The threshold array is replicated horizontally and vertically to fill the entire output frame. This example threshold array is capable of achieving 65 different halftone patterns or gray levels. In an area uniformly shaded with a reference gray level of 128, this threshold array creates a checkerboard pattern of 4 × 4 pixel black and white rectangles. The black or white rectangles in the pattern may be called `halftone dots`. As the reference gray level decreases from 128 to 0, the black checkers shrink and disappear. Similarly as the gray level increases from 128 to 255, the white checkers shrink to white dots. When applied to an image with a uniform reference gray level, this threshold array creates a spatially uniform dot pattern in which the size of the dots varies according to the reference gray level. If this dot pattern is sufficiently fine (i.e., if it has a high spatial resolution), then a human viewer will see the desired gray instead of the dot pattern. For a device raster of 600 pixels/inch × 600 scan lines/inch, the halftone dot pattern will be a grid with dot centers separated by 1/75th of an inch horizontally and vertically or by 1/106th of an inch diagonally.
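The behavior described for the example threshold array can be checked with a short Python sketch. The function name and the comparison convention (a pixel is black, value 1, when the gray level is not less than the threshold) are assumptions chosen to match the description of the conventional method; the threshold values are those of the example array.

```python
THRESHOLDS = [
    [108,  44,  36, 100, 140, 172, 204, 156],
    [ 52,   4,  12,  92, 220, 252, 244, 188],
    [ 60,  20,  28,  84, 180, 228, 236, 212],
    [116,  68,  76, 124, 148, 196, 164, 132],
    [144, 176, 208, 160, 112,  48,  40, 104],
    [224, 255, 248, 192,  56,   8,  16,  96],
    [184, 232, 240, 216,  64,  24,  32,  88],
    [152, 200, 168, 136, 120,  72,  80, 128],
]

def halftone_uniform(gray, width, height):
    """Halftone a uniformly gray area by tiling the 8 x 8 threshold array."""
    return [[1 if gray >= THRESHOLDS[y % 8][x % 8] else 0
             for x in range(width)]
            for y in range(height)]

# Gray level 128 yields 4 x 4 black and white checkers.
tile_128 = halftone_uniform(128, 8, 8)
for row in tile_128:
    print("".join("#" if bit else "." for bit in row))

# Count the distinct dot patterns reachable over all 8-bit gray levels.
black_counts = {sum(map(sum, halftone_uniform(g, 8, 8))) for g in range(256)}
```

Because the 64 thresholds are all distinct, the number of black pixels in a tile takes every value from 0 to 64, which accounts for the 65 patterns mentioned above.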
A uniform array of halftone dots has both advantages and disadvantages in comparison to a more jumbled array. One advantage is that the threshold array is small. Another advantage is that some lossless data compression of the halftone result may be possible in some situations. One disadvantage of a uniform array is that it is more vulnerable to visible pattern defects (often called Moire patterns) and other patterns which might form as a result of `beating` of the lines of halftone dots against features within the image content. For example, in the example threshold array, the two diagonal black or white dots may differ by one pixel. A human viewer may perceive an undesirable pattern due to this difference.
With a large threshold array, it is possible to create the appearance of a random, jumbled set of halftone dots in which the human viewer detects few or no repeated patterns. An example of this technique is Adobe Brilliant Screens (as described in U.S. Pat. No. 5,590,223). The threshold array is typically a square that is 64 to 2048 pixels on a side. Within the threshold array, many halftone dots of different sizes and irregular placement are formed, creating a jumbled pattern in which combinations of halftone dots do not appear to the human viewer to aggregate into patterns. A jumbled array greatly reduces Moire patterns. However, if a large area of uniform gray is displayed, the human viewer may perceive the repetition of a small tile pattern as a degradation. Therefore the threshold array must be large in order to spatially separate the tile boundaries so that the human viewer does not perceive the pattern. One disadvantage of a large threshold array is the correspondingly large amount of memory required to store it. Another disadvantage, as compared to a uniform array, is that the halftone result is less compressible with lossless data compression.
Threshold array and spot function techniques are members of a broader class of halftoning techniques in which the value of each pixel in the output frame is selected according to a reference gray level of the image over that position. The broad class of techniques may be described by the following functions:
reference gray level=function1(pixel position, image samples over or near pixel position)
pixel value=function2(pixel position, reference gray level).
With a traditional threshold array, the same threshold is used at a particular device position regardless of the image reference gray level at that position. Therefore, for a threshold array the second functional dependence may be specialized to:
pixel value=function3(reference gray level, function4(pixel position))
A variation is to choose the threshold array according to the reference gray level; this results in the threshold at each position being a function of both the reference gray level and the position. In this variation, there could be a different threshold at every device position for every gray level. For example, different dot patterns may be used for very light and very dark regions of halftone output. However, this technique is slower and requires more memory than the traditional technique.
Although fine dot patterns may look best under ideal conditions, a consideration in designing a threshold array is that grainier halftone dot patterns sometimes give more consistent appearance across changes in temperature, humidity, toner or ink level, and other factors. In addition, grainier patterns may photocopy better. In sum, all of these techniques involve experimentation, judgment, and personal preference.
Various other digital halftoning techniques have been developed over the years. For example, error diffusion and FM screening, collectively referred to as stochastic halftoning, produce somewhat similar results. In these techniques, a semi-random process is used to create a pattern of very small, seemingly randomly placed dots, while still closely controlling the average spacing of those dots and the relative density of black and white. A summary of error diffusion and threshold array based halftoning can be found in the book "Digital Halftoning" by Robert Ulichney. A stochastic halftoning technique can also be found in "A Markovian Framework for Digital Halftoning" by Robert Geist et al. in the ACM Transactions on Graphics, Volume 12, Number 2 (April 1993). This article explains halftoning based on random processes that generate results similar to error diffusion. Another stochastic halftoning technique is described in "Digital Halftoning Using a Blue Noise Mask" by Mitsa and Parker in the SPIE conference proceedings, San Jose, 1991, and in a companion paper "The Construction and Evaluation of Halftone Patterns With Manipulated Power Spectra" by Mitsa, Ulichney and Parker, in the conference proceedings for Raster Imaging and Digital Typography in 1992. These papers describe a method of constructing dot profiles which are free of annoying patterns and which reproduce a desired gray level. Each of the above references is incorporated herein by reference.
A threshold array implementation of a stochastic halftoning method is possible if each pixel produced by the halftoning process depends only upon the reference gray level at its position in the output frame. Error diffusion is not implementable with a threshold array because it diffuses the gray level at one position into the determination of the reference gray level for nearby pixels. The halftoning techniques that are functionally equivalent to threshold halftoning are usually faster because they require less computation per pixel.
For some output devices, an advantage of threshold-array-equivalent halftoning is that it permits independent halftoning of different images which are adjacent to each other in an output frame. No seams are created at the boundaries between the objects because the pixel value at each pixel position depends only upon the image reference gray level at that spot and not upon diffusion values from adjoining pixels. Similarly, it may be advantageous in an output device to use the same set of thresholds for halftoning other content elements next to images such as strokes or fills.
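The dependence on neighboring pixels that distinguishes error diffusion from threshold-array-equivalent halftoning can be seen in a simplified Python sketch. The one-dimensional error diffusion below, which pushes all of each pixel's quantization error onto the next pixel of the scan line, is an illustrative simplification and not one of the published methods cited above; the function names and the four-entry threshold row are likewise assumptions.

```python
def error_diffuse_row(grays):
    """Simplified 1-D error diffusion: carry each pixel's error to the next pixel."""
    bits, carry = [], 0
    for gray in grays:
        value = gray + carry
        bit = 1 if value >= 128 else 0      # 1 = black, matching gray 255 = black
        bits.append(bit)
        carry = value - 255 * bit           # error left over after quantization
    return bits

def threshold_row(grays, thresholds):
    """Threshold halftoning: each pixel depends only on its own gray and position."""
    return [1 if gray >= thresholds[x % len(thresholds)] else 0
            for x, gray in enumerate(grays)]

row_a = [64] * 8
row_b = [200] + [64] * 7                    # differs from row_a only at the first pixel
thresholds = [32, 160, 96, 224]

diffused_a, diffused_b = error_diffuse_row(row_a), error_diffuse_row(row_b)
screened_a, screened_b = threshold_row(row_a, thresholds), threshold_row(row_b, thresholds)
```

Changing only the first gray level alters later pixels in the error-diffused rows but leaves the thresholded rows unchanged after the first pixel, which is why adjacent objects can be thresholded independently without producing a seam.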
Some output devices, such as laser printers, cannot stop and wait for halftoning processing to finish after starting to move paper. Other devices, such as inkjet printers or image setters, may develop a streak across the page if ink dries when they are stopped or if there is a slight mechanical irregularity when stopping and restarting. Consequently, if the halftoning process is slower than the output device, the fully-halftoned image must be constructed so that it may be presented to the device fast enough to avoid stopping the device. This construction requires extra time and extra memory. Even if a device can stop without creating a visual defect, the time lost waiting for the halftoning process to finish is a performance degradation.
For a non-stop device, all pixels of the output frame may be stored in a frame buffer. Only after all the pixels are stored does the device start to deliver the output frame (e.g., start paper motion on a printer). During delivery of the output frame, pixels are fetched from the frame buffer and fed to the output device. On a 1200 pixels/inch × 1200 scan lines/inch × 1 bit/pixel device with 8.5-inch × 11-inch frame buffer, for example, the frame buffer size (assuming no white margins) is 16,830,000 bytes.
The term `rendering` refers to the process of constructing all of the pixel values in the output frame. For a device that fills the frame buffer before delivery begins, the total time to produce the output is therefore the time to fully render the output frame plus the time to deliver the output frame to the output device.
It may be possible to divide the construction of the output frame into two parts, termed `front-end interpretation` and `back-end rendering`, respectively. If back-end rendering can meet the real-time requirement of the output device, then delivery of the output frame can start as soon as front-end interpretation has finished, and back-end rendering can then occur during frame delivery. This is called `race rendering` (or, for a laser printer, `racing the laser`). In race rendering, total time to deliver the frame (e.g., to print the page) is the total time for front-end interpretation plus the time to deliver the output frame. An example of the race and non-race rendering is illustrated as follows:
______________________________________________________________
                                      Non-Race    Race-Render
______________________________________________________________
processing (front-end)                3.5 sec.    2.0 sec.
frame delivery                        2.0 sec.    2.0 sec.
(overlapping) processing (back-end)               1.5 sec.
total time                            5.5 sec.    4.0 sec.
______________________________________________________________
In this illustration, the total time to construct the page is 3.5 seconds in both cases. In the non-race example, the entire time is used for full halftone construction of the output frame. However, in the race-render example, 1.5 seconds of the back-end rendering is overlapped with frame delivery, which results in a faster overall time. Faster halftoning, by enabling overlap of computation with frame delivery, may enable a larger reduction in the total time to deliver a frame than that which results only from reduction of the halftoning time.
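The arithmetic behind the illustration can be written out as a small Python sketch. The function names are hypothetical, and the race case assumes that back-end rendering always keeps pace with the device, so that it is fully hidden behind frame delivery.

```python
def non_race_total(front_end, back_end, delivery):
    """All rendering finishes before frame delivery begins."""
    return front_end + back_end + delivery

def race_total(front_end, back_end, delivery):
    """Back-end rendering overlaps frame delivery (assumes it keeps pace)."""
    if back_end > delivery:
        raise ValueError("back-end rendering cannot keep up with the device")
    return front_end + delivery

non_race = non_race_total(front_end=2.0, back_end=1.5, delivery=2.0)   # 5.5 seconds
race = race_total(front_end=2.0, back_end=1.5, delivery=2.0)           # 4.0 seconds
```

The 1.5 seconds saved equals the back-end rendering time, since that work is done while the frame is being delivered.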
Unfortunately, back-end rendering in prior halftoning methods is too slow to meet the real-time requirements for most output devices, unless expensive hardware is used. For this reason, most devices using prior halftoning methods completely render an image before beginning to deliver the frame.
In addition to overlapping halftoning and frame delivery, it is advantageous to overlap the construction of the current frame with the frame delivery of the preceding frame, if possible. For example, if delivery of an output frame does not require all of the output device's computational power and memory, then the excess time and memory can be used to start construction of the next output frame.
Devices employing non-race halftoning also require memory for either the fully-halftoned representation of the image itself or the full output frame. However, if images are race-rendered, an output device does not require memory for the fully-halftoned representation of the image. Instead, the output device requires memory for the representation of the image from which back-end rendering starts and several buffers that are much smaller than the full frame. During the race, the fully-halftoned pixels for one region are put in a buffer. The device then delivers the pixels in the buffer to the output device as they are needed. Afterwards, the buffer may be reused on another region. This process continues using several buffers in rotation to construct and deliver the output frame.
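The rotation of small buffers described above might be organized as in the following Python sketch. It runs sequentially for simplicity, whereas a real device would render one band while delivering another; the function names, the callback parameters, and the band-of-scan-lines granularity are assumptions made for illustration.

```python
def race_render(frame_rows, row_bytes, band_rows, num_buffers, render_row, deliver):
    """Render the frame band by band into a small pool of reusable buffers.

    render_row(y) is assumed to return the halftoned bytes for scan line y, and
    deliver(data) is assumed to hand one band of bytes to the output device.
    Returns the total size of the buffer pool in bytes.
    """
    pool = [bytearray(band_rows * row_bytes) for _ in range(num_buffers)]
    for band, top in enumerate(range(0, frame_rows, band_rows)):
        buffer = pool[band % num_buffers]            # buffers are reused in rotation
        rows = min(band_rows, frame_rows - top)      # the last band may be short
        for r in range(rows):
            buffer[r * row_bytes:(r + 1) * row_bytes] = render_row(top + r)
        deliver(bytes(buffer[:rows * row_bytes]))
    return sum(len(buffer) for buffer in pool)

delivered = []
pool_bytes = race_render(
    frame_rows=10, row_bytes=4, band_rows=3, num_buffers=2,
    render_row=lambda y: bytes([y % 256]) * 4,
    deliver=delivered.append,
)
```

In this toy run the pool holds 24 bytes while the full frame is 40 bytes; on a real device the ratio would be far larger because the bands are much smaller than the frame.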
In some cases, the image will be much smaller than the fully-halftoned representation, and back-end rendering may be fast enough to start with it. For example, a 2,000 × 1,500 pixel × 8 bits/pixel monochrome image requires 3,000,000 bytes of memory (RAM). On a 1,200 pixels/inch × 1,200 scan lines/inch × 8.5 inch × 11 inch 1 bit/pixel output device (e.g., a 1,200 dots/inch Letter page laser printer), the full frame would be 16,830,000 bytes. In this example, the image itself is much smaller than the output frame and race-render reduces the memory required.
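The two sizes quoted in this example follow from simple arithmetic, shown here as a Python sketch. The function names are hypothetical, and the calculation assumes tightly packed data with no padding at the end of each scan line.

```python
def frame_buffer_bytes(pixels_per_inch, lines_per_inch, width_in, height_in, bits_per_pixel=1):
    """Size of a full output frame, assuming no margins and no scan-line padding."""
    pixels = round(pixels_per_inch * width_in) * round(lines_per_inch * height_in)
    return (pixels * bits_per_pixel + 7) // 8

def image_bytes(width, height, bits_per_sample=8):
    """Size of an unhalftoned image, assuming tightly packed samples."""
    return (width * height * bits_per_sample + 7) // 8

frame_size = frame_buffer_bytes(1200, 1200, 8.5, 11)   # 10,200 x 13,200 pixels at 1 bit each
image_size = image_bytes(2000, 1500)                   # 3,000,000 samples at 8 bits each
```

Under these assumptions the frame is more than five times the size of the image, even though each pixel needs only one bit.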
In other cases, the image will be much larger than the output frame. In this case, the image may be resampled during front-end interpretation to reduce the memory requirement. In the resampled image, only those samples needed during back-end rendering are retained. The representation of the image after both resampling and front-end interpretation could be either larger or smaller than the fully-halftoned image. If the resampled representation is smaller, then race rendering reduces the memory requirement.
Some devices employing halftoning compress either the fully-halftoned representation of the image or the full output frame (see, for example, U.S. Pat. No. 5,602,976, incorporated herein by reference). State of the art lossless compression methods (e.g., the CCITT group 4 method used with facsimile devices, the JBIG method, and the Lempel Ziv Welch (LZW) method) do not compress fully-halftoned frame data particularly well, though they achieve substantial compression on text and line art. Although the CCITT group 4 method achieves satisfactory compression on text and line art, it typically expands halftone data rather than compressing it. The LZW method can achieve about 2.5-to-1 compression and the JBIG method can achieve about 4-to-1 compression if the halftoning method creates uniformly spaced halftoned dots. However, if the halftoning method creates irregularly spaced halftoned dots the JBIG method may not achieve even 2-to-1 compression. Lossy methods can effectively compress the fully-halftoned representation, but their modifications reduce the quality of the image. Since not much lossless compression is possible on the fully-halftoned representation of an image, devices applying compression to conventional halftoning are likely to have a high memory requirement or to use lossy compression. When compression is used, the time for compression typically increases the time to deliver a frame. The decompression might also add to the time to deliver the frame, or it might be overlapped with frame delivery.
Some devices construct the fully-halftoned image representation on another device (e.g., a computer such as a personal computer (PC)) and communicate that representation either directly, or after applying data compression to it, to the output device. In this case, the output device itself does not need memory for the fully-halftoned output frame, but the communication channel must transmit more data than would be required with a smaller representation. In addition, the device which constructs the fully-halftoned representation may be slower or require more memory than would be the case for race rendering. Furthermore, real-time requirements of the output device are imposed not only on back-end rendering but also on the communication channel. Compression may ease the real-time requirement imposed by the communication channel, but it increases the computation requirement on each end.
Because conventional halftoning is slow, race rendering of images with frame delivery and overlapping computation is impractical on many devices. Furthermore, if the fully-halftoned representation is constructed on a PC or workstation with plentiful memory, the total time to construct this representation, possibly compress the representation, transmit the representation or compressed representation over a communication channel to the output device, and expand the compressed representation, slows the rate at which the output device can deliver frames.
It is useful to describe conventional threshold halftoning for a bi-level output device. As illustrated in FIG. 1, a conventional halftoning method 10 begins by copying the image samples into memory (step 12). This step may include color conversions, clipping and other preprocessing that is not central to the halftoning process itself.
The method iterates over all pixels underneath the image, i.e., the area of the output frame into which the halftoned result is to be put (step 14). On each iteration, the reference gray level for the pixel is determined by a technique such as point sampling, linear interpolation, or bi-cubic interpolation (step 16). The threshold array element that corresponds to the pixel is fetched (step 17) and these two values are compared (step 18). If the reference gray level is less than the threshold, then the pixel is assigned a value of 0 (white), otherwise it is assigned a value of 1 (black) (step 19). For the next iteration, the pixel position is incremented and the image position corresponding to the new pixel position is computed (step 20).
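Steps 14 through 20 can be sketched in Python as follows. The sketch assumes that the image covers the whole output region, that point sampling is used for step 16, and that the threshold array is tiled from the origin of the region; the function name and data layout are illustrative only.

```python
def halftone_image(samples, out_width, out_height, thresholds):
    """Conventional threshold halftoning of an image scaled to an output region."""
    img_h, img_w = len(samples), len(samples[0])
    t_h, t_w = len(thresholds), len(thresholds[0])
    frame = [[0] * out_width for _ in range(out_height)]
    for y in range(out_height):                          # step 14: visit every pixel
        sy = y * img_h // out_height                     # step 20: image row for this pixel
        for x in range(out_width):
            sx = x * img_w // out_width                  # step 20: image column
            gray = samples[sy][sx]                       # step 16: point sampling
            threshold = thresholds[y % t_h][x % t_w]     # step 17: tiled threshold fetch
            frame[y][x] = 0 if gray < threshold else 1   # steps 18-19: compare and assign
    return frame

frame = halftone_image([[0, 255]], 4, 2, [[64, 192], [128, 255]])
```

The loop body runs once per output pixel rather than once per image sample, which is why the cost of these steps grows with device resolution.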
When the image and the output frame are both rasters, one way to arrange the iteration is as an outer loop over scan lines in the output frame that are covered by the image, a middle loop over computer words in a scan line that are covered by the image, and an inner loop over pixels in a word. For a 32-bit word size, the inner loop then runs 32 times while filling a register with the black (1) or white (0) pixel results. On exiting from the inner loop, the middle loop copies this register into the output frame. This organization reduces the number of memory references during halftoning.
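The middle and inner loops of this arrangement might look like the following Python sketch for a single scan line. The most-significant-bit-first packing order, the function name, and the use of a one-dimensional threshold row are assumptions; an actual device may order bits differently.

```python
def halftone_row_packed(grays, threshold_row, word_bits=32):
    """Halftone one scan line, accumulating pixels in a register before storing."""
    words = []
    for start in range(0, len(grays), word_bits):        # middle loop: one word at a time
        register = 0
        for bit in range(word_bits):                     # inner loop: one pixel at a time
            x = start + bit
            register <<= 1
            if x < len(grays) and grays[x] >= threshold_row[x % len(threshold_row)]:
                register |= 1                            # black pixel
        words.append(register)                           # one store per word, not per pixel
    return words

packed = halftone_row_packed([255] * 32 + [0] * 32, [128])
```

Only one write to the output frame occurs for every 32 pixels, which is the reduction in memory references noted above.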
During step 12 the number of samples handled is the width multiplied by the height of the image. During steps 14-20, the number of iterations is equal to the area of the output frame underneath the image as measured in pixels. For high resolution output devices, the preponderance of the time is usually in steps 14-20.