For years, machines have been used to scan parcels as they move along a conveyor. Over-the-belt optical character recognition (OCR) readers have been recently developed that can capture an image of the surface of a parcel as it moves along a conveyor, and then create and process a representation of the image. The fundamental physical components of an OCR reader are a sensor, an analog-to-digital (A/D) converter, and a computer comprising a memory. The individual physical components of an OCR reader are all well known in the art, and many alternative embodiments of each of the individual physical components are commercially available, with differing cost and performance characteristics. Much effort goes into finding the most efficient combinations of components for particular applications, and in the development of computer software programs that process the images created by these familiar physical components.
Charge-coupled device (CCD) sensor arrays are often used in OCR readers. A CCD camera consists of an array of electronic "pixels," each of which stores an accumulated charge according to the amount of light that strikes the pixel. A CCD camera is used to quickly capture an image of the surface of a parcel as it moves along a conveyor. The image is converted into digital format which may be stored as a bit map in a computer memory. The CCD array is then reset by dissipating the charge within the pixels, and the array is ready to capture the image of another parcel or section of a parcel. In this manner, a single CCD camera is used to scan a great many parcels.
Computers that may be used to process the images captured by CCD cameras vary in computation speed and other parameters. Generally, a faster computer is more expensive than a slower computer, a computer with a large memory capacity is more expensive than a computer with a smaller memory capacity, and a special purpose computer is more expensive than a general purpose computer. There is therefore a financial motivation to use low speed, low memory, general purpose computers whenever such are suitable for a particular purpose.
Parcel delivery companies, such as United Parcel Service (UPS), could make extensive use of OCR reader systems. UPS ships millions of parcels every day. If OCR reader systems were used by parcel delivery companies such as UPS they would generate an enormous amount of computer data. As a result, there is a need for computer systems that can quickly and accurately process the images created by CCD cameras. For example, computer systems have been developed that can attempt to read the destination address written on certain parcels, so that the parcels may be correctly routed to their destinations when the address is successfully read. Reading text is a sophisticated task, and the systems capable of doing so are commensurately sophisticated and may comprise expensive equipment such as high speed, high memory, or special purpose computers.
To the extent that less expensive equipment can perform less sophisticated tasks in an OCR system, more expensive equipment can be dedicated to reading text. Rotating a text image is an example of a function required of an OCR reader system that can be performed with less sophisticated equipment than that which is required to read text. There is therefore a financial motivation to rotate a text image using a general purpose computer. Similarly, there is a financial motivation to store text image data in a compressed format that reduces the memory required to store the image, to rotate the compressed image, and to store the rotated image in a compressed format.
There are a number of well known image processing techniques that are used to capture and store in a computer memory an image of a parcel as it is conveyed by an OCR system. For example, a two-dimensional "pixelized image" or bit map matrix representing an image of the surface of the parcel may be captured and stored in the computer memory using a CCD line camera that includes a single line array of electronic pixels or a plurality of lines operated sequentially. For example, a two array CCD line camera may operate two lines sequentially, with one line capturing an image while the other is being discharged and reset to capture another image.
A column of the bit map may be built by capturing a binary representation of the pixels of a CCD line array that is exposed as the parcel passes under the camera. A conventional analog-to-digital (A/D) converter and memory buffer can then be used to "shift" the pixelized column captured by the CCD line camera into the bit map. Subsequent columns of the image can be captured and stored by sequentially capturing a column of the image with the CCD camera, storing the column, and resetting the CCD camera.
An orthogonal coordinate system usually forms the basis for the bit map matrix. Thus, the bit map uniquely identifies the position of each pixel of the CCD array for each column captured. It is noted that a multiple line CCD camera or a two-dimensional array CCD camera may also be used to create bit map images. It is also noted that three-dimensional or higher-dimensional bit maps may similarly represent three-dimensional or higher-dimensional images in a computer memory, and that polar or other coordinate systems may similarly define positions within a bit map matrix. It is also noted that by storing values rather than bits, and/or by storing a series of bit maps for an image, an image including shading or color can be captured and stored in pixelized format.
In continuous space, the coordinates of a rotated image are defined by the following standard rotation equations, where $\phi$ is the rotation angle, $\{x, y\}$ is the coordinate of an input pixel in the input plane, and $\{x', y'\}$ is the coordinate of that pixel as mapped to the output plane:
1. $x' = x \cdot \cos(\phi) + y \cdot \sin(\phi)$
2. $y' = -x \cdot \sin(\phi) + y \cdot \cos(\phi)$
A straightforward way to rotate an image is to map each input pixel to an output pixel using the above rotation equations. Mapping input pixels into an output plane using the above rotation equations is known as forward pixel mapping.
Since the above rotation equations involve floating point operations (e.g., multiplication by a non-integer), the coordinates of output pixels are generally not integers. Floating point operations applied to a digital or pixelized image can cause rounding errors. Moreover, forward pixel mapping does not guarantee that each pixel in the output image will have a counterpart in the input image. A hole may be created in the output image at the location of an output pixel that has no corresponding input pixel mapped to it. Therefore, forward pixel mapping can cause "artifacts" such as holes to appear in the output image. For example, a continuous line of black pixels, such as a stroke in a text image, may appear to have missing pixels or holes after rotation. Artifacts can be corrected to some extent through post rotation filtering. However, post rotation correction can be computationally expensive and may produce unsatisfactory results.
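The hole problem described above can be reproduced with a short sketch. The function name `forward_rotate` and the choice to represent an image as a set of integer (x, y) foreground coordinates are assumptions made for illustration only. The mapping used is the standard rotation x' = x cos(phi) + y sin(phi), y' = -x sin(phi) + y cos(phi), with each result rounded to the nearest grid position.

```python
import math

def forward_rotate(pixels, phi_degrees):
    """Forward-map foreground pixels through the rotation equations.

    pixels: set of integer (x, y) coordinates (a hypothetical image format
    chosen for this sketch). Returns the set of rounded output coordinates.
    """
    phi = math.radians(phi_degrees)
    c, s = math.cos(phi), math.sin(phi)
    out = set()
    for x, y in pixels:
        # Floating point results are rounded onto the output pixel grid.
        out.add((round(x * c + y * s), round(-x * s + y * c)))
    return out

# A solid 10 x 10 block of foreground pixels, rotated by 45 degrees.
block = {(x, y) for x in range(10) for y in range(10)}
rotated = forward_rotate(block, 45)

# Several input pixels can round onto the same output pixel, so fewer
# distinct output pixels are produced than input pixels were supplied.
print(len(block), len(rotated))
```

In this example the output position (2, 0) receives no input pixel even though its four neighbors do, which is the kind of hole that a stroke of text would show after forward mapping.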
Reverse pixel mapping is a method for rotating pixelized images without creating artifacts. Reverse pixel mapping involves scanning through the output image, and finding a unique pixel in the input image to map to each pixel in the output image. As a result, an image rotated using reverse pixel mapping does not include artifacts, because every pixel in the output image corresponds to a pixel in the input image.
Reverse pixel mapping techniques suffer from certain disadvantages. For example, for pixelized images in which the pixels are laid out in an orthogonal grid, reverse pixel mapping can distort the size and/or aspect ratio of the rotated image. In addition, reverse pixel mapping is computation-intensive because it requires that all of the pixels in the output image, foreground and background, be mapped from the input image.
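For comparison, reverse pixel mapping can be sketched as follows. This is a minimal nearest-neighbor illustration, not the method of any reference discussed here. The function name `reverse_rotate`, the list-of-rows bit map format, the rotation about the image center, and the equal input and output sizes are all assumptions of the sketch.

```python
import math

def reverse_rotate(image, phi_degrees):
    """Rotate a bit map by scanning the output and sampling the input.

    image: list of rows of 0/1 values (hypothetical format for this sketch).
    The output has the same dimensions as the input; output pixels whose
    source falls outside the input are left as background (0).
    """
    h, w = len(image), len(image[0])
    phi = math.radians(phi_degrees)
    c, s = math.cos(phi), math.sin(phi)
    cx, cy = (w - 1) / 2, (h - 1) / 2  # assumed center of rotation
    out = [[0] * w for _ in range(h)]
    for yo in range(h):          # every output pixel is visited,
        for xo in range(w):      # foreground and background alike
            dx, dy = xo - cx, yo - cy
            # Inverse of x' = x cos + y sin, y' = -x sin + y cos.
            xi = round(dx * c - dy * s + cx)
            yi = round(dx * s + dy * c + cy)
            if 0 <= xi < w and 0 <= yi < h:
                out[yo][xo] = image[yi][xi]
    return out
```

The doubly nested loop makes the cost noted above visible: the work is proportional to the total number of output pixels, regardless of how few of them are foreground.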
Pixelized bit map matrix representations of images are memory intensive because a bit or value must be used to represent every pixel. Many images, especially black and white text images, include large regions of similar pixels. For example, a text image generally includes a relatively small foreground image containing the text against a uniform background. Therefore, it is advantageous to process only the foreground pixels of such images.
It is conventional and memory efficient to compress a pixelized image in "run-length encoded" format. Run-length encoding a pixelized image may be accomplished by expressing each row of the image in terms of one or more runs. A run is a series of similar adjacent pixels. A run can be represented in a run-length encoded table by elements indicating either the starting and ending points of the run or its starting point and run length; the intermediate points of a run need not be explicitly represented in the table. For a text image, all of the information in the image is usually included in the foreground image. Thus, only the foreground pixels of a text image need to be represented in the run-length encoded table.
A pixelized text image including lines of pixels may be compressed and stored in a run-length encoded table that includes rows of elements, wherein each row of the table corresponds to a line of the image. Only the lines of the image that include one or more foreground pixels need to be represented by a row in the table. The elements of each row of the table define one or more runs within the corresponding line, usually by identifying the starting and ending pixels of the run. In this manner, a significant memory savings may be realized by storing a pixelized text image in run-length encoded format.
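A run-length encoded table of this kind can be illustrated with a small sketch. The function name `rle_encode` and the dictionary layout (line index mapped to a list of (start, end) pairs) are assumptions made for illustration; the text above does not prescribe a particular table layout.

```python
def rle_encode(bitmap):
    """Encode the foreground (1) pixels of a bit map as runs.

    bitmap: list of lines, each a list of 0/1 values.
    Returns {line_index: [(start, end), ...]}; lines with no foreground
    pixels get no entry in the table.
    """
    table = {}
    for r, line in enumerate(bitmap):
        runs, start = [], None
        for x, v in enumerate(line):
            if v and start is None:
                start = x                      # a run begins
            elif not v and start is not None:
                runs.append((start, x - 1))    # the run ended at x - 1
                start = None
        if start is not None:                  # run reaches the line's end
            runs.append((start, len(line) - 1))
        if runs:
            table[r] = runs
    return table

bitmap = [
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
print(rle_encode(bitmap))  # {0: [(1, 2), (4, 4)], 2: [(0, 4)]}
```

The all-background middle line contributes nothing to the table, and the five-pixel bottom line is stored as a single pair, which is where the memory savings come from.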
Moreover, a pixelized image of an object moving past a CCD line camera, as may be the case in an OCR system, can be translated into run-length encoded format for storage in the computer memory without having to create or store a full bit map representation of the image. This is possible because the OCR system creates the pixelized image one column at a time, moving from one edge of the image to the other. This "sweeping" of the image, which may be thought of as going from left to right across the image, allows a run-length encoded image including runs that go from left to right to be created as the columns of the image are shifted out of the memory buffer. Thus, the image is converted into run-length encoded format for storage in the computer memory as the image is captured by the CCD line camera.
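The column-at-a-time encoding described above can be sketched as a small streaming encoder. The class name `StreamingRunLengthEncoder`, its method names, and the (start, end) table layout are hypothetical and chosen for illustration. The sketch keeps only one open run per line rather than a full bit map.

```python
class StreamingRunLengthEncoder:
    """Builds left-to-right runs as columns arrive (illustrative sketch)."""

    def __init__(self, num_lines):
        self.open_start = [None] * num_lines    # start column of any open run
        self.runs = [[] for _ in range(num_lines)]
        self.col = 0                            # index of the next column

    def push_column(self, column):
        """column: one 0/1 value per line, as shifted out of the buffer."""
        for r, v in enumerate(column):
            if v and self.open_start[r] is None:
                self.open_start[r] = self.col   # foreground begins here
            elif not v and self.open_start[r] is not None:
                self.runs[r].append((self.open_start[r], self.col - 1))
                self.open_start[r] = None
        self.col += 1

    def finish(self):
        """Close runs that reach the right edge and return the table."""
        for r, start in enumerate(self.open_start):
            if start is not None:
                self.runs[r].append((start, self.col - 1))
                self.open_start[r] = None
        return {r: runs for r, runs in enumerate(self.runs) if runs}

encoder = StreamingRunLengthEncoder(num_lines=3)
for column in ([0, 0, 1], [1, 0, 1], [1, 0, 1], [0, 0, 1], [1, 0, 1]):
    encoder.push_column(column)
print(encoder.finish())  # {0: [(1, 2), (4, 4)], 2: [(0, 4)]}
```

The working state is one integer per line plus the runs found so far, so memory use grows with the number of runs rather than with the number of pixels.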
It is noted that for convenience, images are described herein as being created and/or encoded "from left to right", and reference is made to the "left edge" of the input and/or output planes. However, it will be appreciated that any direction across an image or edge could be equivalently used by the inventive method.
With the above described system, images captured in the OCR system are initially stored in run-length encoded format. Recall that an image captured in an OCR system typically must be rotated before the image is sent to the text reader. Therefore, there is a need to rotate a run-length encoded image. It would be advantageous to rotate a run-length encoded input image into a run-length encoded output image without having to expand the image into a full pixelized representation. It would also be advantageous to rotate a run-length encoded input image without creating artifacts in the output image. It would also be advantageous to avoid distortion, i.e., to produce a rotated output image with the correct aspect ratio.
Cahill, III et al., U.S. Pat. No. 4,792,891, describes a method that translates a run-length encoded image into a run-length encoded representation of a rotated version of the image. The method described involves "(1) establishing each scan line of the input image as a series of 'visible' and 'invisible' vectors by comparing run-lengths in a current scan line with run-lengths for a previous scan line, (2) determining color transition information for a manipulated or transformed output image by means of transform coefficients and storing this transition information in memory bins to characterize each of the output scan lines, and (3) sorting the bins from one end of the output scan to the other and constructing a new run-length encoded image from the sorted run-length encoded information." Cahill, III et al. at column 1, lines 51-61. Visible vectors define the edges of the foreground of the input image, while invisible vectors define the non-edge foreground pixels. See Cahill, III et al. at column 2, lines 43-46.
The visible vectors, those defining the foreground edge pixels of the input image, are manipulated by steps such as multiplication by coefficients which serve to size, slant, rotate, or otherwise transform the various vectors to achieve a different set of X and Y coordinates in the plane of the output pixel grid. Cahill, III et al. at column 4, lines 3-7. Once the "visible" vectors have been translated, which is equivalent to mapping the foreground edge pixels of the input image into the output plane, sorting the elements of the rows of the output image allows the output image to be encoded in run-length format. Thus, Cahill, III et al. describes a method for rotating a run-length encoded image in which only the foreground pixels of the input image need to be processed.
The method described by Cahill, III et al. suffers from a number of drawbacks. First, it relies on floating point operations (i.e., multiplication by coefficients) to map data (i.e., visible vectors). As discussed previously, floating point operations can be computationally expensive and can cause artifacts to be created in the output image. The system may also map foreground pixels from an orthogonal input plane into an orthogonal output plane without regard to reduction or expansion of runs of pixels. Therefore, the system may distort the output image, i.e., it may not produce an output image with the correct aspect ratio. The system also requires one or more steps to create "visible" and "invisible" vectors, and a sorting step after foreground edge pixels have been mapped, to allow the output image to be encoded in run-length format.
Hideaki, U.S. Pat. No. 4,985,848, describes a method for rotating a pixelized image using a look-up table. Values are computed in advance and stored in a look-up table for $\cos(x)$, $2 \cdot \cos(x)$, ..., $max \cdot \cos(x)$ and for $\sin(x)$, $2 \cdot \sin(x)$, ..., $max \cdot \sin(x)$, for each angle $x$ for which a rotation may subsequently be performed. Thus, subsequent pixel-to-pixel mappings may be performed using only simple addition and subtraction operations and without any multiplication. Hideaki also describes a method for correcting for artifacts in the output image. Generally described, the method corrects for artifacts in the output image by (1) transforming a line of pixels a first time, (2) identifying pixels within the line that are shifted by a small adjustment in the angle of rotation, and (3) transforming the line a second time from a temporary reference point that is slightly offset from the reference point used for the first transformation. Hideaki also describes an interpolation method whereby an output pixel location is determined by mapping and interpolating four points.
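The general idea of a precomputed table of multiples can be sketched as follows. This is an illustration of table-based mapping in general and is not taken from Hideaki; the function names `build_tables` and `map_pixel`, the restriction to non-negative integer coordinates, and the rounding step are assumptions of the sketch.

```python
import math

def build_tables(phi_degrees, max_coord):
    """Precompute k*cos(phi) and k*sin(phi) for k = 0..max_coord.

    All multiplications happen here, once per rotation angle.
    """
    phi = math.radians(phi_degrees)
    cos_t = [k * math.cos(phi) for k in range(max_coord + 1)]
    sin_t = [k * math.sin(phi) for k in range(max_coord + 1)]
    return cos_t, sin_t

def map_pixel(x, y, cos_t, sin_t):
    """Map (x, y) using table look-ups, one addition, and one subtraction.

    Assumes 0 <= x, y <= max_coord (hypothetical restriction).
    """
    x_out = cos_t[x] + sin_t[y]     # x' = x cos(phi) + y sin(phi)
    y_out = cos_t[y] - sin_t[x]     # y' = -x sin(phi) + y cos(phi)
    return round(x_out), round(y_out)

cos_t, sin_t = build_tables(30, max_coord=100)
print(map_pixel(40, 25, cos_t, sin_t))
```

The per-pixel work avoids multiplication, but the results are still rounded onto the output grid, so a table by itself does nothing to prevent holes in a forward-mapped image.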
The method described by Hideaki suffers from a number of disadvantages. First, the method does not provide for rotating a run-length encoded image. Second, although the method avoids computationally expensive floating point multiplication steps by using a look-up table, mapping pixels using the look-up table still produces artifacts in the output image. Third, the method relies on computationally expensive post rotation computations, such as twice mapping pixels and interpolating, to correct for artifacts in the output image.
Baldwin et al., U.S. Pat. No. 4,827,413, describes a method for displaying a two dimensional image of a three dimensional object. Baldwin et al. describes rotating a two dimensional image by first run-length encoding the image, and then transforming the encoded image into a different rotational aspect. Transformation is accomplished by multiplying the vectors representing runs comprising the input image by transformation matrices. To avoid the creation of artifacts in the output image, a vector is plotted twice, with the second vector having a start address at a small offset from the first vector. Edge pixels are first transformed into the output plane so that only foreground pixels need be mapped.
Baldwin et al. suffers from a number of significant drawbacks. It relies on floating point operations to rotate vectors, and it avoids the creation of artifacts in the output image only through the computationally expensive step of plotting each vector twice. In addition, Baldwin et al. does not describe a method for directly creating an output image in run-length encoded format.
Therefore, there remains after Cahill, III et al., Hideaki, and Baldwin et al. a need for a more computationally efficient method for rotating a run-length encoded image. In addition, there remains a need for a method for rotating a run-length encoded input image to produce a run-length encoded output image without creating artifacts in the output image, without relying on floating point operations, without having to create a full pixelized representation of the input or output image, and whereby an output image is produced with the correct aspect ratio.
Thus, there is a great need for an improved method and system for rotating a run-length encoded image that can be used in conjunction with over-the-belt or other OCR readers. In particular, there is a great need for a rotation system that can quickly process a sufficiently large number of images so that the system can be used as an integral part of an automatic parcel handling system used in the parcel delivery industry.
It would be advantageous for such a system to embody a number of important advantages including: (1) the use of low cost components such as a monochrome CCD line camera and a general purpose computer; (2) the ability to rotate an image without relying on floating point operations; (3) the ability to rotate an image by processing only the foreground pixels; (4) the ability to avoid the creation of artifacts in the output image; (5) the ability to rotate a run-length encoded input image to obtain a run-length encoded output image without having to create or store a full pixelized representation of the input or output images; (6) the ability to rotate a run-length encoded input image to obtain a run-length encoded output image with the correct aspect ratio; and (7) the ability to create a run-length encoded output image without having to sort pixels after they have been mapped to the output image.