The disclosure of this patent document contains material that is subject to copyright protection. The owner thereof has no objection to facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the U.S. Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The invention pertains to image processing and, more particularly, to methods and apparatus for generating a projection of an image.
Image processing refers to the automated analysis of images to determine the characteristics of features shown in them. It is used, for example, in automated manufacturing lines, where images of parts are analyzed to determine placement and alignment prior to assembly. It is also used in quality assurance, where images of packages are analyzed to ensure that product labels, lot numbers, "freshness" dates, and the like, are properly positioned and legible.
Image processing has non-industrial applications, as well. In biotechnology research, it can be used to identify constituents of microscopically imaged samples, or growth patterns in culture media. On the macroscopic scale, it can be used in astronomical research to find objects in time-lapse images. Meteorologic, agricultural and defense applications of image processing include the detection and analysis of objects in satellite images.
In applications where speed counts, image processing systems typically do not analyze every feature in an image but, rather, infer most from analysis of a significant few.
Thus, for example, the position and orientation of a semiconductor chip can usually be inferred from the location and angle of one of its edges. Likewise, the center of an object can often be estimated from the positions of its edges.
One image processing technique for discerning salient features in an image is projection. Though a similarly named technique is commonly used in the visual arts to add depth to paintings and drawings, the image processing counterpart is used to reduce complexity and, thereby, to reduce the computational resources necessary to analyze an image. More particularly, image processing projections reduce the dimensions of an image while maintaining relative spacing among features along axes of interest.
In machine vision, the most common form of projection involves reducing an image from two dimensions to one. A projection taken along the x-axis, for example, compresses all features in the y-dimension and, therefore, facilitates finding their positions along the x-axis. Thus, it can be used to find rapidly the width of an imaged object. A projection along the y-axis, on the other hand, speeds finding the height or vertical extent of an object.
These types of projections are typically made by summing the intensities of pixels at each point along the respective axis. For example, the projection of an image along the x-axis is made by summing the intensities of all pixels whose x-axis coordinate is zero; then summing those whose x-axis coordinate is one; those whose x-axis coordinate is two; and so forth. From those sums can be inferred the x-axis locations of significant features, such as edges.
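The summation just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `project_along_x` and `project_along_y` are hypothetical, and the image is assumed to be a list of equal-length rows of integer intensities.

```python
def project_along_x(image):
    """Return one sum per x coordinate (i.e., per column) of the image.

    `image` is assumed to be a list of rows, indexed image[y][x].
    """
    if not image:
        return []
    sums = [0] * len(image[0])
    for row in image:
        for x, intensity in enumerate(row):
            sums[x] += intensity  # accumulate every pixel sharing this x
    return sums


def project_along_y(image):
    """Return one sum per y coordinate (i.e., per row) of the image."""
    return [sum(row) for row in image]


# A bright object occupying columns 2-3 of the first two rows.
example = [
    [0, 0, 9, 9, 0],
    [0, 0, 9, 9, 0],
    [0, 0, 0, 0, 0],
]
x_projection = project_along_x(example)  # [0, 0, 18, 18, 0]
y_projection = project_along_y(example)  # [18, 18, 0]
```

In this example the jumps in `x_projection` between columns 1 and 2, and between columns 3 and 4, mark the left and right edges of the object.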
In many instances, it is desirable to project images along arbitrary axes, not merely the pixel grid defined by the x- and y-axes. For example, if a semiconductor chip is not aligned with the x- and y-axes of the video camera which images it, the locations of the chip leads can best be determined by generating a projection along an axis of the chip itself, not the axis of the camera.
The prior art offers two principal solutions: one that operates quickly, but with low accuracy; the other, that operates more slowly, but with greater accuracy. The former involves summing the intensities of pixels falling between parallel lines normal to the angle of the projection and spaced in accord with the projection bin width. The sum of intensities formed between each pair of neighboring lines is stored in a corresponding projection bin. Though such techniques can be employed in tools that operate sufficiently quickly to permit their use in real time, their accuracy is typically too low for many applications.
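The faster of the two prior art approaches might be sketched as follows. This is a hedged illustration, not a reconstruction of any particular tool: the function name `binned_projection`, the angle convention (the projection axis makes `angle_deg` degrees with the x-axis), and the choice to assign each pixel wholly to one bin are all assumptions.

```python
import math


def binned_projection(image, angle_deg, bin_width=1.0):
    """Sum intensities of pixels lying between parallel lines normal to
    the projection axis. Returns a dict mapping bin index -> sum.

    `image` is assumed to be a list of rows, indexed image[y][x].
    """
    theta = math.radians(angle_deg)
    axis_x, axis_y = math.cos(theta), math.sin(theta)
    bins = {}
    for y, row in enumerate(image):
        for x, intensity in enumerate(row):
            # Position of the pixel along the projection axis.
            position = x * axis_x + y * axis_y
            index = math.floor(position / bin_width)
            bins[index] = bins.get(index, 0) + intensity
    return bins


sample = [
    [1, 2, 3],
    [4, 5, 6],
]
aligned = binned_projection(sample, 0)    # same as summing each column
tilted = binned_projection(sample, 30)    # coarse bins along a tilted axis
```

Because each pixel is dropped whole into a single bin, with no interpolation, a sketch of this kind illustrates why such techniques are fast but of limited accuracy.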
The other prior art solution is to transform the image prior to taking its projection. One common transformation tool used for this purpose is referred to as affine transformation, which resizes, translates, rotates, skews and otherwise transforms an image. Conventional affine transformation techniques are typically slow and too computationally intensive for use in real-time applications. Specifically, the prior art suggests that affine transforms can be accomplished by mapping a source image into a destination image in a single pass. For every pixel location in the destination image, a corresponding location in the source image is identified. In a simplistic example, every pixel coordinate position in the destination image maps directly to an existing pixel in the source. Thus, for example, the pixel at coordinate (4,10) in the source maps to coordinate (2,5) in the destination; the pixel at (6,10) in the source, to (3,5) in the destination; and so on.
However, rarely do pixels in the source image map directly to pixel positions in the destination image. Thus, for example, a pixel at coordinate (4,10) in the source may map to a location (2.5, 5.33) in the destination. This can be problematic insofar as it requires interpolation to determine appropriate pixel intensities for the mapped coordinates. In the example, an appropriate intensity might be determined as a weighted average of the intensities for the source pixel locations (2,5), (3,5), (2,6), and (3,6).
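A weighted average of this kind is commonly computed by bilinear interpolation. The sketch below is one conventional formulation, offered under stated assumptions: the function name `bilinear` is hypothetical, the image is a list of rows indexed `image[y][x]`, and the four neighboring pixels are assumed to lie inside the image.

```python
import math


def bilinear(image, x, y):
    """Interpolate an intensity at a non-integer location (x, y).

    Assumes floor(x)+1 and floor(y)+1 are valid indices into `image`.
    """
    x0, y0 = math.floor(x), math.floor(y)
    x1, y1 = x0 + 1, y0 + 1
    fx, fy = x - x0, y - y0  # fractional parts act as weights
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


tiny = [
    [0, 10],
    [20, 30],
]
center = bilinear(tiny, 0.5, 0.5)  # equal weight on all four pixels
```

For a location such as (2.5, 5.33), `x0, y0` would be (2, 5) and the four pixels read are (2,5), (3,5), (2,6) and (3,6), matching the neighbors named above. Note that four pixel reads and three blends are needed for every interpolated point.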
The interpolation of thousands of such points consumes both time and processor resources. Conventional affine transform tools must typically examine at least four points in the source image to generate each point in the destination image. This is compounded for higher-order transformations, which can require examination of many more points for each interpolation.
Although prior art has suggested the use of multiple passes (i.e., so-called separable techniques) in performing specific transformations, such as rotation, no suggestion is made as to how this might be applied to general affine transforms, e.g., involving simultaneous rotation, scaling, and skew.
An object of this invention is to provide improved systems for image processing and, more particularly, improved methods and apparatus for image projection.
A more particular object is to provide such methods and apparatus as facilitate generating projections of images (or objects) that have been rotated, skewed, scaled, sheared, transposed or otherwise transformed.
Another object of the invention is to provide such methods and apparatus as permit rapid analysis of images, without undue consumption of resources.
Still another object of the invention is to provide such methods and apparatus as are readily adapted to implementation on conventional digital data processing apparatus, e.g., those equipped with commercially available superscalar processors, such as the Intel Pentium MMX or Texas Instruments C80 microprocessors.
The foregoing objects are among those attained by the invention, which provides methods and apparatus for generating projections while concurrently rotating, scaling, translating, skewing, shearing, or subjecting the image to other affine transforms. In an exemplary aspect, the invention provides methods for generating a projection of an image by generating an "intermediate" image via a one-dimensional affine transformation of the source along a first axis, e.g., the y-axis. The intermediate image is subjected to a second one-dimensional affine transformation along a second axis, e.g., the x-axis. The resultant image is then projected along a selected one of these axes.
According to related aspects of the invention, there are provided methods as described above in which the first one-dimensional transformation determines a mapping between coordinates in the intermediate image and those in the source image. Preferably, the coordinates in the intermediate image lie at integer coordinate positions, e.g., coordinate positions such as at (1, 1), (1, 2), and so forth. Though the mapped locations in the source image do not necessarily lie at integer coordinate positions, they advantageously include at least one integer coordinate, e.g., coordinate positions such as (1, 1.5), (2, 4.25), (3, 3.75), and so forth.
Once the mappings of the first one-dimensional transformation are determined (or after each one has been determined), the method determines an intensity value for the pixel at each coordinate in the intermediate image. This is done by interpolating among the intensities of the pixels in the region neighboring the corresponding or mapped coordinate in the source image. Because the coordinate locations in the intermediate image are generated in sequences along a first axis, and because the mapped locations have at least one integer coordinate, interpolations are greatly simplified.
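To illustrate why an integer coordinate simplifies matters, the sketch below interpolates along a single axis only. It is an assumption-laden illustration rather than the claimed method: the function name `interpolate_along_y` is hypothetical, linear interpolation is chosen for simplicity, and the image is a list of rows indexed `image[y][x]`.

```python
import math


def interpolate_along_y(image, x, y):
    """Interpolate at (x, y) where x is an integer and y may be fractional.

    Only the two pixels directly above and below the location are read.
    """
    y0 = math.floor(y)
    fy = y - y0
    if fy == 0:
        return float(image[y0][x])  # location falls exactly on a pixel
    return (1 - fy) * image[y0][x] + fy * image[y0 + 1][x]


column_image = [
    [0, 0],
    [0, 10],
    [0, 30],
]
value = interpolate_along_y(column_image, 1, 1.5)  # midway between 10 and 30
```

A mapped location such as (1, 1.5) thus needs two pixel reads and one blend, instead of the four reads and three blends of a two-dimensional interpolation.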
With the second one-dimensional transformation, the method similarly determines a mapping between pixel coordinates in a destination image and those of the intermediate image. This transformation proceeds as described above, albeit with sequences of coordinate locations that vary along the second axis.
Once the second transformation is completed, a projection of the resultant image is taken along the first or second axes by summing the intensities of pixels (or counting pixels with intensities above a threshold) in each column or row along that axis.
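The thresholded variant of this final step, counting pixels rather than summing intensities, might look like the following sketch. The function name `count_projection`, the `axis` argument and the strict `>` comparison are assumptions made for illustration.

```python
def count_projection(image, threshold, axis="x"):
    """Count pixels brighter than `threshold` in each column or row.

    axis="x" yields one count per column; axis="y" yields one per row.
    `image` is assumed to be a list of rows, indexed image[y][x].
    """
    if axis == "x":
        width = len(image[0]) if image else 0
        return [sum(1 for row in image if row[x] > threshold)
                for x in range(width)]
    if axis == "y":
        return [sum(1 for value in row if value > threshold)
                for row in image]
    raise ValueError("axis must be 'x' or 'y'")


transformed = [
    [0, 200, 50],
    [0, 180, 220],
]
per_column = count_projection(transformed, 100, axis="x")  # [0, 2, 1]
per_row = count_projection(transformed, 100, axis="y")     # [1, 2]
```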
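According to further aspects of the invention, the two one-dimensional affine transformations, together, effect a general affine transformation of the type described by the mathematical relation:
\[
\begin{bmatrix} x_s \\ y_s \end{bmatrix}
= M \cdot \begin{bmatrix} x_d \\ y_d \end{bmatrix}
+ \begin{bmatrix} x_o \\ y_o \end{bmatrix},
\qquad
M = \begin{bmatrix} e_{11} & e_{12} \\ e_{21} & e_{22} \end{bmatrix}
\]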
where
(xd, yd) represents a coordinate in the destination image;
(xs, ys) represents a coordinate in the source image;
(xo, yo) is an offset to be effected by the transformation; and
M is a transformation matrix.
According to further aspects of the invention, the transformation matrix M is decomposed into left and right triangular matrices (otherwise referred to as lower and upper matrices, L and U, respectively) in accord with the following mathematical relation:
\[
\begin{bmatrix} e_{11} & e_{12} \\ e_{21} & e_{22} \end{bmatrix}
= \begin{bmatrix} l_{11} & 0 \\ l_{21} & l_{22} \end{bmatrix}
\cdot \begin{bmatrix} u_{11} & u_{12} \\ 0 & u_{22} \end{bmatrix}
\]
In a related aspect of the invention, the matrix elements l11 and u22 are set to integers, and, preferably, are set to one.
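With l11 and u22 both set to one, the remaining elements follow from multiplying out the two triangular matrices and equating terms with M. The sketch below carries out that algebra; the function names `decompose` and `matmul2` are hypothetical, and the sketch assumes e11 is non-zero.

```python
import math


def decompose(e11, e12, e21, e22):
    """Split M = [[e11, e12], [e21, e22]] into lower and upper triangular
    factors with l11 = 1 and u22 = 1, so that L * U == M.
    """
    if e11 == 0:
        raise ValueError("e11 must be non-zero for this decomposition")
    u11, u12 = e11, e12           # first row of L*U is (l11*u11, l11*u12)
    l21 = e21 / e11               # from l21*u11 = e21
    l22 = e22 - l21 * e12         # from l21*u12 + l22*u22 = e22
    lower = [[1.0, 0.0], [l21, l22]]
    upper = [[u11, u12], [0.0, 1.0]]
    return lower, upper


def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


# Example: a 30-degree rotation matrix.
c, s = math.cos(math.radians(30)), math.sin(math.radians(30))
m = [[c, -s], [s, c]]
lower, upper = decompose(c, -s, s, c)
product = matmul2(lower, upper)   # reproduces m up to rounding error
```

For a pure rotation, e11 is the cosine of the angle, so the division by e11 becomes ill-conditioned as the angle approaches 90°.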
In accordance with related aspects of the invention, a method as described above performs the first transformation, or first "pass," in accord with the mathematical relation:
\[
\begin{bmatrix} x_s \\ y_s \end{bmatrix}
= \begin{bmatrix} u_{11} & u_{12} \\ 0 & u_{22} \end{bmatrix}
\begin{bmatrix} x_t \\ y_t \end{bmatrix}
+ \begin{bmatrix} \mathrm{INT}(x_o) \\ y_o \end{bmatrix}
\]
where
(xs, ys) is a coordinate in the source image;
(xt, yt) is a coordinate in the intermediate image;
(xo, yo) is a translational offset to be effected by the transformation;
INT(xo) is the integer component of xo; and
\[
\begin{bmatrix} u_{11} & u_{12} \\ 0 & u_{22} \end{bmatrix}
\]
is the upper partial transformation matrix attained by decomposition of the transformation matrix M.
In a related aspect, the second partial transformation, or second "pass," is effected in accord with the mathematical relation:
\[
\begin{bmatrix} x_t \\ y_t \end{bmatrix}
= \begin{bmatrix} l_{11} & 0 \\ l_{21} & l_{22} \end{bmatrix}
\begin{bmatrix} x_d \\ y_d \end{bmatrix}
+ \begin{bmatrix} \mathrm{FRAC}(x_o) \\ 0 \end{bmatrix}
\]
where
(xt, yt) is a coordinate in the intermediate image;
(xd, yd) is a coordinate in the destination image;
xo is the x-axis component of the offset to be effected by the transformation;
FRAC(xo) is the fractional component of xo; and
\[
\begin{bmatrix} l_{11} & 0 \\ l_{21} & l_{22} \end{bmatrix}
\]
is a lower partial transformation matrix attained by decomposition of the transformation matrix M.
Still further aspects of the invention provide methods as described above in which the mappings between pixel coordinates in the source and intermediate images are determined iteratively. Thus, for example, once a mapping has been determined for one coordinate in the intermediate image, a mapping for the next coordinate is obtained as a function (e.g., summation) of the prior mapping. For example, on determining the x-axis coordinate of a coordinate in the source image that maps to a coordinate in the intermediate image, the x-axis coordinate that maps to the adjacent coordinate in the intermediate image may be obtained using the relation:
xs[i+1, j]=xs[i, j]+1
where
xs[i, j] is the x-axis coordinate of a location in the source image that maps to coordinate (i, j) in the intermediate image;
xs[i+1, j] is the x-axis coordinate of the location in the source image that maps to coordinate (i+1, j) in the intermediate image.
Likewise, the y-axis coordinate of a location in the source image can be determined iteratively in accord with the following relation:
ys[i+1, j]=ys[i, j]+l21
where
ys[i, j] is the y-axis coordinate of a location in the source image that maps to coordinate (i, j) in the intermediate image;
ys[i+1, j] is the y-axis coordinate of the location in the source image that maps to coordinate (i+1, j) in the intermediate image;
l21 is a parameter from the lower partial transformation matrix, as described above.
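The two iterative relations above can be sketched as a simple accumulation loop. The function name `source_coords_for_row` and its parameters are hypothetical; the starting location (xs0, ys0) for the row is assumed to have been determined already, with xs0 an integer.

```python
def source_coords_for_row(xs0, ys0, l21, count):
    """Generate source locations for `count` adjacent intermediate pixels
    in one row, using xs += 1 and ys += l21 at each step.
    """
    coords = []
    xs, ys = xs0, ys0
    for _ in range(count):
        coords.append((xs, ys))
        xs += 1      # xs[i+1, j] = xs[i, j] + 1
        ys += l21    # ys[i+1, j] = ys[i, j] + l21
    return coords


row = source_coords_for_row(3, 2.0, 0.25, 4)
# [(3, 2.0), (4, 2.25), (5, 2.5), (6, 2.75)]
```

Each step costs two additions rather than a matrix multiplication, and because the x-axis coordinate advances by exactly one, it stays an integer whenever the starting value is an integer. Repeated floating-point addition can accumulate rounding error over long rows, so a practical implementation might use fixed-point arithmetic; that choice is outside what the text above specifies.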
In a related aspect, the invention provides methods as described above in which the mappings between coordinates in the intermediate image and the destination image are determined in accord with the iterative relations:
xt[i+1, j]=xt[i, j]+u11
yt[i+1, j]=yt[i, j]+1
where
xt[i, j] is the x-axis coordinate of a location in the intermediate image that maps to coordinate (i, j) in the destination image;
xt[i+1, j] is the x-axis coordinate of a location in the intermediate image that maps to coordinate (i+1, j) in the destination image;
yt[i, j] is the y-axis coordinate of a location in the intermediate image that maps to coordinate (i, j) in the destination image;
yt[i+1, j] is the y-axis coordinate of a location in the intermediate image that maps to coordinate (i+1, j) in the destination image.
Still further aspects of the invention provide methods as described above in which the first partial transformation determines mappings between the intermediate and source images for only those coordinates in the intermediate image that will ultimately be mapped into the destination image. Put another way, rather than mapping the entire region of the source image that lies within the bounding box enclosing the intermediate affine "rectangle," methods according to this aspect of the invention map only those portions that lie within the destination image.
Rather than determining a source-to-intermediate mapping for each pixel coordinate in a row of the intermediate image, a method according to this aspect of the invention can limit those determinations to the region in each row that is offset from the prior row by an amount u12, which is a parameter of the upper partial transformation matrix as defined above.
More particularly, having determined the x-axis coordinate of a location in the source image that maps to the first element (0, j) in a row of the intermediate image, a method according to this aspect of the invention can determine the x-axis coordinate of a location in the source image that maps to the first element (0, j+1) in the next row of the intermediate image in accord with the relation:
xs[0, j+1]=xs[0, j]+u12
Still further aspects of the invention provide methods as described above utilizing modified forms of the foregoing mathematical relations in order to effect projections where the underlying affine transforms are for angles outside the range −45° to 45°.
Yet still other aspects of the invention provide digital data processing apparatus, e.g., machine vision systems operating in accord with the above-described methodologies.
Those and other aspects of the invention are evident in the drawings and in the description that follows.
Methods and apparatus according to the invention have many advantages over prior art projection techniques. At the outset, they permit projections to be taken at any angle, skew, scaling, translation or other affine transform without undue consumption of resources and with speeds suitable for real-time applications. For example, by performing a two-pass affine transform as described above and, then, projecting the image along one of the axes, no complex or special purpose techniques are necessary to sum the pixel intensities.
Moreover, the unique two-pass affine transformation is, itself, superior to prior art affine techniques. For example, in the case of bilinear transformations, the two-pass transformation demands fewer pipeline stages to resolve data dependencies during interpolation and, therefore, permits better utilization of superscalar processor execution units. In addition, that transformation permits processor registers to be used more efficiently, e.g., because linear interpolations require fewer computations than bilinear interpolations. In the case of higher-order interpolations, the two-pass transformation demands fewer operations. Particularly, the number of operations required by the prior art is proportional to n², where n is the order of interpolation. The number of operations required by the invention, on the other hand, is proportional to 2n.
The two-pass affine transformation utilized by the invention has other advantages over the prior art. Specifically, although separable transformation techniques are known, they cannot be applied to general affine transformation but only to individual transformations, e.g., rotation-only, scaling-only, etc. Thus, for example, in order to rotate, scale and skew an image using these prior art techniques, it is necessary to do three separate transformations (each requiring at least two passes). The affine transformation utilized by the invention permits this to be accomplished in a single, two-part act.