The present invention relates generally to methods and systems for image analysis, and specifically to methods for optical character recognition.
Algorithms for recognition of image features, such as optical character recognition (OCR) algorithms, are often sensitive to minor variations in the input image. One method for overcoming this sensitivity, and thus increasing the robustness of image recognition, is to apply a number of different algorithms to the same image. A “voting” procedure is then applied to select a recognition result, such as the correct reading of a string of characters, from among the group of results returned by the different algorithms.
For example, U.S. Pat. No. 5,519,786, whose disclosure is incorporated herein by reference, describes a method and apparatus for implementing a weighted voting scheme for reading and accurately recognizing characters in a scanned image. A plurality of OCR processors scan the image and read the same image characters. Each OCR processor outputs a reported character corresponding to each character read. For a particular character read, the characters reported by each OCR processor are grouped into a set of character candidates. For each character candidate, a weight is generated in accordance with a confusion matrix. The weights are then compared to determine which character candidate to output.
As another example, U.S. Pat. No. 5,912,986, whose disclosure is incorporated herein by reference, describes an evidential confidence measure and rejection technique for use in a neural network-based OCR system. The technique is used to generate a confidence measure for use in deciding whether to accept or reject each classified character.
It is an object of the present invention to provide an improved technique for image recognition, and particularly for OCR.
It is a further object of some aspects of the present invention to provide methods and apparatus for image recognition with enhanced robustness.
It is yet a further object of some aspects of the present invention to provide methods for evaluating the robustness of image recognition techniques.
In preferred embodiments of the present invention, an input image containing a string of one or more characters is upsampled to generate an enlarged image. The upsampling is preferably accomplished by adding pixels between the existing pixels of the input image and then setting the values of the added pixels using appropriate operations, as are known in the art. Preferably, the operations applied to the image include neighborhood operations, such as interpolation and edge enhancement, whereby the value of each added pixel is determined as a function of the values of the existing pixels in its neighborhood.
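By way of illustration only, the upsampling step can be sketched as follows. The sketch assumes an integer enlargement factor and bilinear interpolation as the neighborhood operation; the function name `upsample_bilinear` and the border-clamping rule are hypothetical choices made for the example, not details taken from the description above.

```python
from typing import List

Image = List[List[float]]


def upsample_bilinear(img: Image, factor: int = 2) -> Image:
    """Enlarge img by an integer factor (illustrative sketch).

    Original pixels land at positions (r * factor, c * factor); the added
    pixels between them are set by bilinear interpolation of their
    neighbors. Border handling (clamping) is an assumption.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * (w * factor) for _ in range(h * factor)]
    for y in range(h * factor):
        for x in range(w * factor):
            # Position of this output pixel in input-image coordinates.
            fy, fx = y / factor, x / factor
            y0, x0 = int(fy), int(fx)
            # Clamp the far neighbors at the image border.
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            wy, wx = fy - y0, fx - x0
            top = (1 - wx) * img[y0][x0] + wx * img[y0][x1]
            bottom = (1 - wx) * img[y1][x0] + wx * img[y1][x1]
            out[y][x] = (1 - wy) * top + wy * bottom
    return out


small = [[0, 10], [20, 30]]
big = upsample_bilinear(small, factor=2)  # 4 x 4 enlarged image
```

With this choice of interpolation, the existing pixels of the input image are preserved unchanged at every `factor`-th position of the enlarged image, and only the added pixels carry interpolated values.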
The enlarged image is then downsampled, or decimated, to generate a plurality of intermediate images. Preferably, the enlarged image is divided into blocks, and one pixel in each block is sent to each of the intermediate images, so that each of the intermediate images comprises a different subset of the pixels of the enlarged image. Most preferably, the downsampling and upsampling rates are selected so that the intermediate images are all equal in size to the input image. (Typically, one of the intermediate images is the input image itself.) An image recognition algorithm, such as an OCR algorithm, is applied to all of the intermediate images, so as to generate an intermediate recognition result, such as a character string, for each of the intermediate images.
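The block-wise decimation described above can be sketched as follows. The sketch assumes square blocks of `factor` by `factor` pixels and an enlarged image whose dimensions are exact multiples of `factor`; the function name `decimate` and the ordering of the returned images are hypothetical.

```python
from typing import List

Image = List[List[float]]


def decimate(enlarged: Image, factor: int = 2) -> List[Image]:
    """Split an enlarged image into factor * factor intermediate images.

    The enlarged image is treated as an array of factor x factor blocks.
    The pixel at offset (dy, dx) inside every block goes to the same
    intermediate image, so each intermediate image holds a different
    subset of the enlarged image's pixels.
    """
    h, w = len(enlarged), len(enlarged[0])
    if h % factor or w % factor:
        raise ValueError("image dimensions must be multiples of factor")
    images = []
    for dy in range(factor):
        for dx in range(factor):
            images.append(
                [row[dx:w:factor] for row in enlarged[dy:h:factor]]
            )
    return images


enlarged = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
intermediates = decimate(enlarged, factor=2)  # four 2 x 2 images
```

When the enlarged image was produced by upsampling at the same `factor`, each intermediate image has the same dimensions as the input image, and the image taken at offset (0, 0) coincides with the input image if the upsampling left the existing pixels unchanged.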
The different intermediate results are compared in order to find a final recognition result. Preferably, a voting procedure is used to select as the final result the most commonly occurring intermediate result, or an intermediate result with a high cumulative confidence score. When two or more of the intermediate results are in agreement, the final result can be selected with high confidence, even if the recognition algorithm did not indicate high confidence in the intermediate result of any individual intermediate image (and even when one or more of the intermediate images may have returned no result at all). Alternatively or additionally, the range of results and their scores are used to derive a measure of the robustness of the final result. Compared with voting-based methods of image recognition known in the art, the present invention is advantageous in that it overcomes small image artifacts that interfere with recognition, without the heavy demand on computing resources that applying several different algorithms to the same image requires.
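One possible form of the voting procedure is sketched below. It assumes that each intermediate result is a pair of a recognized string (or `None` when no result was returned) and a confidence score, that the most commonly occurring string wins, and that ties are broken by cumulative confidence. The function name `vote`, the tuple layout, and the tie-breaking rule are assumptions made for the example.

```python
from collections import defaultdict
from typing import List, Optional, Tuple

Result = Tuple[Optional[str], float]  # (recognized string or None, confidence)


def vote(results: List[Result]) -> Optional[Tuple[str, int, float]]:
    """Pick a final result from the intermediate results (sketch).

    Returns (string, number of votes, cumulative confidence), or None if
    no intermediate image returned a result.
    """
    counts = defaultdict(int)
    scores = defaultdict(float)
    for text, confidence in results:
        if text is None:
            continue  # this intermediate image returned no result
        counts[text] += 1
        scores[text] += confidence
    if not counts:
        return None
    # Most votes first; cumulative confidence breaks ties (assumed rule).
    best = max(counts, key=lambda t: (counts[t], scores[t]))
    return best, counts[best], scores[best]


final = vote([("AB12", 0.4), ("AB12", 0.5), ("A812", 0.6), (None, 0.0)])
```

In this example the string "AB12" is selected because two intermediate results agree on it, even though the single highest individual confidence belongs to the competing reading "A812".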
There is therefore provided, in accordance with a preferred embodiment of the present invention, a method for recognition of features appearing in an input image, including:
upsampling the input image to generate an enlarged image including an increased number of pixels relative to the input image;
decimating the enlarged image to generate a plurality of intermediate images, each of the intermediate images including a different subset of the pixels in the enlarged image;
applying an image recognition algorithm to the intermediate images so as to generate respective recognition results; and
comparing the recognition results to generate an identification of the features appearing in the input image.
Preferably, upsampling the input image includes applying a neighborhood operation to assign values to the pixels in the enlarged image. Further preferably, decimating the enlarged image includes dividing the enlarged image into an array of blocks, each block containing a predetermined number of pixels, and assigning each of the pixels in each of the blocks to a different one of the intermediate images. Most preferably, the number of blocks in the array is substantially equal to the number of pixels in the input image, whereby the intermediate images are substantially equal in size to the input image.
In a preferred embodiment, the features include characters, and applying the image recognition algorithm includes applying optical character recognition to identify the characters.
Preferably, comparing the recognition results includes applying a voting procedure to select the identification that is supported by the recognition results of two or more of the intermediate images. Alternatively or additionally, comparing the recognition results includes assessing a level of robustness of the identification responsive to variations among the recognition results.
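A simple way of assessing robustness from the variation among the recognition results is sketched below. The measure used here, namely the fraction of intermediate images whose result agrees with the selected identification, is an assumption chosen for illustration; the function name `agreement_ratio` is likewise hypothetical, and other measures could equally be used.

```python
from collections import Counter
from typing import List, Optional, Tuple


def agreement_ratio(results: List[Optional[str]]) -> Tuple[Optional[str], float]:
    """Return (identification, fraction of intermediate images agreeing).

    A None entry stands for an intermediate image that returned no result;
    it counts toward the total but never toward agreement.
    """
    counts = Counter(r for r in results if r is not None)
    if not counts:
        return None, 0.0
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(results)


identification, robustness = agreement_ratio(["AB12", "AB12", "A812", None])
```

A ratio close to 1 indicates that the identification is insensitive to the small variations among the intermediate images, while a low ratio flags an identification that may warrant rejection or further checking.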
There is also provided, in accordance with a preferred embodiment of the present invention, a method for recognition of features appearing in an input image made up of a matrix of pixels having pixel values, the method including:
defining a family of neighborhood operations;
applying the operations in the family to the input image so as to generate respective intermediate images;
applying an image recognition algorithm to the intermediate images so as to generate respective recognition results; and
comparing the recognition results to generate an identification of the features appearing in the input image.
Preferably, defining the family of neighborhood operations includes defining one or more convolution kernels, and applying the operations includes convolving the pixels of the input image with the one or more kernels.
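The use of a family of convolution kernels can be sketched as follows. The four 3 x 3 kernels shown (an identity kernel and three averaging kernels that approximate half-pixel shifts) are hypothetical examples, as are the names `convolve3x3` and `KERNELS` and the border-replication rule; the description above does not prescribe particular kernels.

```python
from typing import Dict, List

Image = List[List[float]]
Kernel = List[List[float]]

# Hypothetical kernel family. Because convolution flips the kernel, a
# weight placed left of center acts on the pixel to the RIGHT of the
# current pixel, and a weight above center acts on the pixel BELOW it.
KERNELS: Dict[str, Kernel] = {
    "identity": [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
    "right": [[0, 0, 0], [0.5, 0.5, 0], [0, 0, 0]],
    "down": [[0, 0.5, 0], [0, 0.5, 0], [0, 0, 0]],
    "diagonal": [[0.25, 0.25, 0], [0.25, 0.25, 0], [0, 0, 0]],
}


def convolve3x3(img: Image, kernel: Kernel) -> Image:
    """Convolve img with a 3 x 3 kernel, replicating border pixels."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    # Neighbor coordinates, clamped to the image border.
                    yy = min(max(y + ky - 1, 0), h - 1)
                    xx = min(max(x + kx - 1, 0), w - 1)
                    # Flipped kernel index: true convolution.
                    acc += kernel[2 - ky][2 - kx] * img[yy][xx]
            out[y][x] = acc
    return out


source = [[0, 10], [20, 30]]
family = {name: convolve3x3(source, k) for name, k in KERNELS.items()}
```

Each entry of `family` is an intermediate image of the same size as the input image, to which the image recognition algorithm can then be applied.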
There is additionally provided, in accordance with a preferred embodiment of the present invention, apparatus for recognition of features appearing in an input image, including an image processor, which is adapted to upsample the input image so as to generate an enlarged image including an increased number of pixels relative to the input image, to decimate the enlarged image so as to generate a plurality of intermediate images, each of the intermediate images including a different subset of the pixels in the enlarged image, to apply an image recognition algorithm to the intermediate images so as to generate respective recognition results, and to compare the recognition results so as to generate an identification of the features appearing in the input image.
There is further provided, in accordance with a preferred embodiment of the present invention, apparatus for recognition of features appearing in an input image made up of a matrix of pixels having pixel values, the apparatus including an image processor, which is adapted to apply a predefined family of neighborhood operations to the input image so as to generate respective intermediate images, to apply an image recognition algorithm to the intermediate images so as to generate respective recognition results, and to compare the recognition results so as to generate an identification of the features appearing in the input image.
There is moreover provided, in accordance with a preferred embodiment of the present invention, a computer software product for recognition of features appearing in an input image, the product including a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to upsample the input image so as to generate an enlarged image including an increased number of pixels relative to the input image, to decimate the enlarged image so as to generate a plurality of intermediate images, each of the intermediate images including a different subset of the pixels in the enlarged image, to apply an image recognition algorithm to the intermediate images so as to generate respective recognition results, and to compare the recognition results so as to generate an identification of the features appearing in the input image.
There is furthermore provided, in accordance with a preferred embodiment of the present invention, a computer software product for recognition of features appearing in an input image, the product including a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to apply a predefined family of neighborhood operations to the input image so as to generate respective intermediate images, to apply an image recognition algorithm to the intermediate images so as to generate respective recognition results, and to compare the recognition results so as to generate an identification of the features appearing in the input image.