Optical Character Recognition (OCR) systems are generally used to detect text present in an image and to convert that text into its equivalent electronic representation. The electronic representation can be stored and manipulated on a computer or an equivalent data processing machine. In order to accurately recognize the text in an image using OCR, the image should be of high quality. The quality of the image depends on various factors such as the power of the lens, light intensity variation, relative motion between the camera and the text, focus, and so forth. An OCR system can accurately detect text in good-quality images, such as those captured with megapixel cameras or scanned with high-quality flatbed scanners, which have uniform intensity, no relative motion, and good focus. Conversely, an OCR system generally misinterprets text in poor-quality images that have relative motion between the camera and the text, a high level of intensity variation, and poor focus. Poor-quality images are generally captured with low-resolution digital cameras or mobile phones with built-in digital cameras, or by unskilled users. Moreover, since an OCR system cannot perform semantic analysis of the text present in an image to map detected text onto some standard character, the OCR system cannot assign meaning to the text in many signs that have a stylized appearance, such as retailer product signs. This results in misinterpretation of the text in the image, which may include errors such as splitting a word into two separate words, concatenating two words into one word, missing characters in a word, loss of adjacency information between words, and so forth.
A known method for improving the quality of an image before sending it to an OCR system is disclosed in the paper titled “Detecting and Reading Text in Natural Scenes”, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 366-373, 2004. The paper discusses an algorithm for detecting and reading text in natural images. The algorithm includes collecting a set of images, analyzing the set, and extracting the detected text from it. A statistical analysis of the text is then performed to determine the image features that are reliable indicators of text and have low variation among them. Another known method of improving the text detected by an OCR system includes comparing words or phrases found in an image to a dictionary or database, and correcting words that are relatively close to known words. However, these methods do not deal with the editing of text in a context-independent manner, or with the inference of higher-level information such as identifying a retailer based on the physical layout of text in an image.
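The dictionary-comparison approach mentioned above can be illustrated with a minimal sketch. It is not taken from any cited work: the function name `correct_words`, the sample vocabulary, and the similarity cutoff of 0.8 are all assumptions chosen for illustration, and the similarity measure is the one provided by Python's standard `difflib` module rather than any measure a particular OCR system uses.

```python
from difflib import get_close_matches


def correct_words(ocr_words, vocabulary, cutoff=0.8):
    """Replace each OCR word with the closest vocabulary entry, if one is close enough.

    Hypothetical helper for illustration only; `cutoff` is an assumed
    similarity threshold in [0, 1], not a value from the source text.
    """
    # Compare case-insensitively, but return the vocabulary's own spelling.
    lookup = {entry.lower(): entry for entry in vocabulary}
    corrected = []
    for word in ocr_words:
        key = word.lower()
        if key in lookup:
            # Exact match: the word is already a known word.
            corrected.append(lookup[key])
            continue
        # Best single candidate whose similarity ratio meets the cutoff.
        matches = get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
        # Leave the word unchanged when nothing is close enough.
        corrected.append(lookup[matches[0]] if matches else word)
    return corrected


# Example with an assumed vocabulary: "rn" misread in place of "m".
vocabulary = ["pharmacy", "open", "closed"]
print(correct_words(["pharrnacy", "open", "xq7"], vocabulary))
# prints ['pharmacy', 'open', 'xq7']
```

A sketch of this kind corrects each word in isolation, which is why it cannot repair split or concatenated words or recover layout-level information such as which retailer a sign belongs to.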
In light of the above discussion, there is a need for a method, system and computer program product for improving recognition of text in images captured using low-quality image capture instruments, such as mobile phones with cameras and low-resolution digital cameras.