Intelligent recognition of bitmapped binary images of text for the purpose of estimating their corresponding character values is often referred to as optical character recognition (“OCR”). Most OCR systems in use today utilize stochastic processes to recognize the text in the graphic images. Because stochastic processes are fundamentally based on chance or probability, these systems are not always as reliable as may be desired. Moreover, the processing time of such stochastic processes can be quite long in some instances, making them not particularly practical.
One attempt to overcome some of the above-noted deficiencies is described in U.S. Pat. No. 5,321,773. The image recognition technique disclosed in the '773 patent is a grammar-based image modeling and recognition system that automatically produces an image decoder based on a finite state network. Although the system described in the '773 patent is substantially faster than the traditional stochastic processes, it is itself based on stochastic methods (like the traditional approaches) and thus inherently involves chance or probability. Another noteworthy disadvantage of the recognition system in the '773 patent is that it requires extremely detailed font metrics information for the characters to be recognized, including character sidebearings and baseline depths, which typically cannot readily be obtained. Yet another disadvantage of the image recognition system disclosed in the '773 patent is that it cannot recognize text when pairs of characters (which may be denoted by black pixels on a white background) have black pixels that overlap.
In view of the above-noted deficiencies, it would be desirable to provide an image recognition system that is capable of recognizing machine-generated text in graphic images with (at least in most cases) complete accuracy. It would further be desirable to provide an image recognition system that is substantially faster than traditional OCR technology and is also able to recognize text having characters with overlapping black (i.e., foreground) pixels. It would also be desirable to provide an image recognition system that is capable of recognizing machine-generated text in graphic images using font metrics information that is readily obtainable.