The present invention generally relates to optical character recognition systems and more particularly to those systems which are utilized to automatically read symbols or alpha-numeric type characters which have been printed, embossed or otherwise formed on documents such as coupons, checks or invoices. The present system includes features which enhance the effective optical contrast of the patterns representing the characters, locate the characters within a prescribed window, and recognize the characters found, notwithstanding the presence of various extraneous marks.
In recent years there has been a significant trend toward automated data capture from documents by using optical character recognition (OCR) techniques. Such systems generally detect characters by sensing patterned contrasts in document reflectivity and electrically process those detected patterns to determine the characters. Unfortunately, the originators of the diverse documents from which data is to be captured have not only developed a multiplicity of different fonts to represent their data, but have also allowed significant latitude in the shapes, relative locations and angular orientations of such characters in a sequence. Even further variations occur when the characters are generated by different means. For instance, those generated by typewriters will differ significantly from those that are printed or embossed, notwithstanding the fact that they utilize the same font. The optical data capture function is further complicated by the diverse reflective characteristics encountered, not only from the document material but also from the materials used to form the characters. For example, inks of different colors, textures and amounts produce significantly different optical reflections even when they appear on the same document material and are formed in the same style of font. Scenic backgrounds, fold lines in the documents, and differences in the reflective characteristics and paper quality of the document materials are further sources of extraneous signals during the location, capture, and recognition of character patterns and the data they represent.
In addition to the exemplary error sources described above, one cannot overlook the intentional or unintentional insertion of extraneous marks into the clear band established for the alpha-numeric characters. Typical examples of such marks include signatures, bankers' stamps and carbon smudges. Consequently, the recognition technique must not only extract characters from an optically noisy background, but must thereafter perform a reasonable evaluation of the characters to ascertain whether they are suitably represented to obtain an accurate determination of the actual data.
As the rate of processing documents increases, the costs attributable to both detected and undetected errors rise even more rapidly. Therefore, a commercially viable system must be capable of processing documents at a relatively high rate while maintaining exceptionally small rates of error in the data captured.