The present invention relates to an optical character reader (OCR) and more particularly to an optical character reader using parallel state machines for recognizing characters. The present invention also relates to an optical character reader for detecting the amount of tilt or skew present in each line or column of characters in a document.
One type of OCR page reader device is disclosed in U.S. Pat. No. 4,453,268 to Britt, issued June 5, 1984, entitled "OCR Page Reader". The optical character reader in this patent employs parallel state machines, each connected to receive vertical bit slices or columns of data in the form of ones and zeroes respectively indicating the presence of black and white in vertical columns of a scanned character. Each column of data is simultaneously applied to a plurality of separate state machines, with one state machine being provided for each character that can be recognized. Each state machine includes a separate field programmable logic array (FPLA) device and an associated output latch connected in wraparound fashion to the FPLA.
One figure of that patent, reproduced in a slightly modified form as FIG. 2 herein, shows a simplified state diagram or map for recognizing the upper case character E in an OCR-B type font, as represented in another figure of that patent, which is reproduced as FIG. 1 herein. The state machine diagram of FIG. 2 comprises four steps or stages. In state 1, the initial stage, the device looks for the presence of black in vertical positions 1 through 11 (designated as pattern #1), indicating the leftmost vertical limb of the upper case E. If found, the state machine will advance to state 2; otherwise it will remain in state 1. When in state 2, the state machine looks for pattern #2, which is a pattern of three rightwardly extending horizontal limbs distinctive for the E, i.e. black in upper vertical positions 1 and 2; black in middle vertical positions 5, 6 and 7; and black in lower vertical positions 10 and 11. If pattern #2 is found, the state machine will advance to state 3, but if pattern #1 is still being found, the state machine will stay in state 2. For all other patterns, the state machine will reset to state 1. When in state 3, the state machine looks for the end of the character indicated by pattern #3, which is all zeroes in vertical positions 1 through 11. If pattern #3 is found, the state machine will advance to the fourth state and output a signal indicating that the character is recognized as an upper case character E. However, if pattern #2 is still being detected, the state machine stays in state 3. For all other patterns, the state machine will reset to state 1.
Such a device is very useful for identifying characters on a document which are all in the same or similar type font. However, when type fonts are mixed on each page, or when scanning different documents produced with different type fonts, the relatively simple programming of that state machine as disclosed has its limitations in accurately identifying, with a high recognition rate, all of the characters appearing on the document.
Furthermore, in accordance with the device disclosed in this reference patent, the state machine can remain in a given state indefinitely as long as it continues to recognize the same pattern which caused it to advance to that state. For example, as shown in FIG. 2, the state machine will remain in state 2 if it continues to detect pattern #1, i.e. black in vertical positions 1 through 11. In such an arrangement, there is no maximum time within which a character will be identified.
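The unbounded dwell described above can be illustrated by the following minimal sketch, which models only the transitions out of state 2. The names `BLACK_LIMB`, `E_LIMBS`, `next_state_from_2` and `dwell_in_state_2` are hypothetical, and the 11-character string encoding of a column is an assumption made for illustration.

```python
BLACK_LIMB = "1" * 11      # pattern #1: black in positions 1 through 11
E_LIMBS = "11001110011"    # pattern #2: black in positions 1-2, 5-7, 10-11


def next_state_from_2(column: str) -> int:
    """Transition out of state 2 as described for FIG. 2."""
    if column == E_LIMBS:
        return 3
    if column == BLACK_LIMB:
        return 2           # same pattern again: remain in state 2
    return 1               # any other pattern: reset


def dwell_in_state_2(columns: list[str]) -> int:
    """Count the leading columns for which the machine remains in state 2."""
    count = 0
    for column in columns:
        if next_state_from_2(column) != 2:
            break
        count += 1
    return count
```

Because nothing in these transitions counts columns, `dwell_in_state_2` returns the full length of any run of pattern #1, however long that run is.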
Further, no means are provided in this patent for detecting the amount of skew or tilt in a line of characters on a document, or for correcting the relative positions of the characters so that a reprinting of the characters by associated printing means will correct for vertical and horizontal tilt or skew.