One of the most widely used present-day systems for transferring data from documents into a computer system is magnetic ink character recognition (MICR), in which magnetic ink is used to print symbols that represent numbers and certain punctuation data symbols.
These magnetic ink characters are designated as E13B magnetic ink characters, a designation given by the American Bankers Association (ABA). E13B characters are the characters commonly found on personal and business checks both in the United States and in certain foreign countries. The characters are printed in such a manner as to allow easy human recognition while at the same time creating magnetic waveforms recognizable by machines.
These waveforms are created by passing the characters alongside a magnetic read head. This read head creates an analog signal which is amplified and then sampled at a constant rate to convert the signal into a series of digital samples. The samples are passed through a digital signal processor, and the start and end of each character are found.
Previous systems which used MICR recognition methods were generally based on analog signals, where the system found the right-hand edge of a character based upon a peak of sufficient amplitude. This amplitude remained constant since the reading system was adjusted to make it so. Generally, the first peak was placed on a location called tap zero. The analog reader had 8 taps in all, corresponding to the places where a peak was intended to be located according to the character specifications of the E13B character standards.
One type of these earlier forms of character recognition systems is described in U.S. Pat. No. 3,221,303 entitled "Unexpected Peak Detector". This patent described a character recognition system which differentiated waveforms with respect to the peak or amplitude value at a set of certain sample points along the waveforms. In this patent, the expected amplitude character recognition system compared waveforms having a peak displaced from a data reference at a set of sample points along the waveforms corresponding to similarly displaced peaks on waveform patterns of a predetermined character in a font.
To perform a recognition process on an unknown character, these older analog systems sampled the waveform at the remaining seven tap locations. All eight of the samples were simultaneously fed to a correlation network which weighted the amplitudes of the samples and assigned a character code to the unknown character that had been scanned. This code could be any one of the fourteen valid E13B codes, or else a "reject code" if a decision could not be made on the available data. The code was then sent for a final check to verify its validity. This was accomplished by looking for taps which contained a peak where there should not be one, or a tap which did not contain a required peak. Thus the final check could only operate to turn a recognized character into a rejected character.
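The tap-and-correlate scheme described above can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the circuit of any particular reader: the template values, the character names "A" and "B", and the thresholds `PEAK_LEVEL` and `MIN_SCORE` are hypothetical placeholders and are not taken from the E13B specification.

```python
from typing import Dict, List, Optional

# Hypothetical templates: expected signed peak amplitude at taps 0..7.
# Placeholder values for illustration only, NOT real E13B waveforms.
TEMPLATES: Dict[str, List[float]] = {
    "A": [1.0, -1.0, 0.0, 0.0, 1.0, -1.0, 0.0, 0.0],
    "B": [1.0, 0.0, -1.0, 0.0, 0.0, 1.0, 0.0, -1.0],
}

PEAK_LEVEL = 0.5  # assumed: |amplitude| at or above this counts as a peak
MIN_SCORE = 0.8   # assumed: minimum normalized correlation to accept a match


def correlate(samples: List[float], template: List[float]) -> float:
    """Normalized correlation between the eight tap samples and a template."""
    dot = sum(s * t for s, t in zip(samples, template))
    norm = (sum(s * s for s in samples) * sum(t * t for t in template)) ** 0.5
    return dot / norm if norm else 0.0


def final_check(samples: List[float], template: List[float]) -> bool:
    """Fail if any tap has an unexpected peak or lacks a required peak."""
    for s, t in zip(samples, template):
        has_peak = abs(s) >= PEAK_LEVEL
        needs_peak = t != 0.0
        if has_peak != needs_peak:
            return False
    return True


def recognize(samples: List[float]) -> Optional[str]:
    """Return the best-matching character name, or None for a reject."""
    best = max(TEMPLATES, key=lambda name: correlate(samples, TEMPLATES[name]))
    if correlate(samples, TEMPLATES[best]) < MIN_SCORE:
        return None  # no sufficiently good match: reject
    if not final_check(samples, TEMPLATES[best]):
        return None  # the final check can only turn a match into a reject
    return best
```

Note that `final_check` never changes one character into another. As in the description above, it can only downgrade a recognized character to a reject.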
These older systems, however, suffered from a particular problem involving the speed at which the data was presented. If a character was read either faster or slower than intended, the peaks of the character would not fall on the proper tap positions. When this occurred, good matches were not found and consequently many characters were rejected or substituted.
The presently disclosed center-line method is immune to difficulties caused by this problem because it finds the actual peaks of the character and does not take arbitrary samples. Furthermore, by working from the character center-line (to be described later herein), the errors due to speed are not accumulated at the "end" of the character with respect to the "beginning" of the scan read of the character.
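The benefit of measuring from the center rather than from the leading edge can be shown with a small worked calculation. This is a sketch under simplifying assumptions that are not taken from the disclosure: the speed error is modeled as a uniform scale factor `speed_ratio`, peak positions are the eight ideal slots 0 to 7, and the center-line is taken to be the midpoint of those slots (3.5). The function name `peak_errors` is hypothetical.

```python
from typing import List


def peak_errors(speed_ratio: float, reference: float,
                num_positions: int = 8) -> List[float]:
    """Displacement of each peak from its ideal slot, measured from `reference`.

    A peak ideally at distance d = k - reference from the reference point
    is observed at d * speed_ratio, so its error is d * (speed_ratio - 1).
    """
    return [(k - reference) * (speed_ratio - 1.0) for k in range(num_positions)]


def worst_error(speed_ratio: float, reference: float) -> float:
    """Largest absolute displacement over all eight peak positions."""
    return max(abs(e) for e in peak_errors(speed_ratio, reference))


if __name__ == "__main__":
    ratio = 1.10  # assumed: document moves 10% off the nominal speed
    print("from leading edge:", round(worst_error(ratio, 0.0), 3))
    print("from center-line: ", round(worst_error(ratio, 3.5), 3))
```

Under these assumptions, the worst-case displacement measured from the leading edge is seven slot widths times the speed error, while measured from the midpoint it is three and a half slot widths times the speed error, so the worst-case drift is halved.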
The older analog systems, for reasons similar to the speed problem, also could not tolerate characters which were slightly malformed. Such characters, while possibly being of the correct length, would have peaks which did not fall at the proper tap locations. In the present system to be described, the center-line method will operate to find the peaks and will not be derailed by this type of problem.
It should be understood that each character in the E13B font set was designed in such a way as to create peaks which land in one of eight different places. The "first peak" of the character, which is found in every character, is always a positive peak. The "last peak" in a character is always a negative peak and falls at certain undetermined points.
There is an idealized "template sheet" which has been produced to encompass each of the characters in order to locate certain transitions which occur. The eight places for a peak to fall are labelled 0-7 on the template sheet T of FIG. 3. The widest character is spread over all eight peak positions and the narrowest character only reaches to peak number 4. A character will typically have 4 to 6 peaks. However, a degraded character will sometimes have more or fewer peaks than its matching ideal character.