A simple bit-level correlation scheme for use in a character recognition apparatus entails a one-to-one comparison between each character pixel and its associated mask pixel. For purposes of this specification, a pixel is defined as an individual picture element. Every pixel is given a weight of one for a mismatch or zero for a match. FIGS. 1a and 1b illustrate an example of this scheme. The resulting mismatch count, hereinafter referred to as the character score for the specific mask, ranges from 0 for a character which is completely identical to a mask, to 768 (i.e., 24 × 32) for a character which is the negative of a mask. This correlation scheme can be represented by the following equation:

$$\text{Score} = \sum_{x} \sum_{y} \left( c_{xy} \oplus m_{xy} \right)$$

where $c_{xy}$ is the character pixel value (either 0 or 1) at row x and column y, $m_{xy}$ is the associated mask pixel value (either 0 or 1), $\oplus$ is the "exclusive-OR" logical operation, and the sums run over every row and column of the 24 × 32 pixel field. This technique produces the highest degree of resolution available in a bit matching system. In FIGS. 1a and 1b, pixels labeled $x^a$ in the character to be identified have no counterparts in corresponding locations in the mask, and pixels labeled $x^b$ in the mask have no counterparts in corresponding locations in the character. If the pixels are each weighted by a score of "1", the resulting mismatch score is the number of $x^a$ pixels plus the number of $x^b$ pixels, which in the example equals 24.
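The bit-level comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mismatch_score`, the list-of-lists pixel representation, and the choice of 32 rows by 24 columns for the 24 × 32 field are not taken from the specification.

```python
def mismatch_score(character, mask):
    """Count the pixel positions where character and mask differ (sum of XORs)."""
    if len(character) != len(mask):
        raise ValueError("character and mask must have the same number of rows")
    score = 0
    for char_row, mask_row in zip(character, mask):
        if len(char_row) != len(mask_row):
            raise ValueError("rows must have the same length")
        # c ^ m is 1 for a mismatch and 0 for a match
        score += sum(c ^ m for c, m in zip(char_row, mask_row))
    return score


# Assumed orientation of the 24 x 32 field (hypothetical: 32 rows, 24 columns)
ROWS, COLS = 32, 24

# An arbitrary test pattern standing in for a mask, and its negative
mask = [[(r + c) % 2 for c in range(COLS)] for r in range(ROWS)]
negative = [[1 - p for p in row] for row in mask]

print(mismatch_score(mask, mask))      # identical character -> 0
print(mismatch_score(negative, mask))  # negative of the mask -> 768
```

The two printed values correspond to the two ends of the score range given in the text: a perfect match and a complete inversion.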
Maximum resolution is not necessarily desirable for purposes of character recognition. The resulting scores are representative of absolute pixel variations and are not necessarily responsive to character feature variations. An example of such high resolution creating improper identification is shown in FIGS. 2a-2c. Assuming that FIGS. 2a and 2b represent the mask fields for a lower case `l` and upper case `I`, respectively, an unknown character as diagrammed in FIG. 2c produces scores of 9 for the I mask and 18 for the l mask, as summarized below:
______________________________________
Mask    Mismatched Pixels    Weight    Score
______________________________________
l       18                   × 1       = 18
I        9                   × 1       =  9
______________________________________
The unknown character, a slightly degraded lower case `l`, is identified as the wrong symbol. Cases such as this occur frequently when a large spectrum of characters is processed. The correlation scheme favors the wrong mask because each pixel mismatch carries the same weight, regardless of the location of that pixel with respect to the main strokes of the characters. Note that in this example the mismatching pixels of the upper case `I` mask are farther away from the nearest matching pixel, whereas the mismatching pixels of the lower case `l` mask are only one pixel distance away from the nearest matching pixel.
To counteract this shortcoming, a correlator can be developed that weights each pixel mismatch as a function of that pixel's distance to the nearest matching pixel. A large weight would be attached to a mismatched pixel that is located a great distance from the nearest matching pixel, and a small weight would be attached to a mismatched pixel that is adjacent to a matching pixel. An example of this grading scheme is shown in FIGS. 3a and 3b for the examples in FIGS. 1a and 1b. A mismatch one pixel away from a matching pixel is given a weight of 0.5; a mismatch two pixels away from the matching pixel is given a weight of 2; and mismatches n pixels away from the matching pixel are given the weight $n^2$. Thus, pixels labeled `a` are one pixel distance from the nearest matching pixel, and are weighted 0.5. Pixels labeled `b` are two pixels away, and are weighted 2. Pixels labeled `c` are three pixels away, and are weighted 9.
The resulting mismatched score is:
______________________________________
 8 `a` pixels (× 0.5) =  4 pts.
 8 `b` pixels (× 2)   = 16 pts.
 8 `c` pixels (× 9)   = 72 pts.
24 pixels             = 92 pts.
______________________________________
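The tally for FIGS. 3a and 3b can be checked with a short Python sketch. The names `mismatch_weight`, `weighted_score`, and `fig3_counts` are hypothetical, and reading the $n^2$ rule as applying only from three pixels upward is an assumption made here to agree with the stated weights of 0.5 and 2 for the two nearest distances.

```python
def mismatch_weight(distance):
    """Weight of a mismatched pixel `distance` pixels from the nearest matching pixel."""
    if distance < 1:
        raise ValueError("a mismatched pixel is at least one pixel from a match")
    stated = {1: 0.5, 2: 2.0}  # weights quoted explicitly in the text
    # Assumption: the n**2 rule is applied for n >= 3 (e.g. 3 -> 9)
    return stated.get(distance, float(distance ** 2))


def weighted_score(distance_counts):
    """Sum count * weight over a {distance: number_of_mismatched_pixels} mapping."""
    return sum(count * mismatch_weight(d) for d, count in distance_counts.items())


# 8 `a` pixels at distance 1, 8 `b` pixels at distance 2, 8 `c` pixels at distance 3
fig3_counts = {1: 8, 2: 8, 3: 8}

print(sum(fig3_counts.values()))    # unweighted mismatch count -> 24
print(weighted_score(fig3_counts))  # distance-weighted score -> 92.0
```

The same 24 mismatched pixels that score 24 under equal weighting score 92 here, with the eight most distant pixels contributing 72 of those points.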
The example of FIG. 2 with a distance-weighted correlation scheme is illustrated in FIG. 4, with the resulting mismatched score as given below:
______________________________________
Mask    Mismatched Pixels    Weight    Score
______________________________________
l       18                   × 0.5     =  9.0
I        3                   × 0.5     =  1.5
         3                   × 2       =  6.0
         3                   × 9       = 27.0
                             I total   = 34.5
______________________________________
Hence, the unknown character is properly identified as a lower case `l`, since the `l` mask now yields the lower score. Unfortunately, a distance-weighted correlation implementation is both costly and time-consuming, requiring a conversion from bit-level weighting having only two states to multiple-level weighting having many states.
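A minimal Python sketch of a distance-weighted correlator operating directly on pixel arrays is given below for illustration. It rests on several assumptions that the text does not settle: a "matching pixel" is taken to be one that is set in both the character and the mask, distance is measured as the larger of the row and column offsets (so diagonal neighbors count as one pixel away), and the $n^2$ rule is applied from three pixels upward. All function names are hypothetical.

```python
def mismatch_weight(distance):
    """Weight of a mismatch `distance` pixels from the nearest matching pixel."""
    stated = {1: 0.5, 2: 2.0}  # values quoted in the text
    # Assumption: the n**2 rule is read as applying for n >= 3
    return stated.get(distance, float(distance ** 2))


def distance_weighted_score(character, mask):
    """Sum distance-dependent weights over all mismatched pixels."""
    rows, cols = len(character), len(character[0])
    # Assumption: a "matching pixel" is one that is set in both images
    matches = [(r, c) for r in range(rows) for c in range(cols)
               if character[r][c] == 1 and mask[r][c] == 1]
    if not matches:
        raise ValueError("no pixel is set in both character and mask")
    score = 0.0
    for r in range(rows):
        for c in range(cols):
            if character[r][c] ^ mask[r][c]:  # mismatched pixel
                # Assumption: distance is the larger of the row and column offsets
                d = min(max(abs(r - mr), abs(c - mc)) for mr, mc in matches)
                score += mismatch_weight(d)
    return score


# A vertical stroke as the mask; the character adds a three-pixel horizontal spur
mask = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
character = [
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
]

print(distance_weighted_score(character, mask))  # 0.5 + 2.0 + 9.0 = 11.5
```

Under equal weighting the same pair would score 3. The sketch searches every matching pixel for every mismatch and accumulates multi-valued weights rather than single bits, which gives some sense of the extra work relative to a plain exclusive-OR count.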