Correlation is a technique well known to those skilled in the art of developing character recognition methods. The process of recognizing an unknown character using correlation comprises the following steps: (1) acquiring a two-dimensional array of pixels, (2) locating an unknown character in the two-dimensional array, (3) computing the correlations between the unknown character and every member of a trained set of characters (otherwise known as a font), and (4) recognizing the unknown character as the trained character with the highest associated correlation coefficient above a threshold.
The correlation between an unknown character and a trained character can be conveniently described using vector notation. Let the vector $\mathbf{y}$ denote the light values (relative scene reflectance, intensity, etc.) of the pixels of the unknown character to be recognized. That is, let
$$\mathbf{y} = [y_1, y_2, \ldots, y_N]^T \tag{1}$$
where $y_i$ denotes the light value of the $i$-th pixel of the unknown character and $(\cdot)^T$ denotes the transpose operator. In this representation there are $N$ pixels in the unknown character $\mathbf{y}$. That is, the two-dimensional array of pixels for the unknown character is represented as a one-dimensional array by concatenating rows (or columns) into a single vector.
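The row concatenation described above can be illustrated with a short Python sketch. The function name `flatten_rows` and the sample pixel values are hypothetical and chosen only for illustration.

```python
def flatten_rows(image):
    """Concatenate the rows of a 2-D pixel array into one 1-D list."""
    return [pixel for row in image for pixel in row]


# Hypothetical 3x3 character image (light values), rows listed top to bottom.
glyph = [
    [10, 200, 10],
    [10, 190, 10],
    [10, 210, 10],
]

y = flatten_rows(glyph)  # N = 9 pixels, row after row
```

Concatenating columns instead would work equally well, provided the unknown character and every trained character are flattened in the same order.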
In a similar manner, let $\mathbf{x}$ denote the vector of light values of a trained character from a font, i.e.,
$$\mathbf{x} = [x_1, x_2, \ldots, x_N]^T \tag{2}$$
where $x_i$ denotes the light value of the $i$-th pixel of the trained character $\mathbf{x}$. For simplicity, it is assumed that both the unknown character and the trained characters have the same number of pixels, $N$. If this is not the case, the two vectors can be made the same size by increasing or decreasing the size of the unknown character $\mathbf{y}$ to match that of the trained character $\mathbf{x}$, utilizing the surrounding pixels in the image.
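With these definitions, a normalized mean-corrected correlation (squared) $R^2_{xy}$ between the unknown character $\mathbf{y}$ and the trained character $\mathbf{x}$ can be written as
$$R^2_{xy} = \frac{(\mathbf{x}_c^T \mathbf{y}_c)^2}{(\mathbf{x}_c^T \mathbf{x}_c)(\mathbf{y}_c^T \mathbf{y}_c)} \tag{3}$$
where
$$\mathbf{x}_c = \mathbf{x} - \boldsymbol{\mu}_x \tag{3a}$$
$$\mathbf{y}_c = \mathbf{y} - \boldsymbol{\mu}_y \tag{3b}$$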
are the mean-corrected character vectors
$$\mathbf{x}_c = [x_{c,1}, x_{c,2}, \ldots, x_{c,N}]^T \tag{3c}$$
$$\mathbf{y}_c = [y_{c,1}, y_{c,2}, \ldots, y_{c,N}]^T \tag{3d}$$
and the $i$-th components of the mean vectors are given by
$$(\boldsymbol{\mu}_x)_i = \frac{1}{N}\sum_{j=1}^{N} x_j, \quad i = 1, 2, \ldots, N \tag{3e}$$
$$(\boldsymbol{\mu}_y)_i = \frac{1}{N}\sum_{j=1}^{N} y_j, \quad i = 1, 2, \ldots, N \tag{3f}$$
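As an illustration of these definitions, the following Python sketch computes the squared correlation as the squared inner product of the mean-corrected vectors divided by the product of their squared norms. The function name `correlation_squared` is hypothetical, and returning 0.0 for a perfectly flat character (zero variance, where the ratio is undefined) is an assumption of this sketch rather than something the text specifies.

```python
def correlation_squared(x, y):
    """Normalized, mean-corrected correlation (squared) of two pixel vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and have the same length")

    # Every component of each mean vector is the same scalar: the pixel mean.
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Mean-corrected character vectors x_c and y_c.
    xc = [xi - mean_x for xi in x]
    yc = [yi - mean_y for yi in y]

    sxy = sum(a * b for a, b in zip(xc, yc))  # x_c^T y_c
    sxx = sum(a * a for a in xc)              # x_c^T x_c
    syy = sum(b * b for b in yc)              # y_c^T y_c

    # Assumption: a flat character has no defined correlation; score it 0.
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)
```

Because the score is squared, a character and its polarity-inverted version (light and dark swapped) receive the same score.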
According to the above description, $R^2_{xy}$ is computed for all $M$ trained characters of the font $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_M\}$, and the unknown character $\mathbf{y}$ is identified as the trained character $\mathbf{x}_j$ that yields the highest correlation score among all the scores calculated.
An additional condition for a match is that the highest correlation score $(R^2_{xy})_{\max}$ exceed some predetermined threshold $(R^2_{xy})_{\text{thresh}}$. Otherwise, the unknown character does not match any of the trained characters.
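The exhaustive classification just described, including the threshold test, might be sketched in Python as follows. The names `classify` and `correlation_squared`, the dictionary representation of the font, the default threshold of 0.5, and the sample pixel values are all assumptions made for illustration.

```python
def correlation_squared(x, y):
    """Normalized, mean-corrected correlation (squared); 0.0 if either is flat."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy) if sxx and syy else 0.0


def classify(y, font, threshold=0.5):
    """Score y against every trained character; return (label, best score).

    The label is None when the best score does not exceed the threshold.
    """
    best_label, best_score = None, 0.0
    for label, x in font.items():
        score = correlation_squared(x, y)
        if score > best_score:
            best_label, best_score = label, score
    if best_label is None or best_score <= threshold:
        return None, best_score
    return best_label, best_score


# Hypothetical two-character font of 3x3 glyphs, flattened row by row.
font = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "-": [0, 0, 0, 1, 1, 1, 0, 0, 0],
}
unknown = [10, 200, 10, 10, 190, 10, 10, 210, 10]
label, score = classify(unknown, font)
```

Note that the loop visits all $M$ trained characters before a decision is made, so the cost grows linearly with the size of the font.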
The correlation (squared) as defined in equation (3) has several desirable properties. Namely, the correlation $R^2_{xy}$ in equation (3) is insensitive to variations in illumination level and character contrast. That is, doubling the intensity or contrast of the unknown character $\mathbf{y}$ does not affect the correlation score. This is a direct result of the normalization and mean-correction steps of equation (3) and can be easily proved (by those skilled in the art).
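The invariance can be verified in a few lines. As a sketch, suppose the unknown character is altered by a gain $a \neq 0$ and an offset $b$ (symbols introduced here for illustration only), with $\mathbf{1}$ denoting the all-ones vector and $R^2_{xy}$ taken as the squared inner product of the mean-corrected vectors divided by the product of their squared norms:

```latex
\begin{align*}
\mathbf{y}' &= a\,\mathbf{y} + b\,\mathbf{1}, \qquad a \neq 0 \\
\boldsymbol{\mu}_{y'} &= a\,\boldsymbol{\mu}_y + b\,\mathbf{1} \\
\mathbf{y}'_c &= \mathbf{y}' - \boldsymbol{\mu}_{y'} = a\,\mathbf{y}_c \\
R^2_{xy'} &= \frac{(\mathbf{x}_c^T\, a\,\mathbf{y}_c)^2}
                  {(\mathbf{x}_c^T \mathbf{x}_c)\,(a^2\, \mathbf{y}_c^T \mathbf{y}_c)}
           = \frac{a^2\,(\mathbf{x}_c^T \mathbf{y}_c)^2}
                  {a^2\,(\mathbf{x}_c^T \mathbf{x}_c)(\mathbf{y}_c^T \mathbf{y}_c)}
           = R^2_{xy}
\end{align*}
```

The offset $b$ is removed by the mean correction and the gain $a$ cancels in the normalization, so neither affects the score.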
Equation (3) shows that the correlation squared $R^2_{xy}$ is computed rather than the correlation $R_{xy}$. This is done as a mathematical convenience to avoid having to compute the square root. Note that $R^2_{xy}$ as given by equation (3) is bounded in the range of 0.00 to 1.00. In practice, the correlation (squared) is usually multiplied by 100 to provide a more convenient 0 to 100 scale.
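By substituting equations (3a) through (3f) into equation (3), $R^2_{xy}$ can be written in a computationally efficient form as
$$R^2_{xy} = \frac{\left[N\sum_{i=1}^{N} x_i y_i - \sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i\right]^2}{\left[N\sum_{i=1}^{N} x_i^2 - \left(\sum_{i=1}^{N} x_i\right)^2\right]\left[N\sum_{i=1}^{N} y_i^2 - \left(\sum_{i=1}^{N} y_i\right)^2\right]}$$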
In this expression, the summations involving only $x_i$ can be computed prior to inspection. Thus, only three summations need to be computed during an inspection: the cross-product term $\sum x_i y_i$, the sum $\sum y_i$, and the sum of squares $\sum y_i^2$. This results in a computationally fast algorithm.
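The split between work done before inspection and work done during inspection might look like the following Python sketch. It assumes the sum form $[N\sum x_i y_i - \sum x_i \sum y_i]^2 / ([N\sum x_i^2 - (\sum x_i)^2][N\sum y_i^2 - (\sum y_i)^2])$, which follows from expanding the mean-corrected inner products. The names `precompute_template` and `fast_correlation_squared` and the dictionary layout are hypothetical.

```python
def precompute_template(x):
    """Compute, once per trained character, the terms that depend only on x."""
    n = len(x)
    sum_x = sum(x)
    sum_xx = sum(v * v for v in x)
    return {
        "pixels": list(x),
        "n": n,
        "sum_x": sum_x,
        "denom_x": n * sum_xx - sum_x * sum_x,  # N*sum(x^2) - (sum x)^2
    }


def fast_correlation_squared(template, y):
    """Correlation (squared) using only three running sums over y."""
    n = template["n"]
    if len(y) != n:
        raise ValueError("unknown character must have the template's length")

    sum_y = sum_yy = sum_xy = 0
    for xi, yi in zip(template["pixels"], y):
        sum_y += yi        # sum of y
        sum_yy += yi * yi  # sum of y squared
        sum_xy += xi * yi  # cross-product term

    numerator = n * sum_xy - template["sum_x"] * sum_y
    denominator = template["denom_x"] * (n * sum_yy - sum_y * sum_y)
    # Assumption: a flat template or flat unknown character scores 0.
    if denominator == 0:
        return 0.0
    return (numerator * numerator) / denominator


template_i = precompute_template([0, 1, 0, 0, 1, 0, 0, 1, 0])
score = fast_correlation_squared(
    template_i, [10, 200, 10, 10, 190, 10, 10, 210, 10]
)
```

With integer pixel values, the sums stay exact until the final division, which avoids the rounding that explicit mean subtraction can introduce.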
The above describes the state of the art in performing character recognition using correlation of the light values of the pixels of an unknown character with the various members of a font. However, there is a deficiency associated with the above method. That is, the above method requires the computation of the correlation (squared) $R^2_{xy}$ for every trained character in the font $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_M\}$ before $\mathbf{y}$ can be classified. For a large font, the time required to compute all of the correlations may be prohibitive.
Thus, a character recognition method is desired that overcomes this deficiency associated with the state-of-the-art correlation method. That is, a character recognition method using correlation search is desired. Correlation search provides a means of classifying an unknown character $\mathbf{y}$ without having to compute the correlation of $\mathbf{y}$ with every trained character in the font $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_M\}$. For a large font, the time savings can be significant. Correlation search is described in this invention.