The present invention relates to image recognition systems and in particular to alphanumeric character image recognition systems.
Alphanumeric character image recognition systems can have numerous applications.
For example, in mail applications, where addresses written on envelopes, postcards or packages must be read so that each item can be sent to its assigned destination, such systems are useful in automating the reading and sending operations, reducing the cost of the labor presently employed to carry out these operations, and reducing service delays. These systems can also be used to recognize tax or census forms or the text of medical prescriptions.
Another application of such systems, which has recently taken on ever-growing importance, is tied to the miniaturization of computers. This process has run up against a limit set by the dimensions of the typical keyboard, which cannot be reduced below a certain size. To replace the keyboard, it has been proposed to use a small tablet connected to the computer, on which the user writes alphanumeric characters in sequence with a pen. A recognition system is then necessary to provide the interface with the computer.
It is known that, in general, an alphanumeric character image recognition system consists of three cascaded stages.
The first of these stages normalizes digital image signals to eliminate irregularities present in human writing. Aspects such as image size, character slant and defining line thickness are normally considered in this stage.
The second stage extracts, from the normalized digital image signals supplied by the first stage, the image information that the third stage then uses to classify the images to be recognized.
In the literature, there are different descriptions of alphanumeric character recognition systems. For example, to mention the better-known systems, AT&T uses systems based on the so-called “k nearest neighbour” algorithm described in Pattern Classification and Scene Analysis by R. O. Duda and P. E. Hart, N.Y.: John Wiley and Sons, 1973, or systems based on multilevel perceptrons. The latter are described in the article “Learning internal representations by error propagation” by D. E. Rumelhart, G. E. Hinton, R. J. Williams, published in Parallel Distributed Processing, D. E. Rumelhart, J. L. McClelland and the PDP Research Group, publ. MIT Press, Cambridge, Mass. pages 318-362, 1986.
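By way of illustration only, the “k nearest neighbour” rule mentioned above can be sketched as follows. This is a minimal sketch and not the implementation of any of the cited systems: the function name knn_classify, the use of Euclidean distance, the value k=3, and the toy four-element vectors are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_classify(sample, training_set, k=3):
    """Return the majority label among the k training vectors nearest to sample.

    training_set is a list of (feature_vector, label) pairs. Euclidean
    distance is an assumption of this sketch.
    """
    ranked = sorted(training_set, key=lambda pair: math.dist(sample, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: tiny images flattened to four-element vectors.
TRAINING = [
    ([1.0, 0.0, 1.0, 0.0], "1"),
    ([0.9, 0.1, 0.9, 0.0], "1"),
    ([0.0, 1.0, 0.0, 1.0], "7"),
    ([0.1, 0.9, 0.0, 1.0], "7"),
    ([0.2, 0.8, 0.1, 0.9], "7"),
]

predicted = knn_classify([0.95, 0.05, 1.0, 0.0], TRAINING, k=3)
```

In this toy case two of the three nearest training vectors carry the label "1", so the sample is assigned to that class.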
Systems based on the “k nearest neighbour” algorithm and those based on multilevel perceptrons are applied to digital image signals that are normalized in size and blurred with a Gaussian filter as described in the article “Hand-written Digit Recognition with a Back-Propagation Network” by Y. Le Cun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, published in Neural Information Processing Systems, D. Touretzky publ., Vol. 2, Morgan Kaufmann, 1990.
AEG uses a system comprising a first stage which performs normalization of image size, character slant, and line thickness defining the characters, and a second stage based on the “Karhunen Loeve Transform” described in the article “Karhunen Loeve feature extraction for neural hand-written character recognition” by P. J. Grother published in Proc. of Applications of Artificial Neural Network III, Orlando, SPIE, April 1992. A third stage included in the system is provided with a polynomial classifier that is known in the art.
IBM uses a system including a first and a second image information processing stage and a third stage provided by a multilevel perceptron.
In accordance with one aspect of the present invention an apparatus is provided for recognizing alphanumeric characters from first, second, and third signals carrying processed information from images of the characters. A first neural or classifier network includes a first input terminal that receives the first signal, a second input terminal that receives the second signal, and a plurality of output terminals. A second neural or classifier network includes a first input terminal that receives the second signal, a second input terminal that receives the third signal, and a plurality of output terminals. A third neural or classifier network includes a plurality of input terminals each coupled to one of the output terminals of either the first or second classifier networks, and a plurality of output terminals that carry statistical values corresponding to a predetermined classification of the images. The first, second, and third classifier networks carry out consecutive statistical operations on the processed information until the statistical values are generated.
In another aspect of the invention, a fourth neural or classifier network includes a first input terminal that receives the first signal, a second input terminal that receives the third signal, and a plurality of output terminals that are coupled to the input terminals of the third classifier network.
In still another aspect of the invention, the first, second, third, or fourth classifier networks may include a neural network that includes one or more levels of neurons. These neurons may have a sigmoidal activation function, i.e., may be sigmoidal neurons.
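The interconnection of the classifier networks described in the foregoing aspects can be illustrated by the following sketch. It is an assumption-laden example rather than the disclosed implementation: each network is reduced to a single level of sigmoidal neurons, the weights are random and untrained, and the names SigmoidLayer and build_recognizer, the hidden size of 8, and the ten output classes are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Sigmoidal activation function."""
    return 1.0 / (1.0 + math.exp(-x))


class SigmoidLayer:
    """One level of sigmoidal neurons with random (untrained) weights."""

    def __init__(self, n_inputs, n_outputs, rng):
        self.weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                        for _ in range(n_outputs)]
        self.biases = [rng.uniform(-1, 1) for _ in range(n_outputs)]

    def __call__(self, inputs):
        return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
                for row, b in zip(self.weights, self.biases)]


def build_recognizer(sizes, n_hidden=8, n_classes=10, seed=0):
    """Wire four networks as described: pairs of signals in, class scores out."""
    rng = random.Random(seed)
    n1, n2, n3 = sizes  # lengths of the first, second, and third signals
    net1 = SigmoidLayer(n1 + n2, n_hidden, rng)  # first and second signals
    net2 = SigmoidLayer(n2 + n3, n_hidden, rng)  # second and third signals
    net4 = SigmoidLayer(n1 + n3, n_hidden, rng)  # first and third signals
    net3 = SigmoidLayer(3 * n_hidden, n_classes, rng)  # combines their outputs

    def recognize(s1, s2, s3):
        combined = net1(s1 + s2) + net2(s2 + s3) + net4(s1 + s3)
        return net3(combined)

    return recognize


recognize = build_recognizer((4, 3, 5))
scores = recognize([0.1] * 4, [0.2] * 3, [0.3] * 5)
```

Each element of scores lies between 0 and 1 and stands in for one of the statistical values carried by the output terminals of the third network. With untrained weights the values are meaningless; the sketch shows only how the signals are routed.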
In yet another aspect of the invention, the first, second, and third signals may carry information including the position of the dark points of the images, the directions of the tangents to the edges of the images at the points that compose the edges, and the contours formed by the points that compose the edges.
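Two of the kinds of information named above, the positions of the dark points and the points that compose the edges, can be illustrated with the following sketch. It assumes a binary image stored as a list of rows in which 1 denotes a dark point, and it treats a dark point as an edge point when at least one of its four neighbours is light or lies outside the image. These conventions and the function names are assumptions made for the example. The directions of the tangents are not computed here.

```python
def dark_points(image):
    """Return the (row, column) positions of all dark points."""
    return [(r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value]


def edge_points(image):
    """Return dark points that touch a light point or the image border."""
    rows, cols = len(image), len(image[0])

    def is_light(r, c):
        # Positions outside the image are treated as light.
        return not (0 <= r < rows and 0 <= c < cols) or not image[r][c]

    neighbours = ((-1, 0), (1, 0), (0, -1), (0, 1))
    return [(r, c) for r, c in dark_points(image)
            if any(is_light(r + dr, c + dc) for dr, dc in neighbours)]


# Hypothetical 5x5 image containing a filled 3x3 square.
IMAGE = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

dark = dark_points(IMAGE)
edges = edge_points(IMAGE)
```

For this image every dark point except the central one at (2, 2) is an edge point.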
An advantage of one aspect of the present invention is an alphanumeric character image recognition system having improved recognition quality as compared to the systems known heretofore in the scientific and industrial environment.