1. Field of the Invention
The present invention relates to an image recognition device that recognizes images using a neural network constructed to simulate the human cerebral nervous system and its input/output operation.
2. Prior Art
Recognition devices using a neural network have been developed and made commercially available for various applications, one of the most common being optical character recognition (see "Optical neurochip based on a three-layered feed-forward model" by Ohta et al. in Optics Letters, Vol. 15, No. 23, pp. 1362-1363, Dec. 1, 1990, or "Handwritten kanji character recognition using a PDP model" by Mori, Yokozawa, and Umeda in Technical Research Reports of the Association of Electronic Communications Engineers, MBE87-156, pp. 407-414 (1988)).
FIG. 11 shows a block diagram of a prior art character recognition device that uses a neural network. The character input means 1 photoelectrically converts the image pattern of the character to character data, which is then output to the recognition means 2. This character data is a two-dimensional bit image such as that shown in FIG. 12. The recognition means 2 applies the neural network to recognize the character, and outputs the recognition result to the storage means 3 or to a display device 4.
The operation of the neural network used by the recognition means 2 is described below referring to FIG. 13. The character data 5 generated by the character input means 1 is first input to the corresponding neurons 7 comprising the input layer 6. Each neuron 7 then sends the character data it received to all of the neurons comprising the hidden layer 9 through pathways 8 called synapses. It is important to note that the character data sent over the synapses 8 is weighted before it is input to the neurons 10 of the hidden layer 9; this weighting value is called the "synapse weight." The neurons 10 of the hidden layer 9 calculate the sum of all input data, and output the result obtained by applying a non-linear function to this sum. These outputs, weighted with the synapse weights of the synapses 11, are input to all neurons 13 of the output layer 12. The neurons 13 of the output layer 12 similarly calculate the sum of all input data, apply a non-linear function to the sum, and output the result to the maximum value detection means 14. The maximum value detection means 14 obtains the maximum value of all inputs from the output layer 12 neurons, and outputs, as the recognition result, the character corresponding to the neuron that produced that maximum value to either the storage means 3 or the display device 4.
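The forward operation described above can be sketched as follows. This is a minimal illustration only, assuming a sigmoid as the non-linear function and NumPy matrices for the synapse weights; the function and variable names are illustrative and do not appear in the specification.

```python
import numpy as np

def forward(x, W_hidden, W_output):
    """Three-layer feed-forward pass as in FIG. 13: each layer computes the
    weighted sum of its inputs and applies a non-linear function."""
    sigmoid = lambda s: 1.0 / (1.0 + np.exp(-s))  # assumed non-linear function
    hidden = sigmoid(W_hidden @ x)      # hidden layer: weighted sum, then non-linearity
    return sigmoid(W_output @ hidden)   # output layer: same calculation

def recognize(x, W_hidden, W_output, labels):
    """Maximum value detection: return the character corresponding to the
    output-layer neuron that produced the maximum value."""
    outputs = forward(x, W_hidden, W_output)
    return labels[int(np.argmax(outputs))]
```

In use, `x` would be the flattened bit image produced by the character input means, and `labels` the list of characters assigned to the output neurons.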
The synapse weights used in the above process are determined by a learning process called back error propagation. Learning continues, correcting the synapse weights, until the desired output result is obtained. The neural network is trained on several different styles and fonts to further improve the recognition rate.
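One correction step of this learning process can be sketched as below. This is a generic back error propagation step under assumed choices (sigmoid non-linearity, squared-error measure, a fixed learning rate); the specification does not fix these details.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train_step(x, target, W_hidden, W_output, rate=0.5):
    """One back error propagation step: run the forward pass, compare the
    output with the desired result, and correct the synapse weights."""
    hidden = sigmoid(W_hidden @ x)
    output = sigmoid(W_output @ hidden)
    # Output-layer error signal, scaled by the sigmoid derivative
    delta_out = (target - output) * output * (1.0 - output)
    # Propagate the error back through the output synapses to the hidden layer
    delta_hid = (W_output.T @ delta_out) * hidden * (1.0 - hidden)
    # Correct the synapse weights in proportion to the propagated error
    W_output += rate * np.outer(delta_out, hidden)
    W_hidden += rate * np.outer(delta_hid, x)
    return float(np.sum((target - output) ** 2))  # squared error before update
```

Repeating `train_step` over the training patterns until the error is small enough corresponds to continuing the learning until the desired output result is obtained.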
Feature extraction is the important process whereby the features needed to recognize an image are obtained. The features are the reference for character recognition; the recognition performance of the neural network therefore depends on the quality of the features.
Conventionally applied neural networks simply use the mesh features (binary pattern) of the character data generated by the character input means. In essence, this process recognizes characters by the feature denoting which parts of the character data are black and which are white. If the shape and size of the characters differ from those used to train the neural network, however, as with handwritten characters, the recognition rate drops significantly. To improve the recognition rate, the neural network must be trained with several dozen different fonts and styles of characters. The learning process is therefore extremely time-consuming, and the recognition rate of the neural network is still less than 90% even when the application is limited to recognizing printed matter.
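Mesh features amount to nothing more than the flattened binary pattern of the bit image, as the following small illustration shows (the function name and the sample image are illustrative, not taken from the specification):

```python
def mesh_features(bit_image):
    """Mesh features: the flattened binary pattern of the character data,
    1 where the image is black and 0 where it is white (cf. FIG. 12)."""
    return [pixel for row in bit_image for pixel in row]

# A 3x3 bit image of a vertical stroke
image = [[0, 1, 0],
         [0, 1, 0],
         [0, 1, 0]]
features = mesh_features(image)  # -> [0, 1, 0, 0, 1, 0, 0, 1, 0]
```

Because such features encode only pixel positions, any shift, scaling, or style variation of the character changes the feature vector directly, which is why the recognition rate degrades for handwriting.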
Furthermore, extracting line components and other features requires that the input image be converted to an electrical image, stored in an image memory, and processed by a computer. Such processing is time-consuming, and the construction of the recognition system is complex. The same problems arise in the recognition of non-textual two-dimensional images.