1. Field of the Invention
The present invention relates to an image recognition system, and more particularly to an image recognition apparatus and method which recognize characters, identify the types and positions of objects, and so on, by extracting a previously defined "form" or "shape" from an input image.
2. Description of the Prior Art
One of the problems that have occurred so far when any automatic image recognition processing system is implemented on a computer or microprocessor is the large number of computations or logical and/or arithmetic operations to be performed for that processing. When image information is to be processed, it usually contains a great number of dots to be processed. A general-purpose processor that processes such image information in a sequential manner may be used for this purpose, but it would require a large amount of time for that processor to perform even a simple sequence of operations.
There is one approach that addresses the problem. This approach involves a dedicated image processing system that provides parallel processing functions for allowing every dot in the image information to be processed concurrently in a consistent manner.
It should be understood that the term "parallel processing" as referred to herein does not always mean a concurrent processing scheme for every dot in the image, but may mean a pipelined processing scheme that performs the same operations. In this parallel processing system, the operations on every dot in the image have priority over other procedures, and its control structure is specialized or simplified to enhance the processing efficiency. A dedicated image processor implements this conceptual architecture.
A dedicated image processing system has the following restrictions. When one dot in the image is currently being processed, any other dot that is adjacent to that dot must be processed independently of that dot. The image processing that occurs under those restrictions is called "local parallel processing".
In the local parallel processing system, the most primitive operations on the image include "erosion" and "dilation". In the "erosion" operation, a logical "AND" operation is performed on a given dot in a set of binary image information and every peripheral (neighbor) dot that is located adjacent to that dot. The output of the AND operation provides a new value for that dot. Suppose that there is a binary image that contains a view or picture, represented by a binary "1", and a background, represented by a binary "0". If those logical operations are then applied to the whole binary image, they will provide a view from which the peripheral areas have been removed. This is called the "erosion" operation. For the "dilation" operation, the logical "OR" operation is performed on the particular dot and its adjacent dots. This provides an enlarged peripheral area in the view or picture. When a sequence of the "erosion" and "dilation" operations is performed, the first "erosion" operation will remove any fine dots or lines in the image information, and the subsequent "dilation" operation will restore any large clusters nearly to their original states. As a result, the "erosion" and "dilation" sequence alone, performed on the local parallel processing system, is sufficient to separate the large clusters from the dots or lines in the image information and thereby to recognize them. These operations extend the usual logical operations to the two-dimensional neighbors, which is one of the features of the dedicated image processor.
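The "erosion" and "dilation" sequence described above can be sketched as follows. This is a minimal illustration in Python, not the implementation of any dedicated image processor. It assumes a 3x3 window (the dot and its 8 neighbors) and treats dots outside the image as background "0"; the helper names `erode`, `dilate` and `parse` are chosen here for illustration only.

```python
# Offsets of the centre dot and its 8 neighbours (an assumed 3x3 window).
WINDOW = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def _get(img, y, x):
    """Value of the dot at (y, x); dots outside the image count as background 0."""
    if 0 <= y < len(img) and 0 <= x < len(img[0]):
        return img[y][x]
    return 0


def erode(img):
    """AND of every dot with its neighbours: removes the periphery of the figure."""
    return [[int(all(_get(img, y + dy, x + dx) for dy, dx in WINDOW))
             for x in range(len(img[0]))] for y in range(len(img))]


def dilate(img):
    """OR of every dot with its neighbours: enlarges the periphery of the figure."""
    return [[int(any(_get(img, y + dy, x + dx) for dy, dx in WINDOW))
             for x in range(len(img[0]))] for y in range(len(img))]


def parse(rows):
    """Build a binary image from text rows: '#' is figure (1), '.' is background (0)."""
    return [[int(c == "#") for c in row] for row in rows]


# A 4x4 cluster and one isolated dot.
image = parse([
    "........",
    ".####...",
    ".####.#.",
    ".####...",
    ".####...",
    "........",
])

eroded = erode(image)    # the isolated dot vanishes, the cluster shrinks to 2x2
opened = dilate(eroded)  # the cluster grows back, the dot does not return
for row in opened:
    print("".join("#" if v else "." for v in row))
```

Each output dot depends only on the input image, so every dot could be computed concurrently; the nested loops here merely simulate that on a sequential machine.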
Another feature of the parallel processing scheme is the ability to process each dot and provide the corresponding result, independently of the other dots. Each individual result is consistent and highly reliable. For sequential image processing, when an image is processed to obtain some features, and if an error occurs for a dot during the processing, the error might affect the results for all dots including that dot. For parallel image processing, however, every dot is processed independently of the other dots so that, if an error occurs for a dot, that error will only affect the result for that dot, rather than the results for all dots. In this sense, parallel image processing can provide the image features with reliability.
Although the parallel image processing scheme has the features described above, it has restrictions on what it can process. It has been found that when the "erosion" and "dilation" operations are performed on a set of image information during parallel processing, it is impossible to distinguish the dots from the lines and recognize them as such. More specifically, during the "erosion" and "dilation" sequence, the dots as well as the lines will be erased when the first "erosion" is performed.
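This restriction can be illustrated with a small sketch. Under the same assumptions as before (a 3x3 window, background outside the image, illustrative names), an image containing only a one-dot-wide line and an image containing only an isolated dot both erode to the same empty image, so no later operation on the eroded result can tell them apart.

```python
def erode(img):
    """AND of every dot with its 8 neighbours; dots outside the image count as 0."""
    h, w = len(img), len(img[0])

    def at(y, x):
        return img[y][x] if 0 <= y < h and 0 <= x < w else 0

    return [[int(all(at(y + dy, x + dx)
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]


# A one-dot-wide horizontal line.
line_only = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

# A single isolated dot.
dot_only = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

blank = [[0] * 7 for _ in range(3)]

# Both figures are erased by the first erosion, leaving identical results.
print(erode(line_only) == blank, erode(dot_only) == blank)
```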
For example, suppose that only the line features are to be extracted from a set of image information. In this case, the prior art practice is that the dedicated image processor, which has been processing the image information, will have to pass control to an appropriate general-purpose processor, which will take over the subsequent processing tasks. It may be appreciated that the parallel processing functions are then used for only part of the total processing steps. Moreover, the data and the procedures used to process the dots will have to be transferred between the dedicated image processor and the general-purpose processor. This transfer cancels the speed advantage provided by the high-speed dedicated image processor. In order to make full use of the dedicated image processor, the range of parallel processing operations that can be performed on it must be extended so that its processing power can be exploited.
In this respect, the inventors of the present invention have developed an extended version of the "erosion" and "dilation" operations, which is called the "MAP" (Multi-angled Parallelism) technique (for additional information, refer to the publications "Directional Feature Field and Paralleled Operations for Binary Images: Twisting Operation for Arbitrarily Directional Propagation by 8-Neighbors", by Hiromitsu Yamada and Kazuhiko Yamamoto, Trans. IEICE Japan, Vol. J72-DII, No. 5, Pages 678-685, May 1989, and "Feature Extraction from Topographical maps by Directional Local Parallel Operations for Binary Images", by Shinji Matsui, Hiromitsu Yamada, Taiichi Saito, Shigeru Muraki and Kazuhiko Yamamoto, Technical Report of IEICE Japan, Vol. PRU88-76, Nov. 1988).
For the prior art "erosion" and "dilation" operations, a new value is computed from the information that represents a given dot and the peripheral dots that surround that dot, and the value for that dot is updated accordingly. All of the peripheral dots are handled concurrently in this processing, with no distinction made among them.
For MAP processing, the peripheral dots are not treated as uniform. Rather, the technique employs the concept of a "directional plane", which is defined for a particular set of image information and which corresponds to a particular dot and each of its neighboring dots, and it examines each neighboring dot by using this concept. The logical and arithmetic operations are also performed with respect to the direction of each dot. The extensions provided by the MAP technique allow the dots and the lines to be identified separately, whereas the prior art technique can only distinguish large clusters from the small dots and fine lines. In addition, the MAP technique allows parallel image processing operations to be used as a means of extracting a wider range of geometrical features from the image information.
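The idea of looking at each neighboring dot separately can be sketched as follows. This is only an illustration of the general concept and is not the MAP operations as defined in the cited publications. It assumes one plane per neighbor direction, in which a dot is set when both that dot and its neighbor in that direction are "1"; the names `DIRECTIONS`, `directional_planes` and `responding_directions` are hypothetical.

```python
# The 8 neighbour directions as (row offset, column offset); names are illustrative.
DIRECTIONS = {
    "E": (0, 1), "NE": (-1, 1), "N": (-1, 0), "NW": (-1, -1),
    "W": (0, -1), "SW": (1, -1), "S": (1, 0), "SE": (1, 1),
}


def directional_planes(img):
    """One binary plane per direction: dot AND its neighbour in that direction."""
    h, w = len(img), len(img[0])

    def at(y, x):
        return img[y][x] if 0 <= y < h and 0 <= x < w else 0

    return {
        name: [[img[y][x] & at(y + dy, x + dx) for x in range(w)]
               for y in range(h)]
        for name, (dy, dx) in DIRECTIONS.items()
    }


def responding_directions(planes, y, x):
    """Names of the directions whose plane is set at dot (y, x)."""
    return [name for name in DIRECTIONS if planes[name][y][x]]


# A short horizontal line (row 1) and an isolated dot (row 3).
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

planes = directional_planes(image)
print(responding_directions(planes, 1, 2))  # middle of the line
print(responding_directions(planes, 3, 5))  # isolated dot
```

In this sketch the middle dot of the line responds in the two opposite horizontal directions, while the isolated dot responds in none, so the two can be told apart without leaving the dot-by-dot parallel scheme. A uniform 3x3 erosion would erase both alike.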
It may be appreciated that the "erosion" and "dilation" operations, as well as the MAP operation, basically apply to the processing of binary image information, but those operations may be extended to the processing of multiple-value image information by replacing the AND and OR operations with the MIN and MAX numerical operations, respectively.
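The replacement of AND and OR by MIN and MAX can be sketched as follows. This is a minimal example that assumes a 3x3 window and, as a boundary convention chosen here, uses only the neighbors that lie inside the image; the function names are illustrative.

```python
def neighbourhood_op(img, op):
    """Apply op (min or max) over each dot and its in-image 3x3 neighbours."""
    h, w = len(img), len(img[0])
    return [[op(img[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)] for y in range(h)]


def gray_erode(img):
    """Multiple-value erosion: MIN takes the place of AND."""
    return neighbourhood_op(img, min)


def gray_dilate(img):
    """Multiple-value dilation: MAX takes the place of OR."""
    return neighbourhood_op(img, max)


# A bright dot (9) on a uniform grey background (5).
gray = [
    [5, 5, 5],
    [5, 9, 5],
    [5, 5, 5],
]

print(gray_erode(gray))   # the bright dot is flattened to the background level
print(gray_dilate(gray))  # the bright dot spreads to all of its neighbours
```

On an image that contains only the values 0 and 1, MIN and MAX give the same results as AND and OR, so the binary operations are a special case of this form.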
The extraction of directional features during visual information processing is important, as has already been verified by findings in the field of visual physiology, as well as by experimental attempts to apply those findings in character recognition and various image recognition schemes. The development of the MAP technique has made it possible for parallel processing to be used to extract such "lines", i.e., features that possess directions.
Here, again, when the high-speed parallel processing described above is considered from the aspect of extending what it can process, a demand arises for the extraction of higher-order features. The possible higher-order features may include "forms" or "shapes" that may be represented as a set of lines. If the features of each respective line, and any arbitrary "form" or "shape" defined as the relationship between those lines, can be extracted from a particular input image, a mechanism that allows any higher-order features to be extracted may be implemented.
When this is viewed in a different way, a technique for extracting such "forms" or "shapes" may become a technique for recognizing objects. Specifically, for optical character recognition, recognition of the types and locations of objects as viewed by robots, recognition of biological organisms and internal organs in medical images, and so on, the problem of extracting any previously defined "forms" or "shapes" from the particular input image information is the principal problem to be solved. At present, the existing technique for extracting such "forms" or "shapes" may be said to encompass an "image recognition" technique implemented by a machine.