The present invention relates to feature extraction systems for binary coded patterns, and more particularly to an extraction method for use with optical readers that read handwritten characters and the like, in which the features of a binary coded pattern stored in a two-dimensional memory by a reading unit are extracted by obtaining highly accurate feature quantities through accurate separation of the regions or segments.
While views differ on how the features of patterns should be defined, many studies on the recognition of characters and patterns have shown that the so-called quasi-topological features of a character or pattern, such as concavity, loops and connectivity, are very important for recognition. Many different methods have been proposed to date for extracting such quasi-topological features, and from the hardware point of view these methods can be roughly divided into the following three types of systems. The first type is a pattern contour tracing system, which may be called a serial system; the second type extracts the features of a pattern by, for example, raster scanning in accordance with the relation between the rows of the pattern; and the third type extracts the features by parallel processing of the whole pattern. Referring first to the third type, although there has been progress in LSI techniques, the practical application of this system entails an excessive cost. The first type, on the other hand, has been put to practical use, although it requires a rather long processing time. However, this system has the very serious disadvantage that it can be applied to ordinary patterns only on the condition that the object pattern has been separated beforehand into a plurality of segments for the tracing processing.
As a result, while this system presents no difficulty for characters written properly within the character frame or for printed characters, it is not suitable for closely connected characters or for ordinary patterns whose separation into segments is not necessarily easy. Moreover, for characters that are arranged within a fixed character frame but have complicated patterns, such as "Kanji" or Chinese characters, the system requires correspondingly complicated processing, which makes its application difficult.
Thus, the second type of system, which extracts the features of a character by raster scanning in accordance with, for example, the connection between the black conditions (character digital bits) or the white conditions (background digital bits) of successive rows of the digitized character pattern, may be considered promising. This system is based in principle on the concept that the features of two successive rows are extracted and only the relation between those two rows is considered. As a result, the hardware required for extracting the relation is considerably simpler than that of the third type of system employing parallel processing, and the overall hardware can be simplified further by, for example, using a microcomputer in combination to integrate the local features of the successive row pairs.
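By way of illustration only, the row-pair principle described above can be sketched in software as follows. The sketch is not taken from any particular prior system: the function names (black_runs, row_events, scan_events), the event labels and the 4-connectivity overlap rule are all assumptions made for this example. Each row is reduced to its runs of black bits, and only the overlap between the runs of two successive rows is examined.

```python
from typing import Dict, List, Tuple

Run = Tuple[int, int]  # (start, end) of a black run, end exclusive


def black_runs(row: List[int]) -> List[Run]:
    """Return the runs of consecutive black (1) bits in one row."""
    runs, start = [], None
    for x, bit in enumerate(row):
        if bit and start is None:
            start = x
        elif not bit and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(row)))
    return runs


def row_events(prev_row: List[int], curr_row: List[int]) -> Dict[str, int]:
    """Classify how the black runs of two successive rows connect."""
    prev, curr = black_runs(prev_row), black_runs(curr_row)

    def overlaps(a: Run, b: Run) -> bool:
        # Assumed 4-connectivity: runs must share at least one column.
        return a[0] < b[1] and b[0] < a[1]

    events = {"start": 0, "end": 0, "split": 0, "merge": 0}
    for c in curr:
        n = sum(overlaps(p, c) for p in prev)
        if n == 0:
            events["start"] += 1      # run with no predecessor
        elif n >= 2:
            events["merge"] += n - 1  # several upper runs join
    for p in prev:
        n = sum(overlaps(p, c) for c in curr)
        if n == 0:
            events["end"] += 1        # run with no successor
        elif n >= 2:
            events["split"] += n - 1  # one upper run branches
    return events


def scan_events(pattern: List[List[int]]) -> Dict[str, int]:
    """Raster-scan a pattern, looking at two successive rows at a time."""
    blank = [0] * len(pattern[0])
    padded = [blank] + list(pattern) + [blank]
    totals = {"start": 0, "end": 0, "split": 0, "merge": 0}
    for prev_row, curr_row in zip(padded, padded[1:]):
        for name, n in row_events(prev_row, curr_row).items():
            totals[name] += n
    return totals


# A "U"-like shape: two strokes that join at the bottom.
u_shape = [
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
print(scan_events(u_shape))
```

For the "U"-like shape, the scan reports two run starts, one merge and one end. In this sketch the merge is the local event that marks the bottom of the white region enclosed between the two strokes; integrating such events over the whole scan is the step that the text assigns to a microcomputer.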
Some studies have been made on the extraction of quasi-topological features in accordance with the above-mentioned connectivity between two successive rows, and the following disadvantages have been found to date. First, scanning in the rowwise direction alone cannot extract the concavities of a character that are open to the right or left, and thus it has been necessary to scan in the columnwise direction as well. It has also been impossible to extract L-shaped concavity features or weak concavity features, and the use of diagonal scanning has therefore been suggested. However, diagonal scanning not only results in awkward processing but is also still inadequate for extracting ordinary concavity features. In addition, as a matter of principle, this system is based on the integration of local features and thus has the disadvantage of being susceptible to local noise. Moreover, practically no attention has been paid to the fact that the so-called concavity features include many different types of concavities.
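The first of these disadvantages can be made concrete with a small sketch. It is an illustration under assumed conventions, not a reproduction of any prior system: the helper names (black_runs, branch_events) and the overlap rule are hypothetical. The sketch counts the places where the black runs of two successive lines split or merge, first scanning rowwise and then columnwise, for a shape whose concavity opens to the right.

```python
from itertools import groupby


def black_runs(line):
    """Return (start, end) spans of consecutive 1-bits; end is exclusive."""
    spans, pos = [], 0
    for bit, group in groupby(line):
        length = len(list(group))
        if bit:
            spans.append((pos, pos + length))
        pos += length
    return spans


def branch_events(pattern):
    """Count splits and merges of black runs between successive lines."""
    count = 0
    for prev, curr in zip(pattern, pattern[1:]):
        p_runs, c_runs = black_runs(prev), black_runs(curr)
        # Check both directions: prev -> curr (splits), curr -> prev (merges).
        for side_a, side_b in ((p_runs, c_runs), (c_runs, p_runs)):
            for run in side_a:
                touching = sum(
                    1 for other in side_b
                    if run[0] < other[1] and other[0] < run[1]
                )
                count += max(touching - 1, 0)
    return count


# A "C"-like shape whose concavity opens to the right.
c_shape = [
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
]
# Columnwise scanning is modelled here by transposing the pattern.
columns = [list(col) for col in zip(*c_shape)]

rowwise = branch_events(c_shape)
columnwise = branch_events(columns)
print(rowwise, columnwise)
```

In the rowwise scan every row contains a single black run connected to a single run in the next row, so no split or merge is seen and the concavity leaves no trace. In the columnwise scan the solid first column branches into two runs in the second column, so one event is counted. This is why a second scanning direction has been required in such systems.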