The present invention generally relates to a pattern recognition method for written, handwritten, typed or printed characters or graphics. The present invention is suitable for optical character recognition, image processing and the like.
Recently, pattern recognition methods for recognizing written, handwritten, typed or printed characters or graphics have been widely developed. For example, character recognition, which is an application of pattern recognition, is composed of the following three steps. The first step traces a contour of a binary image of an unknown character obtained by raster-scanning the character. The second step extracts features of the contour of the binary image. The features of the contour are generally expressed with 4- or 8-directional codes. The third step identifies the unknown character by comparing the extracted features of the contour with features of known characters.
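As an illustration of how contour features may be expressed with 8-directional codes, the following minimal sketch assigns a code to each step between consecutive contour points. The numbering convention (0 for a step to the right, increasing counter-clockwise), the (row, column) coordinate order and the names `DIRECTION_CODES` and `chain_code` are assumptions made for this example and are not taken from any cited publication.

```python
# 8-directional codes, keyed by the (row, col) offset of one step.
# Assumed convention: 0 = right, codes increase counter-clockwise;
# rows grow downward, so "up" is a row offset of -1.
DIRECTION_CODES = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def chain_code(contour):
    """Return the directional code of each step between consecutive points.

    `contour` is a list of (row, col) points in which every pair of
    consecutive points is 8-adjacent.
    """
    codes = []
    for (r0, c0), (r1, c1) in zip(contour, contour[1:]):
        codes.append(DIRECTION_CODES[(r1 - r0, c1 - c0)])
    return codes


# A closed contour around a 2 x 2 block of points.
square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
print(chain_code(square))
```

A 4-directional variant would keep only the four axis-aligned offsets.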
A conventional character recognition method having the above three steps is disclosed in the Japanese Laid-Open Patent Application No. 22994/1984, for example. The first step disclosed in the above publication is a step of sequentially moving a trace point. At this time, a next trace point subsequent to a current trace point (a trace point of interest) is designated by referring to all of the four or eight pixels adjacent to the point of interest. The combination of the adjacent pixels referred to is compared with reference pixel combinations, which are registered in the form of tables. Each reference table defines the trace point to be shifted to, depending on a combination of white and black pixels adjacent to a center pixel (corresponding to the pixel at the point of interest). Therefore, the next trace point is designated by the table whose pixel combination is the same as the combination of pixels adjacent to the pixel of interest.
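The table-driven designation of the next trace point may be sketched as follows, only to show why every adjacent pixel must be read at each step and why a table with one entry per pixel combination must be stored. The neighbor ordering, the names `NEIGHBORS`, `neighbor_pattern`, `build_table` and `next_trace_point`, and in particular the rule used to fill the table (the first black neighbor in clockwise order) are placeholders assumed for this example; the actual tables of the cited publication are not reproduced here.

```python
# Offsets of the 8 adjacent pixels as (row, col), clockwise from the pixel
# above the point of interest (an assumed ordering).
NEIGHBORS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def neighbor_pattern(image, r, c):
    """Pack the 8 adjacent pixels of (r, c) into one integer.

    Bit i is set when neighbor i is black (non-zero). Pixels outside the
    image are treated as white.
    """
    rows, cols = len(image), len(image[0])
    pattern = 0
    for i, (dr, dc) in enumerate(NEIGHBORS):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and image[nr][nc]:
            pattern |= 1 << i
    return pattern


def build_table():
    """Build a hypothetical 256-entry table, one entry per pixel combination.

    Placeholder rule: each entry holds the index of the first black neighbor
    in clockwise order, or None when all neighbors are white.
    """
    table = []
    for pattern in range(256):
        entry = next((i for i in range(8) if (pattern >> i) & 1), None)
        table.append(entry)
    return table


TABLE = build_table()


def next_trace_point(image, r, c):
    """Look up the next trace point from the combination of adjacent pixels."""
    entry = TABLE[neighbor_pattern(image, r, c)]
    if entry is None:
        return None
    dr, dc = NEIGHBORS[entry]
    return (r + dr, c + dc)


image = [
    [0, 0, 0],
    [0, 1, 1],
    [0, 0, 0],
]
print(next_trace_point(image, 1, 1))
```

With eight adjacent pixels the table already needs 256 entries, and all eight pixels are read for every trace point.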
The second step disclosed in the publication assigns to the pixel at the point of interest a directional code corresponding to the combination of white and black adjacent pixels. It should be noted that the combination of white and black pixels is also referred to at the second step. For this reason, the reference pixel combinations (tables) define not only the next trace point subsequent to the point of interest but also directional codes depending on the combinations of white and black pixels.
The third step in the publication divides the region surrounding the character to be recognized into M × N (M and N are integers) sub-regions and calculates a distribution of the directional codes (histogram) for each sub-region. Then, the distribution of the directional codes for each sub-region is compared with reference directional code distributions (reference histograms) of characters. Finally, a character of the reference histogram which has the shortest distance (difference) with respect to the histogram of the unknown input pattern is identified as the unknown character.
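The histogram comparison described above may be sketched as follows. The sketch assumes that the region is divided into M rows and N columns of equal size, that each contour pixel is given as a (row, column, code) triple, and that the distance is the sum of absolute differences; the publication's actual distance measure and the names `region_histograms`, `distance` and `identify` are assumptions made for this example.

```python
def region_histograms(coded_points, height, width, m, n, num_codes=8):
    """Count directional codes per sub-region of an m x n grid.

    `coded_points` is a list of (row, col, code) triples. The result is one
    flat list holding the m * n histograms one after another.
    """
    features = [0] * (m * n * num_codes)
    for r, c, code in coded_points:
        i = min(r * m // height, m - 1)   # sub-region row
        j = min(c * n // width, n - 1)    # sub-region column
        features[(i * n + j) * num_codes + code] += 1
    return features


def distance(a, b):
    """Sum of absolute differences between two histogram vectors (assumed)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def identify(unknown, references):
    """Return the label of the reference histogram closest to `unknown`."""
    return min(references, key=lambda label: distance(unknown, references[label]))


# Four coded contour pixels in a 4 x 4 region divided into 2 x 2 sub-regions.
points = [(0, 0, 0), (0, 3, 6), (3, 3, 4), (3, 0, 2)]
unknown = region_histograms(points, height=4, width=4, m=2, n=2)
references = {"A": list(unknown), "B": [0] * len(unknown)}
print(identify(unknown, references))
```

Sub-regions that contain no contour pixel still occupy `num_codes` zero entries in the feature vector and are still compared.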
However, the above conventional character recognition method has the following disadvantages.
First, the next trace point subsequent to the point of interest, i.e., the current trace point, is designated by referring to all of the four or eight adjacent pixels. For this reason, the quantity of data to be processed at the first step is enormous and thus an extremely long time is necessary to trace the contour of the character. Further, it is necessary to prepare a memory for storing the tables defining the reference pixel combinations used for designating the next trace point and the directional code of the pixel of interest.
Secondly, the feature or the directional code of the pixel of interest is also determined by referring to all of four or eight adjacent pixels. Therefore, the data quantity to be processed at the second step is also enormous.
Thirdly, the process for dividing the region surrounding the character to be processed into a plurality of sub-regions is very complex. This is because the division must be carried out for a two-dimensional (M × N) region. In addition, the directional codes do not necessarily exist in all of the sub-regions. This is frequently observed for relatively simple characters. This means that a wasteful process may be executed for sub-regions having no directional code.