The present invention relates to a method and an apparatus for isolating an area corresponding to a character or word in an optical character recognition (OCR) device, such as a character recognition reader or the like. The terms "cut down" and "isolate" as used herein mean segmenting the area corresponding to a character or word and extracting it.
Heretofore, an area corresponding to a character has been isolated in an optical character recognition device in the following manner:
First, an original manuscript image read by a scanner is scanned in the direction of the line (or column) of the character image in order to obtain a projection onto an axis perpendicular to the line of the character image, the projection being obtained by summing the number of black picture elements along each scan. An area in which the projection continues over a range wider than a certain predetermined width is isolated as one line. Next, the isolated line is scanned in a vertical direction in order to obtain a projection onto an axis parallel to the line of the character image, and an area in which this projection continues over a range wider than another predetermined width is isolated as one character in the same manner. However, if the original manuscript is considerably inclined, the projection onto the axis perpendicular to the line of the character image continues across a plurality of lines, so that the areas of the respective lines cannot be isolated correctly. As a result, the areas corresponding to the characters cannot be isolated correctly either. Although the problem has been described above with regard to character isolation, a problem also exists with regard to word isolation. These are the problems of the prior art.
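By way of illustration only, the conventional projection-based isolation described above may be sketched as follows. The sketch assumes a binary image given as a list of rows in which 1 denotes a black picture element and the lines of text run horizontally. The names `runs` and `isolate_characters` and the parameters `min_line_height` and `min_char_width` are hypothetical and stand for the "certain constant values" mentioned above. It is not taken from any particular device.

```python
def runs(profile, min_width=0):
    """Return (start, end) pairs, end exclusive, for each stretch where
    the projection stays non-zero over a range wider than min_width."""
    found = []
    start = None
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i  # a continuing projection begins here
        elif start is not None:
            if i - start > min_width:
                found.append((start, i))
            start = None
    # Close a stretch that reaches the end of the profile.
    if start is not None and len(profile) - start > min_width:
        found.append((start, len(profile)))
    return found


def isolate_characters(image, min_line_height=0, min_char_width=0):
    """Isolate lines first, then characters within each line.

    Returns a list of (top, bottom, left, right) boxes, with bottom
    and right exclusive.
    """
    if not image:
        return []
    width = len(image[0])
    # Projection onto the axis perpendicular to the lines:
    # the number of black picture elements in each row.
    row_profile = [sum(row) for row in image]
    boxes = []
    for top, bottom in runs(row_profile, min_line_height):
        # Projection onto the axis parallel to the line:
        # the number of black picture elements in each column of this line.
        col_profile = [
            sum(image[y][x] for y in range(top, bottom)) for x in range(width)
        ]
        for left, right in runs(col_profile, min_char_width):
            boxes.append((top, bottom, left, right))
    return boxes


# Two horizontal lines of text separated by one blank row.
image = [
    [1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1],
]
boxes = isolate_characters(image)
```

In this sketch the blank row between the two lines is what allows them to be separated. When the manuscript is inclined so that no row is entirely blank, `runs` returns a single stretch spanning several lines, which corresponds to the failure described above.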