The present invention relates to a method and circuit for segmentation of characters in which portions or cut-outs of a character sequence are continuously stored in a memory in the form of an image pattern matrix.
Devices and methods for automatic character recognition must be capable of isolating the individual segments associated with separate characters within a character sequence. Where the image of each character is continuous and is completely surrounded by white or non-printed regions, such character segmentation causes little difficulty, because the black-free columns present at the left and right of the character markings provide a sufficient indication of a character separation. This is known as "white column" segmentation.
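By way of illustration only, white column segmentation might be sketched as follows. The sketch assumes the image pattern matrix is held as a list of rows in which 1 denotes a black (printed) dot and 0 a white dot; that representation and the function name `white_column_segments` are assumptions made for this example and are not taken from the description.

```python
def white_column_segments(matrix):
    """Return (start, end) column ranges, end exclusive, one per character.

    Assumption: matrix is a list of equal-length rows, 1 = black, 0 = white.
    """
    if not matrix:
        return []
    width = len(matrix[0])
    # A column is black-free ("white") when no row has a black dot in it.
    has_black = [any(row[c] for row in matrix) for c in range(width)]

    segments = []
    start = None
    for c, black in enumerate(has_black):
        if black and start is None:
            start = c                      # first column of a character
        elif not black and start is not None:
            segments.append((start, c))    # a white column closes the character
            start = None
    if start is not None:
        segments.append((start, width))    # character touching the right edge
    return segments


pattern = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 1, 0],
]
print(white_column_segments(pattern))
```

In this sketch, every run of consecutive columns containing at least one black dot is treated as one character, which only holds in the ideal case described above.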
This ideal case, however, occurs relatively seldom in the scanning of general scripts, and for this reason other, significantly more complicated segmenting methods known to those skilled in the art must be employed. One such conventional segmenting method is known as "white path" segmentation, in which a search is made for a white path or channel which is not necessarily vertical, but which runs continuously from the top to the bottom of the character. The white path contains no printed areas or dots and is employed to determine the separation between neighboring characters.
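A white path search of this kind might be sketched, purely as an example, as a breadth-first search over the white dots of the image pattern matrix. The sketch assumes the same list-of-rows representation (1 = black, 0 = white) and assumes that the path may step downward, to the left, or to the right, but never upward or diagonally; these movement rules and the name `find_white_path` are assumptions for illustration and are not specified in the description.

```python
from collections import deque


def find_white_path(matrix):
    """Return a list of (row, col) white cells from the top row to the
    bottom row, or None when no such path exists.

    Assumption: matrix is a list of equal-length rows, 1 = black, 0 = white.
    """
    if not matrix or not matrix[0]:
        return None
    height, width = len(matrix), len(matrix[0])

    # Every white cell of the top row is a possible starting point.
    queue = deque((0, c) for c in range(width) if not matrix[0][c])
    parent = {cell: None for cell in queue}

    while queue:
        r, c = queue.popleft()
        if r == height - 1:
            # Bottom row reached: follow the parent links back to the top.
            path = []
            cell = (r, c)
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        # Assumed moves: down, left, right.
        for nr, nc in ((r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nc < width and nr < height
                    and not matrix[nr][nc] and (nr, nc) not in parent):
                parent[(nr, nc)] = (r, c)
                queue.append((nr, nc))
    return None


# No column of this cut-out is entirely white, yet a slanted white
# channel runs from the top row to the bottom row.
slanted = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [1, 0, 1, 1, 1],
]
print(find_white_path(slanted))
```

In practice the search would be restricted to the region between two suspected characters, since otherwise a white margin of the image would itself qualify as a path.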
Another method known to those skilled in the art is the so-called "comb" segmentation method, which can be used with scripts of fixed character width. This method searches for columns having a minimum of black dots or printed areas, and a column containing such a minimum is selected as the column separating adjacent characters.
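One possible reading of comb segmentation is sketched below as an example. It assumes that separation columns are expected at a fixed spacing and that, near each expected position, the column with the fewest black dots is chosen. The parameters `pitch`, `first_cut`, and `tolerance`, as well as the function name `comb_cut_columns`, are hypothetical and are introduced only for this sketch; the description does not specify how the search window is defined.

```python
def comb_cut_columns(matrix, pitch, first_cut, tolerance=1):
    """Return one separation column per expected character boundary.

    Assumptions (illustrative only): matrix is a list of equal-length
    rows with 1 = black and 0 = white; boundaries are expected at
    first_cut, first_cut + pitch, ...; each is searched within
    +/- tolerance columns.
    """
    if not matrix or pitch <= 0:
        return []
    width = len(matrix[0])
    # Number of black dots in each column.
    black_counts = [sum(row[c] for row in matrix) for c in range(width)]

    cuts = []
    expected = first_cut
    while expected < width:
        lo = max(0, expected - tolerance)
        hi = min(width - 1, expected + tolerance)
        # Choose the column with the fewest black dots inside the window;
        # on a tie, min() keeps the leftmost candidate.
        best = min(range(lo, hi + 1), key=lambda c: black_counts[c])
        cuts.append(best)
        expected += pitch
    return cuts


touching = [
    [1, 1, 1, 1, 1, 1, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 1, 1, 0, 0, 1],
]
print(comb_cut_columns(touching, pitch=4, first_cut=4))
```

Unlike the white column and white path methods, this sketch returns a separation column even where the chosen column still contains black dots, which is what allows touching characters of fixed width to be divided.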