In the field of character recognition, character segmentation is an important step in text image processing. After a text region is acquired from an image, the characters in the text region are segmented at their respective positions.
Conventional character segmenting methods include a projection segmenting method, a clustering method, and a template matching method. According to the projection segmenting method, an image is preprocessed to obtain a binary image, and regions where characters are located are determined by means of projection based on the binary image. According to the clustering method, which uses connected regions of characters, character blocks in the connected regions are merged based on a distribution feature of the characters in the whole page. The template matching method is mainly applied to specific fonts or specific characters and is not widely used.
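The projection segmenting method described above may be illustrated by the following minimal sketch. It assumes that the binary image is given as a list of rows in which 1 denotes a character pixel and 0 denotes background, and that characters are separated by at least one empty column. The function name `projection_segments` and the sample image are hypothetical and serve only for illustration.

```python
from typing import List, Tuple


def projection_segments(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, that contain character pixels."""
    if not binary:
        return []
    width = len(binary[0])
    # Vertical projection: number of character pixels in each column.
    profile = [sum(row[x] for row in binary) for x in range(width)]

    segments = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # a character region begins
        elif count == 0 and start is not None:
            segments.append((start, x))    # an empty column ends the region
            start = None
    if start is not None:
        segments.append((start, width))    # region touching the right border
    return segments


# Two upright characters separated by two empty columns (assumed sample data).
image = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 0],
]
segments = projection_segments(image)
print(segments)  # [(0, 2), (4, 6)]
```

In this sketch, each returned column range corresponds to one candidate character region; the approach relies on the projection profile dropping to zero between adjacent characters.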
With the above character segmenting methods, characters may be segmented to some extent. However, these methods are usually limited in practical applications. On the one hand, in the projection segmenting method, multiple characters may be segmented as a whole in a case where the characters are slanted, while the template matching method may be applied only to specific text environments, resulting in limited applicability.
On the other hand, for the clustering method in which the character segmentation is performed based on a connected region, the character segmentation cannot be implemented in a case where a stroke fracture phenomenon or a stroke cohesion phenomenon exists in an acquired character.
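The limitation of the connected-region approach may be illustrated by the following minimal sketch, which merely counts connected regions and does not perform the merging step of the clustering method. It assumes a binary image given as a list of rows, 8-connectivity between character pixels, and hypothetical sample images in which each vertical stroke stands for one character; the function name `count_components` is likewise hypothetical.

```python
from collections import deque
from typing import List


def count_components(binary: List[List[int]]) -> int:
    """Count 8-connected regions of character pixels (value 1)."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0 or seen[y][x]:
                continue
            # An unvisited character pixel starts a new connected region.
            count += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


# Two intact strokes: one connected region per character.
two_chars = [[1, 0, 0, 1] for _ in range(5)]

# Stroke fracture: the left stroke is broken at the middle row.
fractured = [row[:] for row in two_chars]
fractured[2][0] = 0

# Stroke cohesion: the two strokes touch at the middle row.
touching = [row[:] for row in two_chars]
touching[2] = [1, 1, 1, 1]

print(count_components(two_chars))  # 2
print(count_components(fractured))  # 3
print(count_components(touching))   # 1
```

Under these assumptions, the fractured image yields more connected regions than characters and the touching image yields fewer, so the connected regions no longer correspond one-to-one to the characters.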
Therefore, the above character segmenting methods have many problems. In particular, these methods are limited in practical applications, resulting in low versatility and accuracy.