1. Field of the Invention
The present invention relates generally to an apparatus and method for generating an image for character region extraction, and more particularly, to an apparatus and method for generating an image, by which a background image is extracted from a candidate character region and a correct character region is extracted by using an inverted image of the extracted background image.
2. Description of the Related Art
A character extraction method generally involves recognizing a character included in an image and extracting the recognized character region.
Since a character included in an image usually conveys important content information, recognizing such a character, for example in a moving image of a cellular phone, a name card image, or a signboard, is essential for understanding its meaning.
To detect a character included in an image, various methods have been proposed for separating a background and the character.
First, a background and a character may be separated using binarization. Binarization is generally classified into full-image binarization, which is performed on the full image, and local-image binarization, which is performed on a part of the full image. These methods have the advantage of separating a background and a character in a simple manner when both the background and the character are simple.
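The two binarization variants described above can be sketched as follows. This is a minimal illustration and not the method of any particular prior-art system: the function names `binarize_full` and `binarize_local`, the use of a mean value as the threshold, the `radius` parameter, and the assumption that character pixels are brighter than the background are all chosen here for illustration only.

```python
def binarize_full(image, threshold=None):
    """Full-image binarization: one threshold is applied to every pixel."""
    pixels = [p for row in image for p in row]
    if threshold is None:
        # Assumed rule for this sketch: use the mean of the whole image.
        threshold = sum(pixels) / len(pixels)
    return [[1 if p > threshold else 0 for p in row] for row in image]


def binarize_local(image, radius=1):
    """Local binarization: each pixel is compared with the mean of its window."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            # Collect the neighborhood, clipped at the image border.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            out_row.append(1 if image[y][x] > local_mean else 0)
        result.append(out_row)
    return result


# A simple grayscale image: dark background with a bright vertical stroke.
image = [
    [10, 10, 200],
    [10, 10, 200],
    [10, 10, 200],
]
full_map = binarize_full(image)
local_map = binarize_local(image)
```

On such a simple image both variants mark the same pixels as character pixels, which is consistent with the statement that binarization works well when the background and the character are simple.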
Second, color clustering may be used. Instead of individual maps, this method separates a background and a character according to Red/Green/Blue (RGB) colors: pixels having similar colors are clustered, and a map is generated for each color. This method can separate the character and the background even when characters have various colors.
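Color clustering of this kind can be sketched as follows. The sketch assumes a plain k-means procedure over RGB values with caller-supplied initial centers; the source does not state which clustering algorithm is used, and the names `nearest` and `cluster_colors` are hypothetical.

```python
def nearest(color, centers):
    """Return the index of the center closest to `color` (squared RGB distance)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((c - m) ** 2 for c, m in zip(color, centers[k])),
    )


def cluster_colors(image, centers, iterations=10):
    """Cluster pixels by RGB similarity and return one binary map per cluster."""
    centers = [tuple(float(v) for v in c) for c in centers]
    pixels = [p for row in image for p in row]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in pixels:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its members; empty clusters stay put.
        centers = [
            tuple(sum(ch) / len(g) for ch in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    labels = [[nearest(p, centers) for p in row] for row in image]
    # One map per color cluster: 1 where the pixel belongs to that cluster.
    maps = [
        [[1 if label == k else 0 for label in row] for row in labels]
        for k in range(len(centers))
    ]
    return centers, maps


# Reddish and bluish pixels, given as (R, G, B) tuples.
image = [
    [(250, 0, 0), (255, 5, 5), (0, 0, 250)],
    [(0, 5, 255), (245, 0, 0), (0, 0, 245)],
]
centers, maps = cluster_colors(image, centers=[(255, 0, 0), (0, 0, 255)])
```

Each entry of `maps` is a binary map for one color cluster, so a character drawn in one color and a background in another end up in different maps.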
Thus, a character in an image has conventionally been extracted using full-image binarization, local-image binarization, or color clustering.
However, the binarization scheme fails to properly separate a background and a character when an image is complex and characters have various colors.
Moreover, the color-clustering scheme cannot properly separate a background and a character when a character is divided into several regions by an external effect such as light. Furthermore, when analog information is digitized as shown in FIG. 1, the boundary between a character and a background is also separated due to aliasing 100 generated in the boundary region of the image, whereby the character stroke width is reduced compared to the original image. As a result, in later character recognition, a portion of a character region may be missed due to the stroke width reduction.
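The stroke width reduction can be illustrated with a one-dimensional sketch. The model below is an assumption made for illustration and is not taken from FIG. 1: a dark stroke on a white background is sampled so that each pixel value reflects how much of the pixel the stroke covers, and the sampled row is then binarized with a fixed threshold. The names `sample_stroke` and `stroke_width` and the numeric values are hypothetical.

```python
def sample_stroke(start, end, num_pixels):
    """Sample a dark stroke covering [start, end) on a white line of pixels.

    Pixel i covers [i, i + 1); its value is 255 when the stroke does not
    touch it and 0 when the stroke covers it completely.
    """
    row = []
    for i in range(num_pixels):
        coverage = max(0.0, min(end, i + 1) - max(start, i))
        row.append(round(255 * (1 - coverage)))
    return row


def stroke_width(row, threshold=128):
    """Count the pixels classified as stroke (darker than the threshold)."""
    return sum(1 for value in row if value < threshold)


true_width = 3.8                      # the stroke spans [2.6, 6.4)
row = sample_stroke(2.6, 6.4, 9)      # partially covered boundary pixels turn gray
measured = stroke_width(row)          # gray boundary pixels fall on the background side
```

In this sketch the two boundary pixels are only partly covered by the stroke, take an intermediate gray value, and are assigned to the background, so the measured stroke is narrower than the original one.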