1. Field of the Invention
The present invention relates generally to an apparatus and a method for extracting a text region for real time text recognition, and more particularly to an apparatus and a method for improving text recognition capability after automatically extracting a text region.
2. Description of the Related Art
Recently, due to the development of camera technology, a camera function has been incorporated in mobile communication terminals, such as PDAs (Personal Digital Assistants) and portable phones, making it possible to use the mobile communication terminal as an information acquisition device. With a camera provided in such a mobile communication terminal, an image containing any type of text information can be easily acquired in real time, and a text recognition result can be obtained by recognizing and processing one or more texts from the input image. Such a text recognition method differs from existing scanner-based text recognition methods in that text which cannot be input with a scanner, or which does not appear on a paper document, can be easily acquired. In contrast, the conventional method scans a paper document, recognizes the scanned file, and then converts the scanned file into a text file or an electronic document format.
The text recognition method using a camera as described above is one of the text recognition fields that have recently been actively researched, since the method can be used in a mobile communication terminal which is not only capable of acquiring and recognizing text regardless of the medium on which the text is recorded, but is also easily carried by a user. One example of such an application is a function for recognizing text, such as phone numbers and data, with a camera provided in a mobile communication terminal; this function is called OCR (Optical Character Recognition).
Now, a conventional text recognition procedure will be described with reference to FIG. 1. Referring to FIG. 1, if image photographing for text recognition is initiated in step 102, the input image is converted into an image useful for text recognition, e.g., a gray image so as to acquire gray data, or subjected to necessary processing in step 110. In step 120, image pre-processing, such as adaptive binarization, text slant correction, and separation of individual text, is executed. Then, in step 130, the individual text is subjected to normalization processing, thereby being converted into a predetermined size, and features, each of which is capable of representing one normalized text image, are extracted in step 140. Then, in step 150, the extracted features are compared with previously stored features for individual text, and the text having the most similar features is determined as the result of recognizing the input individual text. Then, post-processing for recognition is executed, which can correct or remove erroneously recognized or unrecognized text.
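For illustration only, the conventional procedure described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure of FIG. 1 itself: all function names are hypothetical, a global mean threshold stands in for adaptive binarization, slant correction and text separation are omitted, the flattened pixel grid serves as the feature vector, and matching is a simple count of differing positions.

```python
# Illustrative sketch only. Every name and algorithm choice below is an
# assumption made for this example, not part of the described procedure.

def to_gray(rgb_image):
    """Convert rows of (r, g, b) pixels to gray levels 0-255 (integer luma)."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in rgb_image]

def binarize(gray):
    """Global mean threshold (a simplification of adaptive binarization).
    A pixel darker than the mean is marked 1 (assumed to be text)."""
    pixels = [p for row in gray for p in row]
    threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray]

def normalize(binary, size=4):
    """Resample one character image to size x size by nearest-neighbor sampling."""
    h, w = len(binary), len(binary[0])
    return [[binary[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]

def extract_features(normalized):
    """Use the flattened pixel grid as the feature vector."""
    return [p for row in normalized for p in row]

def classify(features, templates):
    """Return the label whose stored features differ in the fewest positions."""
    def distance(a, b):
        return sum(x != y for x, y in zip(a, b))
    return min(templates, key=lambda label: distance(features, templates[label]))

def recognize(rgb_char_image, templates, size=4):
    """Run gray conversion, binarization, normalization, feature extraction,
    and matching on a single, already separated character image."""
    gray = to_gray(rgb_char_image)
    binary = binarize(gray)
    return classify(extract_features(normalize(binary, size)), templates)
```

Here `templates` is assumed to be a dictionary mapping each character label to a previously stored feature vector of length `size * size`. Note that the binarization step assumes dark text on a bright background, which is the same assumption that limits the conventional techniques.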
A text recognition method using a camera has advantages in that a text image to be recognized can be easily input, and in that the result of recognizing the input text can be displayed in real time. However, since such a text recognition method is largely affected by ambient illumination, unlike a text recognition method using a scanner, various pre-processing and post-processing functions must be considered as important elements. In addition, camera-based text recognition has a problem in that stable recognition capability cannot be assured, due to the diversity of the types of text to be input and recognized under unrestricted environments.
Moreover, conventional techniques, such as business card recognition or other text recognition, are typically used when text and the background thereof are relatively distinctly discriminated, the pattern of the background is simple, and the color of the text is darker than that of the background. However, the text images to be recognized in practice may be very diverse. In particular, the background is frequently not uniform, or the color of the text is brighter than that of the background. For example, text images of Chinese sign boards will often consist of a red background and yellow text, and street signs or construction notice signs will often use bright text on a dark background. For text images inverted in this way in terms of light and darkness of colors with respect to the background, it may be difficult to properly execute text recognition. For convenience of description, such an inverted relationship between the text image and the background in terms of light and darkness is referred to as “inverted” herein. Accordingly, in order to quickly determine whether photographed text images are inverted, and then to execute inversion-processing as desired, what is needed is a technique different from existing methods used for text recognition. Inversion-processing is a measure for changing the color of text to be darker than that of the background thereof and is used as such herein. That is, what is needed is an image pre-processing technique capable of recognizing and processing a text image having text surrounded by a non-uniform background and a text image having text brighter than the background.
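As a purely hypothetical illustration of what determining inversion and executing inversion-processing could look like, the following Python sketch uses a simple heuristic that is an assumption of this example and is not taken from the present description: the border of the image is assumed to consist mostly of background, so a border darker than the image as a whole suggests bright text on a dark background. All function names are hypothetical.

```python
# Hypothetical heuristic for illustration only; the border-based test is an
# assumption of this sketch, not a technique stated in the description.

def border_mean(gray):
    """Mean gray level of the outermost rows and columns (assumed background)."""
    h, w = len(gray), len(gray[0])
    border = [gray[y][x] for y in range(h) for x in range(w)
              if y in (0, h - 1) or x in (0, w - 1)]
    return sum(border) / len(border)

def is_inverted(gray):
    """True if the assumed background (border) is darker than the image mean."""
    pixels = [p for row in gray for p in row]
    return border_mean(gray) < sum(pixels) / len(pixels)

def invert(gray):
    """Inversion-processing: flip gray levels so that bright text becomes dark."""
    return [[255 - p for p in row] for row in gray]

def ensure_dark_text(gray):
    """Return an image in which the text is darker than the background."""
    return invert(gray) if is_inverted(gray) else gray
```

Such a check would not handle a non-uniform background or text that touches the image border, which is part of why a more capable pre-processing technique is needed.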
Therefore, in order to improve recognition capability for text included in a camera input text image, what is needed are a pre-processing function adaptive to the camera characteristics and the photographing environment, and a post-processing function capable of confirming whether recognized text has been correctly recognized and of correcting incorrectly recognized text.