1. Field of the Invention
The present invention relates to a technique for extracting a character area from a captured image.
2. Description of the Related Art
By capturing an image of characters printed on a commodity or product with an image acquisition device, for example, a two-dimensional image acquisition device using a CCD, CMOS sensor or the like, and performing a character recognizing process in an image processing apparatus, the recognition of the print can be automated.
To perform the character recognizing process with high precision, a character extracting process as a pre-process of the character recognizing process is important in the image processing apparatus.
The character extracting process is a process of determining a character area included in a captured image. In a case where a captured image includes a character string made of a plurality of characters, the character area corresponding to each character has to be determined from the character string.
One method of extracting character areas from a character string utilizes projection data of an image. Specifically, waveform data obtained by integrating pixel values of a captured image in an extraction direction is generated and analyzed. The method relies on the fact that the pixel integration value of a character part is larger than that of a background part (in a case where the characters are black, the image can be inverted so that the pixel integration value of the character part becomes large), and an area in which the pixel integration value exceeds a predetermined threshold is recognized as a character area.
FIG. 18 is a diagram showing an image 90 of a medium on which characters “T60” are printed and waveform data 91 generated from the image 90. The waveform data 91 are data obtained by integrating pixel values in a character extracting direction Y at each coordinate position in a character string direction X of the image 90. For easier explanation, the waveform data 91 and the image 90 including the characters “T60” are shown so as to be aligned in the character string direction X. It is understood from the figure that the pixel integration values of the character portions are larger than the pixel integration values of the background part. Therefore, by setting a threshold 92 as shown in FIG. 18, it is possible to determine that areas having pixel integration values above the threshold 92 are areas corresponding to characters to be extracted from the character string.
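The projection and threshold steps described above can be sketched as follows. This is a minimal illustration only, not the implementation of any particular apparatus: the function names `projection_profile` and `extract_character_areas`, the toy image, and the threshold value are all hypothetical, and the image is assumed to hold bright character pixels on a dark background (a dark-on-light image would be inverted first).

```python
def projection_profile(image):
    """Integrate pixel values along direction Y for every position in direction X.

    `image` is a list of rows; each row is a list of pixel values (assumed layout).
    """
    height = len(image)
    width = len(image[0])
    return [sum(image[y][x] for y in range(height)) for x in range(width)]


def extract_character_areas(profile, threshold):
    """Return (start, end) ranges in X, end exclusive, where the profile exceeds the threshold."""
    areas = []
    start = None
    for x, value in enumerate(profile):
        if value > threshold and start is None:
            start = x                      # entering a character area
        elif value <= threshold and start is not None:
            areas.append((start, x))       # leaving a character area
            start = None
    if start is not None:                  # area runs to the right edge
        areas.append((start, len(profile)))
    return areas


# Hypothetical 4x9 toy image: 1 = character pixel, 0 = background.
image = [
    [0, 1, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 0, 1, 1, 0],
]
profile = projection_profile(image)                    # [0, 1, 4, 1, 0, 0, 4, 3, 0]
areas = extract_character_areas(profile, threshold=0)  # [(1, 4), (6, 8)]
```

In this toy data the first shape has a thin horizontal stroke whose columns integrate to only 1, so the result depends strongly on the threshold: with `threshold=1` the first area shrinks to `(2, 3)`.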
When characters to be extracted in the acquired image have clear contrast with the background of the image and there is no pattern or dirt in the background, it is relatively easy to extract the character area precisely as described with FIG. 18.
However, when characters to be extracted in the acquired image have unclear contrast with the background of the image, or there is dirt or a rough surface that appears as random noise in the background, it is more difficult to extract the character area.
FIG. 19 shows an image 93 of a medium on which characters “T60” are printed and waveform data 94 generated from the image 93. The characters “T60” of the image 93 have low contrast against the background, and the background has random noise. In such a case, when a threshold 95 is set in the same manner as the threshold 92 set for the waveform shown in FIG. 18, the character area may be recognized incorrectly, as shown in FIG. 19. For example, the areas corresponding to the transverse line part of the “T” character and the center area of the “0” character have relatively small pixel integration values as compared with the other areas of each character, and differ only slightly from the pixel integration values of the background, so that an appropriate character area may not be recognized.
Japanese Patent Publication No. 2,872,768 discloses a method of setting a search start point and a search end point in an image, integrating pixel values of pixels on a path connecting the start and end points, and finding the path for which the integration value is the minimum. According to the method, although a character area can be extracted accurately, a search start point, a search end point, and a search area connecting the points have to be set in advance. That is, the method can be executed only on the condition that a character boundary area can be predicted to some extent.
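The idea of finding a path of minimum integration value between preset points can be illustrated with a small dynamic-programming sketch. It is not taken from the cited publication: the function name `min_integration_path`, the rule that the path moves down one row at a time while shifting at most one column, and the rectangular search area given by `x_min` and `x_max` are all assumptions made for illustration. The sketch also shows why the start point, end point, and search area must be supplied beforehand.

```python
def min_integration_path(image, start_x, end_x, x_min, x_max):
    """Find a top-to-bottom path with the smallest sum of pixel values.

    Assumed constraints (hypothetical): the path starts at column start_x in
    the first row, ends at column end_x in the last row, moves down one row
    per step with a column shift of -1, 0, or +1, and stays within columns
    x_min..x_max inclusive. Returns (list of columns per row, total), or
    (None, inf) if the end point cannot be reached.
    """
    height = len(image)
    width = x_max - x_min + 1
    inf = float("inf")
    # cost[y][i]: smallest integration value of a path reaching column x_min + i in row y
    cost = [[inf] * width for _ in range(height)]
    prev = [[None] * width for _ in range(height)]
    cost[0][start_x - x_min] = image[0][start_x]

    for y in range(1, height):
        for i in range(width):
            for j in (i - 1, i, i + 1):        # candidate predecessors in the row above
                if 0 <= j < width:
                    candidate = cost[y - 1][j] + image[y][x_min + i]
                    if candidate < cost[y][i]:
                        cost[y][i] = candidate
                        prev[y][i] = j

    i = end_x - x_min
    total = cost[height - 1][i]
    if total == inf:
        return None, inf

    # Walk the stored predecessors back from the end point to the start point.
    path = []
    for y in range(height - 1, -1, -1):
        path.append(x_min + i)
        i = prev[y][i]
    path.reverse()
    return path, total


# Hypothetical image with a slanted low-valued gap between two bright regions.
image = [
    [9, 0, 5, 9],
    [9, 5, 0, 9],
    [9, 9, 0, 9],
]
path, total = min_integration_path(image, start_x=1, end_x=2, x_min=0, x_max=3)
# path == [1, 2, 2], total == 0
```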