1. Technical Field
The present invention relates to a technique of processing image data, such as a still image or a video, to detect a character in the image, and particularly relates to a technique for improving character detection accuracy in a case where the image has a complex background.
2. Related Art
There have been many conventional techniques of detecting a specific character string (keyword) in an image (a still image or a video image). For example, Patent Documents 1 to 3 each disclose a technique of cutting out all character regions in an image, applying character recognition processing to each of the cut-out regions to convert it to text data, and then judging whether or not each recognized string is the keyword to be detected.
However, in order to judge whether or not a character string is the detection target, it may be necessary to apply the recognition processing to all the characters cut out of the image. This leads to an increase in processing time.
For example, if a recognition target character is in Japanese or Chinese, these languages each include a large number of characters (at least 3000 Chinese characters in the first level alone, and at least 6000 Chinese characters across the first and second levels). In order to execute character recognition processing in these languages, a comparing processing against at least 3000 to 6000 characters is therefore required. The character recognition processing thus becomes high-load processing requiring significant time. A comparing processing with the keyword is further executed on all the recognized character strings, thereby further increasing the processing time.
The increase in processing time is more problematic in the case of processing a video image, where real-time performance is required, than in the case of processing a still image.
When cutting out a character string, it is generally exploited that pixel values change sharply at the boundary between a character and the background. Edges are thus extracted with use of a Sobel filter or a Laplacian filter, and such a portion is extracted as a character string region. If there is a complex background, however, an edge is also extracted from a portion of the background where pixel values change sharply even though there is no character there. This may lead to incorrect detection of the background as a character string region despite the fact that there is no target character string, resulting in low detection accuracy.
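The edge-based extraction described above can be sketched as follows. This is a minimal illustrative example, not taken from any of the cited documents: a Sobel gradient magnitude is computed over a grayscale image, and large magnitudes mark candidate character boundaries. The image, kernel names, and test pattern are assumptions for illustration.

```python
# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(img):
    """Return the Sobel gradient magnitude of a 2-D grayscale image (list of lists)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * img[y - 1 + j][x - 1 + i]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * img[y - 1 + j][x - 1 + i]
                     for j in range(3) for i in range(3))
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag

# A dark one-pixel vertical "stroke" on a bright background:
# the magnitude is large next to the stroke and zero in flat regions.
img = [[255] * 7 for _ in range(5)]
for y in range(5):
    img[y][3] = 0
edges = sobel_magnitude(img)
```

As the text notes, the same large magnitudes would also appear at any sharp transition in a complex background, which is why edge strength alone cannot distinguish characters from background clutter.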
According to the technique disclosed in Patent Document 2, for example, in order to detect a telop in a television picture, a character string in the picture (that is, a telop) is detected with use of a feature of the telop. More specifically, with use of the feature that a telop mostly has a fixed color and density (typically white) and stays in the same position for a fixed time, a pixel satisfying the feature is extracted as a character string candidate. If the character string of the keyword to be detected does not satisfy this condition of a telop, however, it is impossible to detect the detection target character string.
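The telop feature described above, namely a pixel that stays near a fixed (here, white) color at the same position across several frames, can be sketched as follows. This is a hypothetical illustration of the idea, not the actual method of Patent Document 2; the frame format, tolerance, and frame count are assumptions.

```python
def telop_candidates(frames, target=255, color_tol=10, stable_frames=3):
    """Mark pixels whose value stays within color_tol of target
    across the last stable_frames grayscale frames."""
    recent = frames[-stable_frames:]
    h, w = len(recent[0]), len(recent[0][0])
    return [[all(abs(f[y][x] - target) <= color_tol for f in recent)
             for x in range(w)] for y in range(h)]

# Three 1x3 frames: pixel (0, 1) stays near-white in every frame,
# pixel (0, 0) flickers, and pixel (0, 2) is dark throughout.
frames = [
    [[255, 255, 40]],
    [[30, 252, 40]],
    [[255, 250, 40]],
]
mask = telop_candidates(frames)
```

A keyword rendered in a moving or non-white character string fails both tests, which is exactly the limitation the text points out.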
Patent Document 4 discloses a technique of initially specifying a region of a road sign or a signboard in an image based on a feature value such as intensity or roundness, extracting a character string region in the specified signboard region, and comparing it with preliminarily prepared dictionary data, to recognize the characters on the signboard.
In the technique described in Patent Document 4, the candidate region including the character string is narrowed to some extent by specifying the signboard region, thereby improving the efficiency of the character string detection processing. However, similarly to Patent Documents 1 to 3, a comparing processing against at least 3000 to 6000 characters is still required, so the processing time still increases.
The technique described in Patent Document 4 relates to character string detection on a signboard, which is assumed to have strong contrast between the background color and the character color. In terms of detection accuracy, the technique described in Patent Document 4 is therefore not applicable to detection of a character string against a complex background. If the character string of the keyword to be detected is included in a region not satisfying the feature of a signboard, it is accordingly impossible to detect the detection target character string.
In contrast to the above techniques, Patent Documents 5 and 6 each disclose a technique of detecting a target character string by comparing images of character regions. More specifically, the font glyph of each character composing a specific keyword is first read out individually and drawn, to generate a character string image corresponding to the keyword. A similar-image retrieval is subsequently conducted in the image, with the character string image as a key, so as to detect the keyword in the image.
According to the techniques described in Patent Documents 5 and 6, the character string is detected by comparing processing between images. There is no need to apply character recognition processing to all the character regions in the image, thereby leading to a reduction in processing time in comparison with the techniques described in Patent Documents 1 to 4. Furthermore, since the detection target character string is detected through matching processing between images, it is possible to adjust an allowable range of noise in the background by setting, for example, a threshold on the concordance rate. This tolerates background noise to some extent and avoids defects such as completely failing to detect the detection target character string.
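The image-matching detection with a tunable concordance-rate threshold can be sketched as follows. This is an illustrative simplification, not the actual method of Patent Documents 5 and 6: a rendered keyword image (template) is slid over a binary image, and a match is reported wherever the fraction of agreeing pixels meets the threshold. All function names and the binary representation are assumptions.

```python
def concordance_rate(window, template):
    """Fraction of pixels at which the window agrees with the template."""
    total = len(template) * len(template[0])
    hits = sum(window[y][x] == template[y][x]
               for y in range(len(template)) for x in range(len(template[0])))
    return hits / total

def find_keyword(image, template, threshold=0.9):
    """Return (x, y) positions where the template matches above the threshold."""
    th, tw = len(template), len(template[0])
    matches = []
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            window = [row[x:x + tw] for row in image[y:y + th]]
            if concordance_rate(window, template) >= threshold:
                matches.append((x, y))
    return matches

# A 2x4 binary image containing the 2x2 template at x=1.
template = [[1, 0],
            [0, 1]]
image = [[0, 1, 0, 0],
         [0, 0, 1, 0]]
```

Lowering the threshold lets a match survive a flipped (noisy) pixel, which corresponds to the adjustable noise tolerance mentioned above; raising it rejects such a match, trading recall for precision.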
For example, the corner detection technique or the outline detection technique described in Non-Patent Document 1 can be used as the technique of detecting a feature value of a character in an image for use in the comparing processing between images.

Patent Document 1: Japanese Unexamined Patent Publication JP 08-205043 A (published on Aug. 9, 1996)

Patent Document 2: Japanese Unexamined Patent Publication JP 2006-134156 A (published on May 25, 2006)

Patent Document 3: Japanese Unexamined Patent Publication JP 2008-131413 A (published on Jun. 5, 2008)

Patent Document 4: Japanese Unexamined Patent Publication JP 2008-287735 A (published on Nov. 27, 2008)

Patent Document 5: Japanese Unexamined Patent Publication JP 10-191190 A (published on Jul. 21, 1998)

Patent Document 6: Japanese Unexamined Patent Publication JP 2008-004116 A (published on Jan. 10, 2008)

Non-Patent Document 1: Masatoshi OKUTOMI et al., "Digital Image Processing", CG-ARTS Society Publisher, Mar. 1, 2007 (Second Impression of the Second Edition), pp. 208-210, 12-2 "Feature point detection"