Field of the Invention
One disclosed aspect of the embodiments relates to character recognition processing.
Description of the Related Art
In conventional character recognition processing performed on a document image acquired by scanning a sheet document, the outlines (contours) of characters are detected from the document image, a character image is cut out for each character, and character recognition processing is performed on each cut-out character image to identify the character. Characters are not correctly recognized if their cutting positions are inappropriate, so techniques for correcting the cutting positions according to a user instruction have been provided. For example, in one technique, in a case where a single character has been cut as a plurality of characters (e.g., a case where a single character is cut as two characters due to a blurred portion of the character image, or a case where a single Chinese character is divided into its left and right radicals and cut as two characters), the plurality of characters is corrected into the single character. Further, Japanese Patent Application Laid-Open No. 11-143983 discusses another technique in which, if a user corrects a character recognition result, the uncorrected portions are searched for a portion that has been incorrectly recognized in a similar way, and a similar correction is applied to the portion found.
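The conventional flow described above can be illustrated with a minimal sketch. It is not taken from any cited technique: it assumes a binarized image given as rows of text ("#" for ink, "." for background), and it substitutes 8-connected region labeling for outline detection. The names cut_characters, merge_boxes, and IMAGE are hypothetical. The sample image contains a letter "H" whose crossbar is blurred away in the middle, followed by a letter "I", so the "H" is cut as two characters until a correction merges them.

```python
from collections import deque

# Hypothetical binarized image: "#" is ink, "." is background.
# The "H" on the left has a gap in its crossbar (a blurred portion).
IMAGE = [
    "#...#..#",
    "#...#..#",
    "##.##..#",
    "#...#..#",
    "#...#..#",
]


def cut_characters(bitmap):
    """Return bounding boxes (left, top, right, bottom) of 8-connected
    ink regions, sorted from left to right."""
    rows = len(bitmap)
    cols = len(bitmap[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] != "#" or seen[r][c]:
                continue
            # Breadth-first search over one connected ink region.
            queue = deque([(r, c)])
            seen[r][c] = True
            left, top, right, bottom = c, r, c, r
            while queue:
                y, x = queue.popleft()
                left, right = min(left, x), max(right, x)
                top, bottom = min(top, y), max(bottom, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and bitmap[ny][nx] == "#"
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)


def merge_boxes(boxes, first, last):
    """Merge boxes[first..last] into one box, as a user correction of
    an over-split character would."""
    group = boxes[first:last + 1]
    merged = (
        min(b[0] for b in group),
        min(b[1] for b in group),
        max(b[2] for b in group),
        max(b[3] for b in group),
    )
    return boxes[:first] + [merged] + boxes[last + 1:]


cut = cut_characters(IMAGE)
corrected = merge_boxes(cut, 0, 1)
print(cut)        # three boxes: the "H" is split into two pieces
print(corrected)  # two boxes after the manual merge
```

In this sketch the merge is driven by explicit indices, standing in for the user instruction; a cutting step that over-splits many characters would require one such correction per affected character.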
Further, the increasing use of smartphones, digital cameras, and other devices in recent years has made it easy to acquire image information that includes character information. This has led to the development of a large market for acquiring character information by character recognition processing in a variety of measurement environments. For example, there is a use case in which serial numbers engraved on the tires of dump trucks are used to manage the tires at a quarrying site such as a mine. In a possible management method, images of the serial numbers engraved on the tires are captured with a smartphone, a digital camera, or the like, and character recognition processing is performed on the captured images to obtain the serial numbers used to manage the tires. However, if a captured image, such as an image of a serial number engraved on a tire, has low contrast between the characters and the background or contains noise due to heavy contamination of the surface, accurate detection of the outlines of the characters is difficult with the conventional techniques.
In a case where the conventional techniques, in which characters are cut based on their outlines, are applied to an image from which the outlines of characters are difficult to detect accurately, the cutting positions of the characters are often inappropriate, and the burden on the user of correcting the recognition results increases.