A conventionally known technique reads the items printed on an information recording medium (hereinafter denoted as a “medium”), such as a cash card or an ID card (for example, an employee ID card, a student ID card, an alien registration card or a driver's license), as image data and applies an OCR (Optical Character Recognition) process to recognize the characters of the printed items. In general OCR processing, the characters on the medium are optically read by a scanner, the scanned character image is segmented into individual characters, pattern matching is performed between the image of each segmented character and character patterns prepared in advance, and the character pattern that best matches the segmented character is extracted as the recognition result for that character.
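The pattern matching step described above can be illustrated with a minimal sketch. It assumes that each segmented character and each prepared character pattern is a binary bitmap of identical size (1 for a black pixel, 0 for white), and it uses simple pixel agreement as the similarity measure. The names `match_score`, `recognize` and `TEMPLATES`, and the 3x3 patterns themselves, are hypothetical and chosen only for illustration; a practical OCR process would normally use larger normalized images and a more robust measure.

```python
from typing import Dict, List

Bitmap = List[List[int]]  # 1 = black (ink) pixel, 0 = white pixel


def match_score(glyph: Bitmap, template: Bitmap) -> float:
    """Return the fraction of pixels on which two equally sized bitmaps agree."""
    total = 0
    agree = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            agree += int(g == t)
    return agree / total if total else 0.0


def recognize(glyph: Bitmap, templates: Dict[str, Bitmap]) -> str:
    """Return the label of the prepared pattern that best matches the glyph."""
    return max(templates, key=lambda label: match_score(glyph, templates[label]))


# Hypothetical 3x3 character patterns prepared in advance.
TEMPLATES: Dict[str, Bitmap] = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}

# A segmented character with one missing pixel compared with the "L" pattern.
noisy_l: Bitmap = [[1, 0, 0], [1, 0, 0], [1, 1, 0]]
result = recognize(noisy_l, TEMPLATES)
```

In this example the degraded character agrees with the "L" pattern on eight of nine pixels and with each of the other patterns on four of nine, so "L" is selected.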
In general, character recognition in OCR processing first specifies the line segmentation position (the position at which a character line is segmented) and then specifies the partitioning positions between characters within the character line so located, so that each character in the character line can be segmented. On various ID cards, however, the area of the medium in which characters are printed may contain a character line decorated with a linear drawing (such as an underline, or a background pattern of repeated dots, lines or half-tone). In general OCR processing, when a linear drawing is present in the area in which characters are printed, the characters may be segmented at wrong positions and consequently may not be recognized. Even when a character can be recognized, the linear drawing present in that area may affect the OCR process and markedly lower the rate of correct character recognition. For this reason, when the image containing the characters to be recognized also contains a linear drawing, the linear drawing has conventionally been removed.
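The two-stage segmentation described above, and the way a linear drawing can disturb it, can be sketched with projection profiles. This is one common way to specify line and character positions and is assumed here for illustration only; the text does not state which method a given OCR process uses. The function names and the small test images are hypothetical.

```python
from typing import List, Tuple

Bitmap = List[List[int]]  # 1 = black (ink) pixel, 0 = white pixel


def runs_of_ink(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges where the profile is non-zero."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(image: Bitmap) -> List[Tuple[int, int]]:
    """Specify line segmentation positions from the row-wise count of black pixels."""
    return runs_of_ink([sum(row) for row in image])


def segment_characters(image: Bitmap, top: int, bottom: int) -> List[Tuple[int, int]]:
    """Specify partitioning positions between characters within rows top..bottom-1."""
    width = len(image[0]) if image else 0
    profile = [sum(image[y][x] for y in range(top, bottom)) for x in range(width)]
    return runs_of_ink(profile)


# Two characters separated by a blank column, with no linear drawing.
clean: Bitmap = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
# The same characters with an underline drawn directly beneath them.
underlined: Bitmap = clean[:3] + [[1, 1, 1, 1, 1, 1, 1]]

clean_lines = segment_lines(clean)
clean_chars = segment_characters(clean, *clean_lines[0])
underlined_lines = segment_lines(underlined)
underlined_chars = segment_characters(underlined, *underlined_lines[0])
```

Without the underline, the sketch finds one character line and two characters. With the underline, the line region grows to include the underline row, no blank column remains in the vertical profile, and the two characters are merged into a single segment, which is the kind of wrong segmentation position described above.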
As a conventional character segmentation device for segmenting each character from a character line whose character (writing) area contains a linear drawing, a technique has been disclosed for the character recognition of a driver's license number having oblique lines in the background of the character line; the technique accurately and effectively reads the fifth to eighth characters, which have the oblique lines in their background (see Patent Reference 1). In the image of the driver's license read by a scanner, the character segmentation device disclosed in Patent Reference 1 rotates the image of each of the fifth to eighth characters by the angle at which the oblique lines become horizontal, removes from the rotated character image the black pixel components that compose lines in the horizontal direction (the oblique line components), extracts the feature vector of the character image, and performs character recognition using a secondary dictionary set exclusively for the rotated character image.
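The removal of horizontal line components from a rotated character image can be sketched as follows. The details of the method in Patent Reference 1 are not given here, so this is only an assumed illustration: it takes a binary image that has already been rotated so that the oblique lines are horizontal, and clears every horizontal run of black pixels whose length is at least a threshold. The function name `remove_horizontal_runs` and the parameter `min_run` are hypothetical.

```python
from typing import List

Bitmap = List[List[int]]  # 1 = black (ink) pixel, 0 = white pixel


def remove_horizontal_runs(image: Bitmap, min_run: int) -> Bitmap:
    """Return a copy of the image with horizontal black runs of length >= min_run cleared."""
    result = [row[:] for row in image]  # leave the input image unchanged
    for row in result:
        width = len(row)
        x = 0
        while x < width:
            if row[x] != 1:
                x += 1
                continue
            # Find the end of the current run of black pixels.
            end = x
            while end < width and row[end] == 1:
                end += 1
            # Treat long runs as line components and erase them.
            if end - x >= min_run:
                for i in range(x, end):
                    row[i] = 0
            x = end
    return result


# Hypothetical rotated image: two vertical strokes crossed by one horizontal line.
rotated: Bitmap = [
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
cleaned = remove_horizontal_runs(rotated, min_run=5)
```

In this sketch the pixels at which a character stroke crosses the line are erased together with the line, so the strokes are left with a gap. A run-length threshold alone cannot separate the two, which suggests why recognition of such an image may call for a dictionary prepared for it.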