1. Field of the Invention
The present invention relates to an image processing device and a program product. In particular, the invention relates to an image processing device and a program product that output character images having a low probability of being correctly identified as character codes in the form of character image data cut out from those character images, without converting them into character code data.
2. Description of the Related Art
Recent image recognition devices can recognize character images as character codes with extremely high accuracy as long as the document is scanned under good conditions (for example, if the document is made up of a single font type). However, if the quality of the characters on the document is poor, or if the layout of the characters on the document is complicated, the recognition accuracy drops substantially and character recognition errors occur more frequently.
To cope with this problem, a character recognition device has been proposed in which character images that have a high probability of recognition error are output as character images (e.g., bitmap image data) without being converted into character codes. Such a device eliminates the possibility of outputting character codes that do not match the characters on the document.
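The conventional approach described above can be illustrated with a minimal sketch. The names `RecognizedChar`, `choose_output`, the confidence score in the range 0 to 1, and the threshold value of 0.8 are all hypothetical and are introduced here only for illustration; the proposed device is not stated to work in exactly this way.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# (left, top, right, bottom) in pixels of the scanned image
Box = Tuple[int, int, int, int]


@dataclass
class RecognizedChar:
    code: str          # best-guess character code from the recognizer
    confidence: float  # hypothetical certainty score in [0, 1]
    bbox: Box          # rectangle enclosing the character image


def choose_output(
    chars: List[RecognizedChar], threshold: float = 0.8
) -> List[Tuple[str, Union[str, Box]]]:
    """Decide, per character, whether to emit a character code or a cut-out image.

    Characters whose confidence falls below the (assumed) threshold are kept
    as image regions; the rest are emitted as character codes.
    """
    result: List[Tuple[str, Union[str, Box]]] = []
    for ch in chars:
        if ch.confidence >= threshold:
            result.append(("code", ch.code))
        else:
            result.append(("image", ch.bbox))
    return result


chars = [
    RecognizedChar("W", 0.40, (0, 0, 30, 40)),    # uncertain: keep as image
    RecognizedChar("e", 0.95, (26, 15, 44, 40)),  # certain: emit character code
]
decisions = choose_output(chars)
```

In this sketch each character is judged independently of its neighbors, which is the property of the conventional approach that leads to the mixed output of image data and code data.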
However, if only the characters that have a high probability of recognition error are output as character image data, mismatches may arise between the shapes of the character image data and the character code data, as shown in FIG. 1, and the user may find these mismatches objectionable. (The areas enclosed in rectangles are those that are cut out as character image data.)
FIG. 2A shows a case in which images of kerned characters are output using a conventional image recognition device; a marked difference can be seen between the character image data and the character code data.
Kerning is a technique for adjusting the distance between two adjacent characters when they are printed as a pair, so that the pair appears more evenly spaced. In FIG. 2A, the character code data “e” is placed close to the bottom right corner of the character image data “W” by kerning. As can be seen, a portion of the left side of the character code data “e” overlaps and is hidden behind the character image data “W.”
FIG. 2B shows a case in which character images set in italics are output using a conventional image recognition device; here, too, the differences are clearly visible. The bottom right corner of the italic character image “W” contains a left-side portion of the character image “e,” which is offset from the character “e” output from the character code data. Likewise, the top left corner of the italic character image “n” contains a right-side portion of the character image “k,” which is offset from the character “k” output from the character code data.
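The overlap described for kerned and italic characters can be expressed as a simple rectangle test. The following is a sketch only: the function name `horizontal_overlap` and the pixel coordinates are hypothetical values chosen for illustration, not measurements taken from the figures.

```python
from typing import Tuple

# (left, top, right, bottom) in pixels of the scanned image
Box = Tuple[int, int, int, int]


def horizontal_overlap(a: Box, b: Box) -> int:
    """Return the width in pixels shared by the horizontal extents of two boxes.

    A result greater than zero means that a rectangle cut out around one
    character also covers part of the column range of its neighbor.
    """
    return max(0, min(a[2], b[2]) - max(a[0], b[0]))


# Assumed boxes: "e" is tucked under the right arm of "W" by kerning.
w_box: Box = (0, 0, 30, 40)
e_box: Box = (26, 15, 44, 40)

overlap = horizontal_overlap(w_box, e_box)  # 4 pixels of shared width
```

When the overlap is nonzero, the rectangle cut out for one character may contain a fragment of its neighbor, and the neighbor rendered from character code data may be partly hidden by that rectangle.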