Conventional techniques have been proposed to read a document recorded on an original, such as a sheet of paper or a book, as an image using an image reading device and to perform character recognition processing on the read image (see Japanese Patent Application Laid-Open No. H9-44604 and Japanese Patent Application Laid-Open No. H9-44606). Moreover, another conventional technique has been proposed to translate an original document (which will be hereinafter referred to as an original text) using the text output as a result of character recognition processing (see Japanese Patent Application Laid-Open No. 2011-100356).
A document image generating device described in Japanese Patent Application Laid-Open No. 2011-100356 starts from an electronic document image representing an original text (which will be hereinafter referred to as an original text image). On the basis of the original text image, the device generates an electronic document image in which a result of translating the original text (which will be hereinafter referred to as a translated text) is attached in the form of rubi in a space between lines of the original text (which will be hereinafter referred to as a translation-attached document image).
In the following description, an expression included in an original text will be referred to as an original expression, and a translated expression to be attached to an original expression in the form of rubi will be referred to as translation rubi. Moreover, in order to clearly indicate a space character existing between characters, an underscore “_” is used instead of a space character “ ” in the following description. When translating English into Japanese, general nouns such as “Animation” and “ion” are normally translated. However, an indefinite article, an expression which is not listed in a dictionary used for translation (e.g., “ma”), and the like are not translated.
A document image generating device such as the device described in Japanese Patent Application Laid-Open No. 2011-100356 treats the recognition result “Animation” as the single word “Animation” when, for example, the original expression “Animation” is correctly recognized as “Animation” by character recognition. In this case, the correct translation rubi (“animation” in Japanese) is attached to the original expression “Animation”.
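The word-by-word behavior described above can be illustrated with a minimal Python sketch. It is not the implementation of the cited device. The dictionary contents, the set of untranslated words, and the function names `show_spaces` and `translation_rubi` are all assumptions introduced only for illustration.

```python
# Minimal sketch, under stated assumptions, of word-level translation rubi.
# DICTIONARY and UNTRANSLATED are hypothetical toy data, not taken from the
# cited publication.

DICTIONARY = {
    "animation": "アニメーション",  # Japanese for "animation"
    "ion": "イオン",                # Japanese for "ion"
}

# Indefinite articles are left without translation rubi.
UNTRANSLATED = {"a", "an"}


def show_spaces(text):
    """Replace each space with an underscore, as in the description's notation."""
    return text.replace(" ", "_")


def translation_rubi(recognized_text):
    """Pair each space-delimited word with its translation rubi, or None.

    A word gets None when it is an indefinite article or when it is not
    listed in the dictionary (e.g. "ma").
    """
    pairs = []
    for word in recognized_text.split():
        key = word.lower()
        if key in UNTRANSLATED or key not in DICTIONARY:
            pairs.append((word, None))
        else:
            pairs.append((word, DICTIONARY[key]))
    return pairs


if __name__ == "__main__":
    print(translation_rubi("Animation"))
    print(show_spaces("a ma ion"), translation_rubi("a ma ion"))
```

In this sketch, a correctly recognized “Animation” is looked up as one word and receives one translation rubi, whereas “a”, “ma”, and “ion” would each be looked up separately, with only “ion” receiving rubi.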