1. Field of the Invention
The present invention relates to a technique for generating translation data after text contained in a document is translated from one language into another.
2. Description of the Related Art
Various types of translation devices have been proposed, which receive image data of a document with or without graphics, translate text contained in a text region of the image data, and generate a translated document containing the translated text, or a document containing the translated text and the original graphics.
In one known technique, a text region and a graphic region of input data are separated using layout analysis, and characters in the text region are recognized for translation. The volume of the resulting translated text is then compared with the size of the existing text region, and the text region is re-formed according to the result of the comparison. However, if the graphic region no longer fits on the same page after the text region is re-formed, it is moved to the next page. Because the arrangement of the text region and the graphic region changes in this way, a reader may have difficulty reading the translated document.
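The comparison and re-formation step described above can be illustrated with a minimal sketch. It is not taken from any particular prior-art device: the `Region` type, the `reflow` function, the fixed characters-per-line estimate, and the assumption that the graphic region sits directly below the text region are all hypothetical simplifications chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Region:
    top: float     # distance from the top of the page
    height: float  # vertical extent of the region


def reflow(text, graphic, translated_chars, chars_per_line, line_height, page_height):
    """Re-form the text region for the translated text (hypothetical model).

    Returns (new_text_height, graphic_stays_on_page).
    Assumes the graphic region is placed directly below the text region.
    """
    # Estimate how much vertical space the translated text needs.
    lines_needed = math.ceil(translated_chars / chars_per_line)
    new_text_height = lines_needed * line_height

    # The graphic is pushed down (or pulled up) by the change in text height.
    new_graphic_top = text.top + new_text_height
    graphic_stays_on_page = new_graphic_top + graphic.height <= page_height
    return new_text_height, graphic_stays_on_page


text = Region(top=0, height=100)
graphic = Region(top=100, height=150)

# Translation of similar volume: the graphic still fits on the page.
same_volume = reflow(text, graphic, 400, 40, 10, 300)
# Translation twice as long: the graphic no longer fits and would move to the next page.
double_volume = reflow(text, graphic, 800, 40, 10, 300)
```

In this model, a translation that doubles the text volume pushes the graphic past the page boundary, which is the layout change that makes the translated document harder to read.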
Further, since commonly used translation devices output an original document and a translated document in separate regions of the same page or on separate pages, it is often difficult for a reader to find correspondences between the original and the translated text. In another known technique, translated text is arranged between the lines of the original text, thereby reducing the effort a user needs to find correspondences between the original and the translated text.
However, translated text differs from the original text in the number and type of its characters; as a result, the length of the character string that makes up a line of translated text does not match the length occupied by the corresponding line of the original text.
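The mismatch in line length can be made concrete with a small sketch. The language pair (Japanese to English), the sample strings, and the `display_width` helper are assumptions made for illustration only; the width estimate is a rough approximation that counts wide and full-width characters as two cells and everything else as one, and it ignores fonts, kerning, and combining marks.

```python
import unicodedata


def display_width(text):
    """Approximate width in half-width cells (hypothetical helper).

    Characters whose East Asian Width is Wide ("W") or Fullwidth ("F")
    count as two cells; all other characters count as one.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


original = "翻訳装置"               # illustrative original line (4 wide characters)
translated = "translation device"  # illustrative translated line (18 narrow characters)

original_width = display_width(original)
translated_width = display_width(translated)

# A ratio other than 1.0 means the translated line will not occupy
# the same horizontal extent as the original line.
width_ratio = translated_width / original_width
```

Here the translated line needs more than twice the horizontal space of the original, so placing it directly between the original lines would overflow or misalign unless the layout compensates.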