1. Technical Field
The present invention relates to machine translation and, more specifically, to a method and apparatus for efficient translation memory searches based on multiple sentence signatures.
2. Description of the Related Art
The goal of machine translation is to translate a sentence originally generated in a source language into a sentence in a target language. In the traditional approach to statistical machine translation, tables of phrase pairs are used to generate translation hypotheses under a probabilistic framework. However, this traditional approach risks generating sentences with unacceptable linguistic inconsistencies and imperfections, such as syntactic, grammatical or pragmatic errors.
Recently, because of the availability of large translation memories, a direct search approach to machine translation has been explored. A translation memory consists of a large database of pre-translated sentence pairs. The underlying assumption in the direct translation memory search approach is that, if an input sentence (referred to as a query) is sufficiently similar to a previously hand-translated sentence stored in memory, it is generally preferable to use the existing translation rather than the hypothesis generated by statistical machine translation. However, for this approach to be practical, it must be possible to search large translation memories efficiently.
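For illustration only, the direct search idea described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the method of the invention: the function names, the token-overlap similarity measure (taken from the standard `difflib` module), the 0.8 threshold, and the `fallback` callable standing in for a statistical machine translation system are all hypothetical choices made for the example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Token-level similarity in [0, 1].

    Assumption: SequenceMatcher over lowercased whitespace tokens is used
    purely for illustration; the source does not prescribe a measure.
    """
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def translate(query, memory, fallback, threshold=0.8):
    """Return a stored translation if a close enough match exists.

    memory    -- list of (source_sentence, target_sentence) pairs
    fallback  -- hypothetical callable standing in for an SMT system
    threshold -- hypothetical cut-off for "sufficiently similar"
    """
    best_score, best_target = 0.0, None
    # Exhaustive linear scan: every stored pair is compared with the query.
    for source, target in memory:
        score = similarity(query, source)
        if score > best_score:
            best_score, best_target = score, target
    if best_target is not None and best_score >= threshold:
        return best_target
    return fallback(query)


# Toy translation memory (illustrative data only).
memory = [
    ("the cat sat on the mat", "le chat s'est assis sur le tapis"),
    ("good morning", "bonjour"),
]

print(translate("Good morning", memory, lambda q: "SMT:" + q))
print(translate("completely unrelated input", memory, lambda q: "SMT:" + q))
```

Because the scan compares the query against every stored pair, its cost grows linearly with the size of the memory, which is why an efficient search procedure is needed for large translation memories.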