This invention relates to automatic language translation.
Machine language translators accept input text in a first natural language (the source language) and generate corresponding output text in a second natural language (the target language). Such translators may be classified into two types: those which use a set of translation rules for each possible pair of source and target languages, and those (relatively rare) interlingual systems which translate from the source language into a language-independent (interlingual) form, and then from this language-independent form to the target language.
The former system has the disadvantage that as the number of languages rises, the number of sets of translation rules rises as the square of the number of languages. The latter approach is difficult to implement, and can result in unnatural translations, for example with loss of appropriate emphases.
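The scaling argument can be illustrated with a short calculation. This is a minimal sketch, not part of the original disclosure: it assumes one rule set per ordered (source, target) pair for the first type of system, and one analysis component plus one generation component per language for the interlingual type. The function names are hypothetical.

```python
def direct_rule_sets(n_languages: int) -> int:
    """Rule sets needed when every ordered (source, target) pair has its own set."""
    return n_languages * (n_languages - 1)


def interlingual_components(n_languages: int) -> int:
    """Components needed under the assumption of one analyser and one
    generator per language, all sharing a single interlingual form."""
    return 2 * n_languages


for n in (2, 4, 8, 16):
    print(n, direct_rule_sets(n), interlingual_components(n))
```

Under these assumptions the pairwise count grows quadratically while the interlingual count grows linearly, which is the disadvantage referred to above.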
A prior art document proposing an automatic translation system with translation into an interlingual form is J M VAN ZUILEN: “Het automatisch vertaalsystem DLT” INFORMATIE, vol. 32, no. 2, February 1990, DEVENTER, NL, pages 183-191, XP000406044. This document proposes the use of Esperanto, which is a natural language, as the interlingual form. However, when an interlingual form is ambiguous in relation to the target language(s), which will be the case when a natural language is used as the interlingual form, the interlingual form itself cannot be relied upon to provide a complete translation into the target language.
According to one aspect, the present invention provides a machine translation system utilising the interlingual approach (i.e. generating a generally language independent intermediate structure) in which modifiers (e.g. descriptive words or linguistic structures) which are capable, in the source language, of occupying more than one position are analysed and the position occupied is recorded. This enables adverbs or adjectives which have been placed in an unusual position for stress or emphasis to be translated into correspondingly stressed or emphasised descriptive terms in the target language.
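One way to picture the recording of modifier position is sketched below. This is an illustrative assumption only: the `Modifier` record, its field names and the position labels are hypothetical and are not taken from the described system. The sketch treats a modifier found outside its usual position as emphasised, so that this information can travel with the intermediate structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modifier:
    lemma: str            # the descriptive word, e.g. an adverb
    position: str         # position actually occupied in the source sentence
    usual_position: str   # unmarked position (assumed to come from the source grammar)

    @property
    def emphasised(self) -> bool:
        # A modifier outside its usual position is treated as stressed.
        return self.position != self.usual_position


plain = Modifier("slowly", position="post_verbal", usual_position="post_verbal")
fronted = Modifier("slowly", position="sentence_initial", usual_position="post_verbal")
```

A generator for the target language could then consult `emphasised` when choosing where to place the corresponding descriptive term.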
In another aspect, the present invention provides a machine translation system for translating between a plurality of languages, in which grammar rules specific to the source language are applied to generate a semantic structure corresponding to the input text, and then semantic structures therein which are not shared by one or more of the target languages are detected and replaced with more generic structures, to generate an interlingual structure. This replacement will be referred to later in this document as “abstracting”. This aspect also provides such a translator in which the interlingual structure is tested for the presence of such generic structures which have specific versions within the target language not shared by the source or other languages, and such structures are replaced by the specific structures for the target language, the amended structure thus produced being used to generate target language text.
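The abstracting step and its target-side counterpart can be sketched as follows. This is a simplified illustration under stated assumptions: semantic structures are reduced to string labels, and the languages "A" and "B", the label names and the lookup tables are all hypothetical placeholders rather than features of the described system.

```python
from typing import Dict, List, Set

# Hypothetical inventory of structures each language supports.
LANGUAGE_STRUCTURES: Dict[str, Set[str]] = {
    "A": {"a_specific_form", "shared_form"},
    "B": {"b_specific_form", "shared_form"},
}

# Hypothetical mapping from a language-specific structure to a generic one.
GENERIC_FORM: Dict[str, str] = {"a_specific_form": "generic_form"}

# Hypothetical per-target mapping from a generic structure to a specific one.
SPECIFIC_FORM: Dict[str, Dict[str, str]] = {
    "B": {"generic_form": "b_specific_form"},
}


def abstract(structures: List[str], targets: List[str]) -> List[str]:
    """Replace structures not shared by every target with a generic structure."""
    result = []
    for s in structures:
        shared = all(s in LANGUAGE_STRUCTURES[t] for t in targets)
        result.append(s if shared else GENERIC_FORM.get(s, s))
    return result


def specialise(interlingual: List[str], target: str) -> List[str]:
    """Replace generic structures with the target's specific version, if any."""
    table = SPECIFIC_FORM.get(target, {})
    return [table.get(s, s) for s in interlingual]


interlingual = abstract(["a_specific_form", "shared_form"], targets=["B"])
target_side = specialise(interlingual, "B")
```

In this sketch `interlingual` holds the generic label in place of the structure peculiar to language A, and `target_side` holds the structure peculiar to language B.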
In another aspect, a machine translation system provides an interlingual form which is unambiguous in relation to all of the target languages the system is able to translate into, in the sense that the interlingual form corresponds directly, preferably uniquely, to a language-specific semantic structure in each of the target languages. Where a semantic structure in the source language text is itself ambiguous in relation to the interlingual form, one of a plurality of alternative interlingual structures may be selected by interaction with the user, in order to provide disambiguation in accordance with the meaning of the source language structure intended by the user.
In another aspect, the present invention provides a machine translation system utilising a generally interlingual approach, in which the process of converting from the source language to the language independent representation involves a user-interactive disambiguation process which takes account of the target language(s), to avoid the unnecessary disambiguation of linguistic elements which are common to the source and target languages.
This can significantly reduce the amount of interaction required of the user. It may also reduce the complexity of the abstracting process by which each source language is transformed into the language independent representation, which would otherwise involve a number of transformations or rules that increases with the number of target languages; although such rules must be present, only a subset of the rules need be used in any given translation process.
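The idea of consulting the user only when the selected target language(s) require it can be sketched as below. This is a minimal illustration, not the disclosed procedure: the sense labels, the languages "B" and "C" and the `rendering` table are hypothetical. An ambiguity is put to the user only if at least one selected target language renders the candidate senses differently.

```python
from typing import Dict, List


def needs_disambiguation(
    senses: List[str],
    targets: List[str],
    rendering: Dict[str, Dict[str, str]],
) -> bool:
    """Return True if any target language distinguishes between the senses."""
    for target in targets:
        distinct = {rendering[target][sense] for sense in senses}
        if len(distinct) > 1:
            return True
    return False


# Hypothetical data: language B renders both senses identically,
# language C renders them with different expressions.
rendering = {
    "B": {"sense_1": "b_word", "sense_2": "b_word"},
    "C": {"sense_1": "c_word_1", "sense_2": "c_word_2"},
}
senses = ["sense_1", "sense_2"]

ask_for_b_only = needs_disambiguation(senses, ["B"], rendering)
ask_for_b_and_c = needs_disambiguation(senses, ["B", "C"], rendering)
```

With these assumed tables, translating into B alone requires no question to the user, whereas adding C as a target does.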
In yet another aspect, the invention provides a multilingual messaging system in which a message is transmitted from a first processor to one or more destination processors via a telecommunications channel, in the form of an interlingual semantic representation of the message.
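A message of this kind might be serialised for transmission as sketched below. The use of JSON and the field names are assumptions made purely for illustration; the text does not specify any encoding for the interlingual semantic representation.

```python
import json


def encode_message(interlingual: dict) -> bytes:
    """Serialise a (hypothetical) interlingual structure for transmission."""
    return json.dumps(interlingual, sort_keys=True).encode("utf-8")


def decode_message(payload: bytes) -> dict:
    """Recover the interlingual structure at a destination processor."""
    return json.loads(payload.decode("utf-8"))


# Hypothetical structure; the keys are illustrative only.
message = {"predicate": "generic_form", "modifiers": [{"lemma": "slowly", "emphasised": True}]}
payload = encode_message(message)
received = decode_message(payload)
```

Because the payload carries the interlingual structure rather than text in any one language, each destination processor could presumably generate text in its own target language from the same transmitted message.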
Other aspects and preferred embodiments are as described in the following description and claims.