1. Technical Field
The present invention relates to translation systems for translating a source language into a target language and, more particularly, to a system and a method employing statistics to disambiguate and translate a source language into a target language.
2. Discussion of Related Prior Art
There has long been a desire to have machines capable of translating text from one language into text in another language. Such machines would make it easier for humans who speak different languages to communicate with one another.
In general, machine translation systems fall broadly into two categories: rule-based and statistical. The rule-based approaches suffer from many deficiencies in comparison to statistical approaches. The main deficiencies of the rule-based approaches are their complexity and their reliance on a human expert to add new rules and to resolve the problems which arise when a newly added rule conflicts with a pre-existing rule. For example, if the system is being used to translate newspaper articles, then new words and new usages of old words may appear as events unfold, and a rule-based system must be updated by hand to handle them. A statistical system, on the other hand, can simply be retrained with new training data. Ambiguities and conflicts with previous usages of words are automatically resolved by the underlying statistical model.
Prior work on machine translation (for both categories of systems) has focused on accuracy at the expense of speed. Thus, it would be desirable and highly advantageous to have a machine translation system which is fast enough to translate a large corpus of text, while still being sufficiently accurate to perform information retrieval. It would be further desirable for such a machine translation system to be statistically based.