The approaches described in this section could be pursued but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section constitute prior art merely by virtue of their inclusion in this section.
Machine translation (MT), which is also known as computer-aided translation, is a rapidly growing field. It involves the use of computer software to automatically translate one natural language into another. MT takes into account the grammatical structure of a language and uses contextual rules to select among multiple meanings in order to translate sentences from a source language (to be translated) into a target language (translated). MT can be used to translate language within a variety of media such as speech, text, audio/video, web pages and so forth.
Statistical MT attempts to generate translations using statistical methods with parameters derived from the analysis of bilingual text corpora, such as the Canadian Hansard corpus (the English-French record of the Canadian Parliament), EUROPARL (records of the European Parliament), and the like. The idea behind statistical machine translation comes from information theory. Sentences are translated according to a probability distribution p(e|f), the probability that the string e in the target language (e.g., English) is the translation of a string f in the source language (e.g., French).
Statistical systems may be based on the Noisy Channel Model, initially developed by Claude Shannon in 1948, and generally can be interpreted as:
$$\hat{e} = \arg\max_{e} \, p(e \mid f) = \arg\max_{e} \, p(f \mid e) \, p(e) \qquad \text{(Eq. 1)}$$

where the translation model p(f|e) is the probability that the source string is the translation of the target string, and the language model p(e) is the probability of seeing that target language string. The second equality follows from Bayes' rule, because the denominator p(f) does not depend on e and can be dropped from the maximization. Without going into the details, this approach states that the best translation ê (English) of a sentence f (foreign) is the sentence e that maximizes p(e|f). For a rigorous implementation of this approach, one would have to perform an exhaustive search by going through all strings e in the target language. Statistical MT models also require training to optimize their parameters in order to achieve the best translation results.
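As an illustration only, the maximization in Eq. 1 can be sketched over a small, explicitly enumerated candidate set instead of the full space of target-language strings. The probability tables, the example sentences, and the function name decode below are hypothetical and invented for this sketch; they do not come from any trained model or from any particular system.

```python
import math

# Hypothetical toy values, invented purely for illustration.
# TRANSLATION_MODEL[(f, e)] stands in for p(f|e);
# LANGUAGE_MODEL[e] stands in for p(e).
TRANSLATION_MODEL = {
    ("le chat dort", "the cat sleeps"): 0.6,
    ("le chat dort", "the cat sleep"): 0.7,
    ("le chat dort", "cat the sleeps"): 0.6,
}
LANGUAGE_MODEL = {
    "the cat sleeps": 0.05,
    "the cat sleep": 0.005,
    "cat the sleeps": 0.0001,
}


def decode(f, candidates, tm, lm):
    """Return (best_e, log_score) maximizing log p(f|e) + log p(e).

    Only the given candidates are searched; this is not an exhaustive
    search over all target-language strings.
    """
    best, best_score = None, -math.inf
    for e in candidates:
        p_f_given_e = tm.get((f, e), 0.0)
        p_e = lm.get(e, 0.0)
        if p_f_given_e <= 0.0 or p_e <= 0.0:
            continue  # a zero-probability candidate can never be the argmax
        # Summing logs is equivalent to multiplying probabilities
        # and avoids underflow for long sentences.
        score = math.log(p_f_given_e) + math.log(p_e)
        if score > best_score:
            best, best_score = e, score
    return best, best_score


best, score = decode(
    "le chat dort", list(LANGUAGE_MODEL), TRANSLATION_MODEL, LANGUAGE_MODEL
)
print(best)
```

In this toy setting the translation model slightly prefers the ungrammatical candidate "the cat sleep", but the language model penalizes it enough that the product p(f|e) p(e) selects "the cat sleeps", which is the role the two factors play in Eq. 1.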