The present invention relates in general to machine translation. More specifically, the present invention relates to computer-aided input segmentation for machine translation.
Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and humans using natural languages. As such, NLP is related to the area of human-computer interaction. Among the challenges in implementing NLP systems are enabling computers to derive meaning from natural language (NL) inputs and generating NL outputs effectively and efficiently. Machine translation systems are one type of NLP system.
In general, machine translation quality declines as the length of the input text increases. This includes cases where the input text is a single sentence with a complicated composite structure or a paragraph (possibly with incorrect punctuation) consisting of multiple sentences. This is particularly troublesome for conventional machine translation graphical user interfaces (GUIs). For instance, while a conventional machine translation GUI translates an input text entered in an input box, it does not allow a user to select part of the input text for translation, divide the input text into different segments, or specify a particular translation order for the input text. As a result, users manually divide the input text into a main clause, subordinate clauses, and/or phrases and then separately enter these segments into the input box for individual translation to maintain quality.