Machine translation (MT) falls within the scope of computational linguistics; it uses computer programs to translate text or speech from one natural language to another. In a simple sense, it achieves word-for-word substitution between different natural languages. Further, with corpus-based techniques, more complex automatic translation can be achieved, better handling different grammatical structures, vocabulary recognition, the correspondence of idiomatic expressions, and so on.
Current machine translation tools generally allow customization for a specific field or profession (such as weather forecasting), with the objective of narrowing the vocabulary used in translation to the proper nouns of that field, so as to improve the translation results. This technique is particularly effective for fields that use formal or standardized forms of expression. For example, government documents and legal documents are usually more formal and standardized than documents written in ordinary prose, and accordingly the machine translation results for such documents are often better than those for informal text such as everyday dialogue.
However, the quality of machine translation usually depends on the differences between the source language and the target language in terms of vocabulary, grammatical structure, linguistics, and even culture. For example, since English and Dutch both belong to the Indo-European language family, the result of machine translation between these two languages is often much better than the result of machine translation between English and Chinese.
Therefore, in order to improve machine translation results, manual intervention is still very important. For example, in some machine translation systems, manually defining or choosing more suitable words can dramatically improve the accuracy and quality of the machine translation.
Some existing translation tools, such as Alta Vista Babelfish, can sometimes produce understandable translation results. However, if a more meaningful result is desired, it is often necessary to edit the input sentence appropriately in order to facilitate analysis by the computer programs.
In general, people may use machine translation only to grasp the gist of sentences or paragraphs in the original text, rather than to obtain an accurate translation. Generally speaking, machine translation has not reached a quality level at which it can substitute for professional (human) translation, and it still cannot serve as an official translation.
Natural Language Processing (NLP) is a sub-discipline of artificial intelligence and linguistics. This field studies how to process and apply natural language; natural language cognition refers to enabling a computer to “understand” the real meaning behind human language.
A natural language generation system converts computer data to a natural language. A natural language understanding system converts a natural language to a form that can be more easily processed by computer programs.
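As a minimal illustration of the two directions just described, the toy sketch below converts a structured record to a sentence (generation) and a sentence to a structured record (understanding). The sentence pattern and field names are invented for illustration and do not come from any system described in this document.

```python
import re

def understand(sentence):
    """Natural language understanding: free text -> a structured form
    that a program can easily process."""
    m = re.match(r"transfer (\d+) dollars to (\w+)", sentence.lower())
    if m:
        return {"action": "transfer",
                "amount": int(m.group(1)),
                "payee": m.group(2)}
    return None

def generate(record):
    """Natural language generation: structured form -> free text."""
    return f"Transferring {record['amount']} dollars to {record['payee']}."
```

A real understanding system must of course cope with far more varied phrasing than a single regular expression can match, which is precisely the difficulty discussed below.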
In theory, NLP is a very attractive way of human-computer interaction. Early language processing systems such as SHRDLU worked quite well when conversing with a limited vocabulary within a restricted “blocks world”, which made researchers fairly optimistic about such systems. However, when the systems were extended to environments filled with real-world ambiguity and uncertainty, the optimism quickly faded. Because understanding a natural language requires extensive knowledge about the outside world and the ability to use or manipulate that knowledge, natural language cognition is also regarded as an AI-complete problem.
Statistics-based NLP utilizes probabilistic and statistical methods to solve problems that grammar-rule-based NLP handles poorly, especially long sentences that are highly ambiguous: when a practical grammar is applied for analysis, thousands of parse possibilities may be produced. The disambiguation methods adopted for such highly ambiguous sentences often rely on corpora and Markov models. Statistics-based NLP technology evolved mainly from machine learning and data mining, the sub-fields of artificial intelligence associated with learning behavior.
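As an illustrative sketch of corpus-driven disambiguation with a Markov model (a simplified example, not the method of any particular system mentioned here), a first-order Markov (bigram) model trained on a corpus can score alternative word sequences and prefer the reading better attested in the corpus. The toy corpus and candidate sentences are invented.

```python
import math
from collections import defaultdict

# Tiny corpus standing in for a large real-world corpus.
corpus = [
    "time flies like an arrow".split(),
    "fruit flies like a banana".split(),
    "time passes quickly".split(),
]

# Count bigram and preceding-word (unigram) frequencies.
bigram = defaultdict(int)
unigram = defaultdict(int)
for sent in corpus:
    tokens = ["<s>"] + sent
    for prev, cur in zip(tokens, tokens[1:]):
        bigram[(prev, cur)] += 1
        unigram[prev] += 1

def log_prob(sentence, vocab_size=50):
    """Log-probability of a word sequence under the bigram model,
    with add-one (Laplace) smoothing for unseen transitions."""
    tokens = ["<s>"] + sentence
    score = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        score += math.log((bigram[(prev, cur)] + 1)
                          / (unigram[prev] + vocab_size))
    return score

# Of two candidate readings, the model prefers the one whose word
# transitions are attested in the corpus.
good = "fruit flies like a banana".split()
bad = "fruit flies like an banana".split()
best = max([bad, good], key=log_prob)
```

The same scoring idea, with far larger corpora and higher-order models, underlies the statistical disambiguation described above.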
However, the statistics-based NLP method requires that a corpus of paired language data containing a large amount of material be established for computer learning and use; and with such a large corpus, retrieving a corresponding machine translation (or understanding) result from the corpus and feeding it back also requires substantial computing resources. In addition, even when this method is adopted, great difficulties remain in dealing with the diversity and uncertainty of practical natural language.
NLP technology has been widely used in practice; for example, in interactive voice response systems, Internet call center systems, and so on.
Interactive Voice Response (IVR) is a general term for telephone-based voice value-added services. Many institutions (such as banks, credit card centers, and telecom operators) provide customers with a wide range of self-services through an Interactive Voice Response System (IVRS). A customer dials a specified phone number to log into the system and enters appropriate options or personal information according to the system's prompts, so as to listen to pre-recorded information, or to have the computer system assemble data according to a preset program (a call flow) and read out specific information (such as account balance or amount due) as speech. The customer may also input transaction instructions through the system to conduct preset transactions (such as transfers, password changes, or changes of contact phone number).
Although the IVR system has been widely used over the past decade, it was technically born with a critical defect that still troubles all institutions: an irreducible menu tree with multiple layers of options. Most users, when selecting self-services in an IVR system, are too impatient to take the time to traverse a multi-layer menu tree, and instead press “0” to turn directly to a human customer service center, leading to an insurmountable gap between the institutions' expectation that the IVR system would “effectively improve the rate at which customers use self-services and substantially replace manual operations” and the reality.
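The multi-layer menu tree described above can be sketched as a simple nested structure, where each DTMF key press descends one level and the caller must hear every intermediate prompt before reaching a leaf service. The menu below is a hypothetical example, not any institution's actual call flow.

```python
# Hypothetical IVR menu tree: each key press selects a child node.
MENU = {
    "prompt": "Main menu",
    "options": {
        "1": {"prompt": "Accounts",
              "options": {"1": {"prompt": "Balance inquiry"},
                          "2": {"prompt": "Transfer"}}},
        "2": {"prompt": "Cards",
              "options": {"1": {"prompt": "Change password"}}},
        "0": {"prompt": "Human agent"},
    },
}

def traverse(keys):
    """Follow a sequence of key presses; return every prompt the
    caller must listen to along the way."""
    node = MENU
    heard = [node["prompt"]]
    for key in keys:
        node = node["options"][key]
        heard.append(node["prompt"])
    return heard
```

Even in this three-service toy, reaching “Transfer” costs two prompts and two key presses, while pressing “0” escapes to a human immediately, which is exactly the behavior the passage above laments.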
An Internet Call Center System (ICCS) is a new type of call center system that has boomed in recent years. It adopts the popular Instant Messaging (IM) Internet technique to enable mainly text-based real-time communication between institutions and their customers over the Internet, and it is applied to the customer services and remote sales of those institutions. A human agent using the ICCS can communicate with two or more customers simultaneously.
So to speak, the text-based ICC system is a variant of the speech-based IVR system. Both are necessary tools (whether for customer services or for remote sales) for communication between institutions and their customers, and both require a high level of participation by human agents. Therefore, like the IVR system, the ICC system also finds it difficult to meet the institutions' requirement of “effectively improving the rate at which customers use self-services and substantially replacing manual operations”.
On the other hand, because speech recognition results lack accuracy and stability, traditional speech recognition technology employs keyword search technology and uses an “exhaustive method” to perform semantic analysis on speech. Although many companies specializing in speech recognition technology spend a great deal of manpower and money on two items of work, namely “transcription” and “keyword spotting”, and persistently train a speech robot over a long period, the actual effects are often far from the ideal.
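The “keyword spotting” approach can be sketched as an exhaustive scan of a (possibly noisy) transcription against a hand-built keyword table, mapping each hit to an intent. The keywords and intent names below are invented for illustration and are not any vendor's actual lexicon; the sketch also shows why the method is brittle when the keyword is misrecognized.

```python
# Hypothetical keyword-to-intent table, built and maintained by hand.
KEYWORD_INTENTS = {
    "balance": "balance_inquiry",
    "transfer": "funds_transfer",
    "password": "password_change",
}

def spot_intents(transcript):
    """Exhaustively check every keyword against the transcript and
    return the intents of all keywords found."""
    text = transcript.lower()
    return [intent for kw, intent in KEYWORD_INTENTS.items()
            if kw in text]
```

A single transcription error (“ballance”, say) yields no hit at all, which illustrates why heavy, ongoing investment in transcription and keyword curation still falls short of the ideal.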