Speech recognition techniques translate an acoustic signal into a computer-readable format. Speech recognition systems have been used for various applications, including data entry applications that allow a user to dictate desired information to a computer device, security applications that restrict access to a particular device or secure facility, and speech-to-speech translation applications, where a spoken phrase is translated from a source language into one or more target languages. In a speech-to-speech translation application, the speech recognition system translates the acoustic signal into a computer-readable format, and a machine translator reproduces the spoken phrase in the desired language.
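By way of illustration only, the two stages described above (recognition followed by machine translation) can be sketched as a simple pipeline. This is a minimal sketch under stated assumptions, not the implementation of any particular system: the names `SpeechToSpeechPipeline`, `toy_recognizer`, `toy_translator`, and the `PHRASES` table are hypothetical, and the toy stages stand in for a real recognizer and translator. The output is left as translated text; rendering it as audible speech would be a further stage.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stage signatures; these names do not come from the source text.
Recognizer = Callable[[bytes], str]      # acoustic signal -> recognized text
Translator = Callable[[str, str], str]   # (text, target language) -> translated text


@dataclass
class SpeechToSpeechPipeline:
    """Chains a recognizer and a machine translator, as in the description above."""
    recognize: Recognizer
    translate: Translator

    def run(self, audio: bytes, targets: List[str]) -> Dict[str, str]:
        # Stage 1: convert the acoustic signal into a computer-readable form.
        text = self.recognize(audio)
        # Stage 2: reproduce the phrase in each desired target language.
        return {lang: self.translate(text, lang) for lang in targets}


def toy_recognizer(audio: bytes) -> str:
    # Assumption: the "audio" is just UTF-8 text, standing in for real recognition.
    return audio.decode("utf-8")


# Assumption: a tiny lookup table standing in for a real machine translator.
PHRASES = {("hello", "es"): "hola", ("hello", "fr"): "bonjour"}


def toy_translator(text: str, target: str) -> str:
    # Falls back to the untranslated text when the phrase is unknown.
    return PHRASES.get((text.lower(), target), text)


pipeline = SpeechToSpeechPipeline(toy_recognizer, toy_translator)
result = pipeline.run(b"hello", ["es", "fr"])
print(result)
```

Keeping each stage behind a plain callable means a different recognizer or translator can be substituted without changing the pipeline itself.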
Multilingual speech-to-speech translation has typically required the participation of a human translator to translate a conversation from a source language into one or more target languages. For example, telephone service providers, such as AT&T Corporation, often provide human operators who perform language translation services. With advances in the underlying speech recognition technology, however, automated speech-to-speech translation may now be performed without a human translator. Automated multilingual speech-to-speech translation systems will provide multilingual speech recognition for interactions between individuals and computer devices. Such systems can also provide translation services for conversations between two individuals.
A number of systems have been proposed or suggested that attempt to perform speech-to-speech translation. For example, Alex Waibel, “Interactive Translation of Conversational Speech”, Computer, 29(7), 41-48 (1996), hereinafter referred to as the “Janus II System,” discloses a computer-aided speech translation system. The Janus II System operates on spontaneous conversational speech between humans. While the Janus II System performs effectively for a number of applications, it suffers from a number of limitations which, if overcome, could greatly improve the accuracy and efficiency of such speech-to-speech translation systems. For example, the Janus II System does not synchronize the original source language speech with the translated target language speech.
A need therefore exists for improved methods and apparatus that perform automated speech translation. A further need exists for methods and apparatus for synchronizing the original source language speech and the translated target language speech in a speech-to-speech translation system. Yet another need exists for speech-to-speech translation methods and apparatus that automatically translate the original source language speech into a number of desired target languages.