1. Field of the Invention
The present invention relates to speech processing. More particularly, it relates to a speech translation method and apparatus that preserve speech nuances conveyed by non-text information.
2. Description of Related Art
Current speech-to-speech machine translation processes first convert source speech into text. They then translate that text into text in a target language. Finally, they synthesize the target-language text into target speech by using speech synthesis technology.
However, speech carries far richer information than text alone. Examples include emotional expressions such as laughter and sighs, and prosodic information such as the stress, intonation, duration, pitch and energy of each speech unit, for example each character or syllable. Such non-text information is very helpful for understanding what the speaker really means. Because the synthesized speech relies only on the translated text, much of the information behind the text is lost.
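By way of illustration only, the conventional three-stage cascade described above, and the loss of non-text information it entails, may be sketched as follows. This is a toy model and not an implementation of any actual system: the `Utterance` type, the `recognize`, `translate`, `synthesize` and `cascade` functions, the `TOY_LEXICON` table, and the prosody keys are all hypothetical names introduced here for the sketch.

```python
from dataclasses import dataclass, field

# Hypothetical toy lexicon; a real system would use a machine translation engine.
TOY_LEXICON = {"yes": "oui", "really": "vraiment"}


@dataclass
class Utterance:
    """Hypothetical stand-in for a speech signal."""
    words: tuple                                  # textual content
    prosody: dict = field(default_factory=dict)   # non-text information


def recognize(speech: Utterance) -> str:
    """Stage 1: speech to source text. Only the words survive."""
    return " ".join(speech.words)


def translate(source_text: str) -> str:
    """Stage 2: source text to target text (word-by-word toy lookup)."""
    return " ".join(TOY_LEXICON.get(w, w) for w in source_text.split())


def synthesize(target_text: str) -> Utterance:
    """Stage 3: target text to speech, using default (empty) prosody."""
    return Utterance(words=tuple(target_text.split()), prosody={})


def cascade(speech: Utterance) -> Utterance:
    """Conventional pipeline: each stage sees only the text of the previous one."""
    return synthesize(translate(recognize(speech)))


# Two source utterances with the same words but different non-text information.
neutral = Utterance(("yes", "really"))
emphatic = Utterance(("yes", "really"),
                     {"stress": [0.2, 0.9], "laughter_after": True})

# The outputs are indistinguishable: stress and laughter never reach stage 3.
assert cascade(neutral) == cascade(emphatic)
```

In this sketch the prosody is discarded at the first stage, so no later stage can restore it. This is the shortcoming of the related art noted above.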