Speech translation machines have been developed to facilitate communication between people who speak different languages. To translate speech in a source language into speech in a target language, the speech translation machine recognizes the speech, translates the transcription obtained by the speech recognition, and converts the translated transcription into synthesized speech in the target language. When people communicate, a speaker's utterances usually carry emotions (e.g., anger, sorrow, or joy) that reflect the situation the speaker is experiencing while speaking. The speaker's emotion can be conveyed to the listener by translating the emotion as well as the speech.
However, for smooth communication, it is not always desirable to reflect the speaker's emotion in the synthesized speech. For example, if the speaker's anger is conveyed to the listener through the synthesized speech, it may cause an emotional conflict between the speaker and the listener.
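The recognize, translate, and synthesize stages described above, together with the choice of whether to carry the speaker's emotion through to the synthesized speech, can be illustrated by the following minimal sketch. It is not the implementation of any particular machine. All names (`Recognized`, `SynthesisRequest`, `speech_translate`, `convey_emotion`, and the toy `fake_recognize` and `fake_translate` stand-ins) are hypothetical and introduced only for illustration, and the final synthesis step is represented by the request that would be handed to a synthesizer.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Recognized:
    text: str               # transcription in the source language
    emotion: Optional[str]  # e.g. "anger", "sorrow", "joy", or None


@dataclass
class SynthesisRequest:
    text: str               # translated transcription in the target language
    emotion: Optional[str]  # emotion the synthesizer is asked to render, if any


def speech_translate(
    audio: bytes,
    recognize: Callable[[bytes], Recognized],
    translate: Callable[[str], str],
    convey_emotion: bool,
) -> SynthesisRequest:
    """Run recognition and translation, then build the synthesis request."""
    recognized = recognize(audio)             # stage 1: speech recognition
    translated = translate(recognized.text)   # stage 2: text translation
    # Stage 3 input: pass the emotion along only when the caller asks for it.
    emotion = recognized.emotion if convey_emotion else None
    return SynthesisRequest(text=translated, emotion=emotion)


# Toy stand-ins (assumptions): a real system would call actual
# speech recognition and machine translation components here.
def fake_recognize(audio: bytes) -> Recognized:
    return Recognized(text="hurry up", emotion="anger")


def fake_translate(text: str) -> str:
    return {"hurry up": "date prisa"}.get(text, text)


if __name__ == "__main__":
    print(speech_translate(b"", fake_recognize, fake_translate, convey_emotion=True))
    print(speech_translate(b"", fake_recognize, fake_translate, convey_emotion=False))
```

With `convey_emotion=True` the request keeps the recognized emotion, so the listener would hear the speaker's anger. With `convey_emotion=False` the same translated text is requested without it.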