The present invention relates to speech processing and more particularly to a speech processing system using phonetic decoding and concatenative speech.
Developments in information technology (IT) now allow people to hold voice conversations with each other on a global basis. Voice conversations between people in different geographies, even when nominally conducted in a common language (e.g., English), are complicated by the accents of people whose native language is different from the common language. Written communication is generally unaffected by these variations, but once people need to speak directly to each other, for example in call-center/helpdesk situations or conference calls, the difficulty in understanding each other's variants of the common language can make communication difficult and frustrating.
Elocution lessons are hardly practicable for the whole population and would be extremely expensive.
Feeding the text output of an automatic speech recognizer (ASR) into a text-to-speech (TTS) engine is limited by the accuracy and vocabulary of the ASR and by the inability of the TTS system to reflect the speaking patterns of the subject.