1. Field of the Invention
The present invention relates to speech splicing and, more particularly, to a system and method for phrase splicing and variable substitution of speech using a synthesizing device.
2. Description of the Related Art
Speech recognition systems are used in many areas today to transcribe speech into text. The success of this technology in simplifying man-machine interaction is stimulating the use of this technology into a plurality of useful applications, such as transcribing dictation, voicemail, home banking, directory assistance, etc. In particularly useful applications, it is often advantageous to provide synthetic speech generation as well.
Synthetic speech generation is typically performed by utterance playback or full text-to-speech (TTS) synthesis. Recorded utterances provide high speech quality and are typically best suited for applications where the number of sentences to be produced is very small and never changes. However, there are limits to the number of utterances which can be recorded. Expanding the range of recorded utterance systems by playing phrase and word recordings to construct sentences is possible, but does not produce fluent speech and can suffer from serious prosodic problems.
Text-to-speech systems may be used to generate arbitrary speech. They are desirable for some applications, for example where the text to be spoken cannot be known in advance, or where there is simply too much text to prerecord everything. However, speech generated by TTS systems tends to be both less intelligible and less natural than human speech.
Therefore, a need exists for a speech synthesis generation system which provides all the advantages of recorded utterances and text-to-speech synthesis. A further need exists for a system and method capable of blending pre-recorded speech with synthetic speech.