The present disclosure relates generally to text-to-speech (TTS) systems, and, in particular, to a system and method for selecting among TTS systems dynamically.
The quality of the output of a text-to-speech synthesis system is dependent on the particular text presented as input; some sentences synthesize well, while others are plagued by discontinuities and bad prosody. Moreover, systems using different algorithms or different settings may behave differently on a given text. One system may perform better than another system on some texts, but worse on others. Typically, a TTS system uses a particular algorithm and system, and adjusts the parameters related to that algorithm and system.