Conventionally, as a speech synthesis system capable of synthesizing speech and changing the voice characteristic of the synthesized sound, a system has been proposed that transforms the voice characteristic of a speech element sequence selected by an element selection unit so as to match an inputted voice characteristic (for example, Patent Reference 1).
FIG. 9 is a configuration diagram of a conventional voice characteristics variable speech synthesis device described in Patent Reference 1. The conventional voice characteristics variable speech synthesis device includes a text input unit 1, a voice characteristics transformation parameter input unit 2, an element storage unit 3, an element selection unit 4, a voice characteristics transformation unit 5, and a waveform synthesis unit 6.
The text input unit 1 is a processing unit that accepts, from outside, phoneme information indicating the content of the words to be synthesized and prosody information indicating the accent and intonation of the entire utterance, and outputs both to the element selection unit 4.
The voice characteristics transformation parameter input unit 2 is a processing unit that accepts the input of a transformation parameter required for transformation to the voice characteristic desired by the editor. The element storage unit 3 is a storage unit that stores speech elements for a variety of speech. The element selection unit 4 is a processing unit that selects, from the element storage unit 3, the speech element sequence that best matches the phoneme information and the prosody information outputted from the text input unit 1.
The voice characteristics transformation unit 5 is a processing unit that transforms the speech element sequence selected by the element selection unit 4 into the voice characteristic desired by the editor, using the transformation parameter inputted by the voice characteristics transformation parameter input unit 2. The waveform synthesis unit 6 is a processing unit that synthesizes a speech waveform from the speech element sequence whose voice characteristic has been transformed by the voice characteristics transformation unit 5.
Thus, in the conventional voice characteristics variable speech synthesis device, the voice characteristics transformation unit 5 transforms the speech element sequence selected by the element selection unit 4, using the transformation parameter inputted by the voice characteristics transformation parameter input unit 2, so that a synthesized sound having the voice characteristic desired by the editor is obtained.
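For illustration only, the flow of data through the units 3 to 6 described above can be sketched as follows. This is a minimal toy sketch and not the implementation of Patent Reference 1: the names `SpeechElement`, `select_elements`, `transform_voice`, `synthesize_waveform`, and `synthesize` are hypothetical, the prosody information is reduced to one target pitch per phoneme, and the transformation parameter is assumed to be a single pitch scale factor, none of which is specified in the reference.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SpeechElement:
    phoneme: str
    pitch_hz: float       # toy stand-in for the prosody of the stored element
    samples: List[float]  # toy waveform samples


def select_elements(phonemes: List[str],
                    target_pitches: List[float],
                    storage: Dict[str, List[SpeechElement]]) -> List[SpeechElement]:
    """Toy element selection unit 4: for each phoneme, pick the stored
    element whose pitch is closest to the requested target pitch."""
    selected = []
    for phoneme, target in zip(phonemes, target_pitches):
        candidates = storage[phoneme]
        selected.append(min(candidates, key=lambda e: abs(e.pitch_hz - target)))
    return selected


def transform_voice(elements: List[SpeechElement],
                    pitch_scale: float) -> List[SpeechElement]:
    """Toy voice characteristics transformation unit 5: apply one assumed
    parameter (a pitch scale factor) to every selected element.
    The samples are left untouched; only the pitch label changes."""
    return [SpeechElement(e.phoneme, e.pitch_hz * pitch_scale, e.samples)
            for e in elements]


def synthesize_waveform(elements: List[SpeechElement]) -> List[float]:
    """Toy waveform synthesis unit 6: concatenate the element samples."""
    wave: List[float] = []
    for e in elements:
        wave.extend(e.samples)
    return wave


def synthesize(phonemes: List[str],
               target_pitches: List[float],
               pitch_scale: float,
               storage: Dict[str, List[SpeechElement]]
               ) -> Tuple[List[SpeechElement], List[float]]:
    """Selection, then transformation, then waveform synthesis."""
    selected = select_elements(phonemes, target_pitches, storage)
    transformed = transform_voice(selected, pitch_scale)
    return transformed, synthesize_waveform(transformed)
```

In this sketch, the `storage` dictionary plays the role of the element storage unit 3, and the arguments of `synthesize` stand in for the outputs of the text input unit 1 and the voice characteristics transformation parameter input unit 2. The point it makes is structural: the transformation is applied only after the element sequence has already been selected.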
In addition, a method is known that performs voice characteristics variable speech synthesis by preparing a plurality of speech element databases, one for each voice characteristic, and selectively using the speech element database that best matches the inputted voice characteristic.
Patent Reference 1: Japanese Laid-Open Patent Application No. 2003-66982 (pp. 1-10, FIG. 1)