1. Field of the Invention
The present invention relates to a speech synthesis apparatus, a speech synthesis method, a speech synthesis program, a portable information terminal, and a speech synthesis system that are suitable for cases in which various effects are added to, for example, speech converted from text data.
2. Description of the Related Art
One of the functions provided by a personal computer or a game machine is a function of converting text data into a speech signal and outputting the speech signal from a speaker. This function is the so-called reading-aloud function.
There are roughly two types of methods for performing text-to-speech conversion used in this reading-aloud function.
One of the two types of methods is speech synthesis by filing and editing, and the other is speech synthesis by rule.
Speech synthesis by filing and editing is a method of synthesizing a desired word, sentence, or the like by editing, for example combining, pre-recorded speech items such as words uttered by a human. Speech obtained by this method sounds natural and is close to human speech. However, since desired words, sentences, and the like are generated by combining pre-recorded speech items, it may not be possible to generate some words or sentences from the items that have been recorded. Moreover, when speech synthesis by filing and editing is applied to a case in which several fictional characters read text aloud, as many sets of speech data of different voice timbres as there are fictional characters are necessary. In particular, for a high-quality timbre, additional speech data of, for example, 600 MB per fictional character is necessary.
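The following is a minimal illustrative sketch, in Python, of the two limitations just described: a word that has no pre-recorded item cannot be generated, and the required storage grows with the number of fictional characters. The inventory `RECORDED`, the functions `synthesize_by_editing` and `storage_mb`, and the short sample lists are hypothetical names and values introduced only for illustration; they are not taken from any particular apparatus.

```python
from typing import Dict, List

# Hypothetical inventory: each pre-recorded word maps to its audio samples.
RECORDED: Dict[str, List[float]] = {
    "good": [0.1, 0.2],
    "morning": [0.3, 0.4, 0.5],
}


def synthesize_by_editing(words: List[str],
                          inventory: Dict[str, List[float]]) -> List[float]:
    """Concatenate pre-recorded items; fail if any word was never recorded."""
    missing = [w for w in words if w not in inventory]
    if missing:
        # This is the coverage limitation: no recording, no output.
        raise KeyError(f"no pre-recorded item for: {missing}")
    samples: List[float] = []
    for w in words:
        samples.extend(inventory[w])
    return samples


def storage_mb(num_characters: int, mb_per_character: int = 600) -> int:
    """Storage needed when every character requires its own speech data set."""
    return num_characters * mb_per_character
```

Under these assumptions, `storage_mb(3)` gives 1800, that is, three fictional characters with high-quality timbres would call for roughly 1.8 GB of speech data.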
In contrast, speech synthesis by rule is a method of synthesizing speech by combining elements that constitute speech, such as “phonemes” and “syllables”. The degree of freedom of speech synthesis by rule is high, since such elements can be freely combined. Moreover, since pre-recorded speech data serving as source material is not necessary, speech synthesis by rule is suitable, for example, for a speech synthesis function of an application installed on a device whose built-in memory is not sufficiently large, such as a portable information terminal. However, compared with the above-described speech synthesis by filing and editing, synthesized speech obtained by speech synthesis by rule tends to sound machine-like.
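The idea of combining speech elements according to rules can be sketched as follows. This is only an illustration under strong simplifying assumptions: each phoneme is represented by a single sine tone, and the table `PHONEME_RULES`, the constant `SAMPLE_RATE`, and the function `synthesize_by_rule` are hypothetical. An actual rule-based synthesizer models speech far more elaborately; the pure tones used here also hint at why such output tends to sound machine-like.

```python
import math
from typing import Dict, List, Tuple

SAMPLE_RATE = 8000  # samples per second (assumed value)

# Hypothetical rule table: phoneme -> (tone frequency in Hz, duration in ms).
PHONEME_RULES: Dict[str, Tuple[float, int]] = {
    "a": (220.0, 100),
    "i": (330.0, 100),
    "k": (0.0, 50),  # silence as a crude stand-in for a stop consonant
}


def synthesize_by_rule(phonemes: List[str]) -> List[float]:
    """Generate samples for any sequence of known phonemes, in any order."""
    samples: List[float] = []
    for p in phonemes:
        freq, duration_ms = PHONEME_RULES[p]
        n = SAMPLE_RATE * duration_ms // 1000  # number of samples to generate
        samples.extend(
            math.sin(2.0 * math.pi * freq * t / SAMPLE_RATE) for t in range(n)
        )
    return samples
```

Because the output is generated from a small rule table rather than from stored recordings, any sequence of known phonemes can be produced, and the memory footprint is limited to the table itself.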
In addition, for example, Japanese Unexamined Patent Application Publication No. 2001-51688 discloses an e-mail reading-aloud apparatus using speech synthesis in which speech corresponding to text of an e-mail message is synthesized using text information concerning the e-mail message, music and sound effects are added to the synthesized speech, and resulting synthesized speech is output.
Moreover, for example, Japanese Unexamined Patent Application Publication No. 2002-354111 discloses a speech-signal synthesis apparatus and the like that synthesize speech input from a microphone and background music (BGM) played back from a BGM recording unit and output a resulting speech signal from a speaker or the like.
Moreover, for example, Japanese Unexamined Patent Application Publication No. 2005-106905 discloses a speech output system and the like that convert text data included in an e-mail message or a website into speech data, convert the speech data into a speech signal, and output the speech signal from a speaker or the like.
Moreover, for example, Japanese Unexamined Patent Application Publication No. 2003-223181 discloses a text-to-speech conversion apparatus and the like that divide text data into pictographic-character data and other character data, convert the pictographic-character data into intonation control data, convert the other character data into a speech signal having intonation based on the intonation control data, and output the speech signal from a speaker or the like.
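The division of text data into pictographic-character data and other character data can be illustrated by the following sketch. It is not the method of the cited publication, whose details are not given here. The table `INTONATION_BY_PICTOGRAPH`, its pitch-scale values, and the functions `split_text` and `intonation_control` are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from a pictographic character to a pitch scale factor.
INTONATION_BY_PICTOGRAPH: Dict[str, float] = {
    "\U0001F60A": 1.2,  # smiling face: raise the pitch
    "\U0001F622": 0.8,  # crying face: lower the pitch
}


def split_text(text: str) -> Tuple[List[str], str]:
    """Divide text into pictographic characters and the remaining characters."""
    pictographs = [ch for ch in text if ch in INTONATION_BY_PICTOGRAPH]
    other = "".join(ch for ch in text if ch not in INTONATION_BY_PICTOGRAPH)
    return pictographs, other


def intonation_control(pictographs: List[str], default: float = 1.0) -> float:
    """Derive one control value; here the last pictograph found is used."""
    if not pictographs:
        return default
    return INTONATION_BY_PICTOGRAPH[pictographs[-1]]
```

In this sketch, the remaining character string would be passed to a speech synthesizer, and the control value would scale the intonation of the resulting speech signal.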
Moreover, Japanese Unexamined Patent Application Publication No. 2007-293277 discloses an RSS content management method and the like that extract text from RSS content and convert the text into speech.