1. Field of the Invention
The present invention relates to improvements in an apparatus for synthesizing speech from text by means of a regular synthetic method, and more particularly, to improvements in a speech synthesis apparatus in which the accent of text data is controlled by an accent control method.
2. Description of the Prior Art
The automatic conversion of text to synthetic speech is commonly known as text-to-speech conversion or text-to-speech synthesis. A number of different techniques have been developed to make speech synthesis apparatus practical on a commercial basis. FIG. 5 shows a typical speech synthesis apparatus in which speech is synthesized by a regular synthetic method, for example by using a connection rule for morae or a rule for phonemes. The speech synthesis apparatus includes an accent control section 13, in which a phrase pattern calculating section 13a is arranged to calculate a phrase component (which indicates the pitch of the voice over the span between two pauses) according to the number of morae contained in the text, and an accent pattern calculating section 13b is arranged to calculate an accent component (which indicates the pitch of the sound of each word). The phrase component and the accent component are added to each other in a speech synthesizing section 14, and an accent control pattern is calculated as shown in FIG. 6. In general, the phrase component changes continuously from a high pitch to a low pitch owing to the lowering of the pressure under the glottis. The accent component is interpolated either by assigning one pitch target value to each analysis element and linearly interpolating between the target values, or by assigning three pitch target values to each analysis element and linearly interpolating among them.
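The addition of the two components can be illustrated by the following minimal sketch. It is not taken from the apparatus of FIG. 5; the function names, the strictly linear decline of the phrase component, the number of samples per segment, and the pitch values are all assumptions made only for illustration, with one pitch target value per analysis element.

```python
from typing import List


def interpolate_targets(targets: List[float], points_per_segment: int) -> List[float]:
    """Accent component: linear interpolation between consecutive pitch targets.

    `targets` holds one (hypothetical) pitch target value per analysis element.
    """
    contour: List[float] = []
    for a, b in zip(targets, targets[1:]):
        for k in range(points_per_segment):
            contour.append(a + (b - a) * k / points_per_segment)
    contour.append(targets[-1])
    return contour


def phrase_component(n_samples: int, start: float, end: float) -> List[float]:
    """Phrase component: assumed here to fall linearly from `start` to `end`."""
    if n_samples == 1:
        return [start]
    step = (end - start) / (n_samples - 1)
    return [start + step * i for i in range(n_samples)]


def accent_control_pattern(targets: List[float], points_per_segment: int,
                           start: float, end: float) -> List[float]:
    """Accent control pattern: sample-by-sample sum of phrase and accent components."""
    accent = interpolate_targets(targets, points_per_segment)
    phrase = phrase_component(len(accent), start, end)
    return [p + a for p, a in zip(phrase, accent)]


if __name__ == "__main__":
    # Four morae with illustrative accent targets (low, high, high, low).
    print(accent_control_pattern([0.0, 0.4, 0.4, 0.0], 4, start=1.0, end=0.6))
```

Because every segment between two targets is a straight line, the resulting contour changes at a uniform rate within each segment.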
With the above-mentioned accent control method in the speech synthesizer, an accent is applied to the synthesized speech by calculating the phrase component and the accent component. The accent component is determined by assigning a plurality of target pitches to each mora and linearly interpolating among them.
However, since the pitch of the accent component is determined simply according to the height of the accent, the synthesized speech sounds mechanical owing to the uniform change in pitch. Further, since the connections between syllables and between clauses are not taken into consideration, the change in accent height, and the transition from one mora to the next, tend not to be smooth. Accordingly, the synthesized speech generated by this method sounds unnatural.
In order to solve the above-mentioned problem, another accent control method has been proposed, in which the changing coefficient of the pitch within a mora is determined by calculating a linear function of the accent environment, more specifically, of the height of the accent, the position in the phrase, whether or not the phoneme is continuative, the accent heights of the preceding and following morae, the positional relationship with the clause, and the target values preceding and following the mora.
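A linear function calculation of this kind might be sketched as follows. The field names, the numeric encoding of each environment item, the weights, and the bias are hypothetical and are given only to illustrate the form of the calculation; they are not values disclosed for the proposed method.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AccentEnvironment:
    """Hypothetical numeric encoding of the accent environment of one mora."""
    accent_height: float
    position_in_phrase: float
    is_continuative: bool
    prev_accent_height: float
    next_accent_height: float
    clause_position: float
    prev_target: float
    next_target: float


# Illustrative controlling variables only; one weight per environment item.
WEIGHTS: Dict[str, float] = {
    "accent_height": 0.50,
    "position_in_phrase": -0.10,
    "is_continuative": 0.05,
    "prev_accent_height": 0.20,
    "next_accent_height": 0.20,
    "clause_position": -0.05,
    "prev_target": 0.30,
    "next_target": 0.30,
}
BIAS = 0.0


def pitch_change_coefficient(env: AccentEnvironment,
                             weights: Dict[str, float] = WEIGHTS,
                             bias: float = BIAS) -> float:
    """Changing coefficient as a weighted sum of the environment items plus a bias."""
    total = bias
    for name, weight in weights.items():
        total += weight * float(getattr(env, name))
    return total


if __name__ == "__main__":
    env = AccentEnvironment(1.0, 0.25, False, 0.0, 1.0, 0.5, 0.8, 1.2)
    print(pitch_change_coefficient(env))
```

Even in this small sketch, eight controlling variables must be chosen for a single coefficient, and the effect of changing any one of them on the resulting accent pattern is not apparent from the variable itself.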
With such an accent control method, improved synthesized speech is provided. However, since the changing coefficient includes variables for controlling it, it is difficult to understand how the accent pattern has been changed, either during maintenance or when a variable is defined. This difficulty grows as the number of accent patterns increases. Furthermore, the calculating operations become more complicated, since both the function for generating the accent pattern and the definition of the variables become complex.