This invention relates to a speech analysis method applicable to a speech analysis/synthesis system employed for producing a synthetic sound.
The human auditory sense acts as a kind of spectrum analyzer: if two or more sounds have the same power spectrum, they are perceived as the same sound. This characteristic is exploited when producing sound by a speech analysis/synthesis method.
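The property stated above can be illustrated numerically. The following is a minimal sketch, not part of the described method: it assumes NumPy, and the helper names `power_spectrum` and `harmonic_signal`, the frame length and the harmonic spacing are all chosen here purely for illustration. It builds two signals from the same harmonics with different phases and checks that their power spectra agree although their waveforms differ.

```python
import numpy as np

def power_spectrum(x):
    """One-sided power spectrum |X(f)|^2 of a real signal."""
    return np.abs(np.fft.rfft(x)) ** 2

def harmonic_signal(phases, n=1024, f0_bin=8):
    """Sum of unit-amplitude harmonics of FFT bin f0_bin with the given phases.

    n and f0_bin are illustrative values; each harmonic falls exactly on an
    FFT bin so that no spectral leakage occurs.
    """
    t = np.arange(n)
    return sum(
        np.cos(2 * np.pi * f0_bin * (k + 1) * t / n + p)
        for k, p in enumerate(phases)
    )

rng = np.random.default_rng(0)
a = harmonic_signal(np.zeros(10))                   # all harmonics in phase
b = harmonic_signal(rng.uniform(0, 2 * np.pi, 10))  # same harmonics, random phases

# The waveforms differ, but the power spectra coincide.
same_spectrum = np.allclose(power_spectrum(a), power_spectrum(b), atol=1e-6)
same_waveform = np.allclose(a, b)
```

Only the phases differ between `a` and `b`, so the sketch isolates the information that the power spectrum discards.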
To produce synthetic speech, a speech analyzer analyzes the input signal to extract or detect pitch data, voiced/unvoiced decision data, amplitude data and the like, and a speech synthesizer artificially produces the sound based on these data. Speech synthesis systems are classified, according to the method of synthesis, into speech editing systems, parametric synthesis systems and rule synthesis systems.
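The analysis step mentioned above (pitch data, voiced/unvoiced decision data and amplitude data) might be sketched for a single frame as follows. This is an assumption-laden illustration using a generic autocorrelation pitch estimator, not the analyzer of the source: the function name `analyze_frame`, the search range `fmin`/`fmax` and the threshold `vuv_threshold` are hypothetical values chosen here.

```python
import numpy as np

def analyze_frame(frame, fs, fmin=60.0, fmax=400.0, vuv_threshold=0.3):
    """Return (pitch_hz, voiced, rms_amplitude) for one frame.

    fmin, fmax and vuv_threshold are illustrative assumptions, not values
    taken from the source text.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()                  # remove DC offset
    rms = float(np.sqrt(np.mean(frame ** 2)))     # amplitude data
    if rms == 0.0:
        return 0.0, False, 0.0                    # silent frame

    # Autocorrelation for non-negative lags, normalized so that ac[0] == 1.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    ac = ac / ac[0]

    # Search only lags that correspond to plausible pitch periods.
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(frame) - 1)
    if lag_max <= lag_min:
        return 0.0, False, rms                    # frame too short to decide

    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    voiced = bool(ac[lag] > vuv_threshold)        # voiced/unvoiced decision
    pitch = fs / lag if voiced else 0.0           # pitch data
    return pitch, voiced, rms
```

For example, a 50 ms frame of a 200 Hz sine sampled at 8 kHz should be reported as voiced with a pitch near 200 Hz, while a frame of white noise should normally fall below the threshold and be reported as unvoiced.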
With the speech editing system, the waveform of human speech is stored or recorded, either directly or after waveform encoding, with words or paragraphs as units, and is read out and edited by suitable interconnection to synthesize speech whenever the necessity arises.
With the parametric synthesis system, the waveform of human speech is analyzed in advance on the basis of a speech synthesis model, with words or paragraphs as units as in the case of the speech editing system, and is stored in the form of a time sequence of parameters. Whenever the necessity arises, a speech synthesizer is driven by the interconnected time sequences of parameters to synthesize speech.
Finally, with the rule synthesis system, a series of discrete symbols, such as letters or phonetic symbols, is converted into a continuous speech signal. During the process of conversion, generally applicable properties of speech and artificial properties of speech synthesis are utilized as the rules of synthesis.
The synthesis systems recited above simulate the vocal tract in some form or other and produce synthetic sound using signals having substantially the same characteristics as those of the sound source wave.
Up to now, a residual-driven analysis/synthesis system has frequently been utilized to achieve high-quality control in speech analysis/synthesis. However, the residual-driven analysis/synthesis system does not satisfactorily separate sound source information from vocal tract information. It is therefore subject to spectral distortion when the pitch is changed, which degrades the synthetic sound.
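For orientation, a residual-driven analysis/synthesis loop can be sketched as below, on the assumption that it is based on linear prediction, which the text above does not state explicitly. The function names `lpc`, `inverse_filter` and `synthesize`, the predictor order and the test signal are hypothetical choices made here. The sketch only shows that driving the synthesis filter with the unmodified residual restores the input; it does not reproduce the distortion that arises when the pitch is changed.

```python
import numpy as np

def lpc(x, order):
    """Autocorrelation-method linear prediction via the Levinson-Durbin recursion.

    Returns coefficients a with a[0] == 1 for A(z) = 1 + sum_j a[j] z^-j.
    """
    n = len(x)
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                       # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                   # remaining prediction error energy
    return a

def inverse_filter(x, a):
    """Analysis: residual e[n] = sum_j a[j] * x[n - j] (FIR filter A(z))."""
    return np.convolve(a, x)[: len(x)]

def synthesize(residual, a):
    """Synthesis: drive the all-pole filter 1 / A(z) with the residual."""
    p = len(a) - 1
    y = np.zeros(len(residual))
    for n in range(len(residual)):
        acc = residual[n]
        for j in range(1, min(p, n) + 1):
            acc -= a[j] * y[n - j]
        y[n] = acc
    return y

# Illustrative input: two sinusoids plus a little noise (assumed test signal).
fs = 8000
t = np.arange(400)
rng = np.random.default_rng(0)
x = (np.sin(2 * np.pi * 200 * t / fs)
     + 0.5 * np.sin(2 * np.pi * 600 * t / fs)
     + 0.01 * rng.standard_normal(len(t)))

a = lpc(x, order=8)          # spectral envelope parameters
e = inverse_filter(x, a)     # residual used as the driving signal
y = synthesize(e, a)         # resynthesis from the unmodified residual
```

In this sketch the coefficients `a` stand in for the envelope information and the residual `e` for the source information. Any pitch information that remains mixed between the two is what makes later pitch modification prone to the distortion described above.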