This invention relates to a pole-zero analyzer for use in deciding pole and zero parameters used collectively in approximating a spectrum of an input signal which is typically a speech signal.
It is important in speech analysis and synthesis to extract parameters from a speech signal. Depending on the circumstances, it becomes necessary to decide such parameters for a more general signal. Poles and zeros are often used as the parameters in connection with such an input signal. This is because the pole and zero parameters have clear physical meanings and are convenient for synthesis of the signal and for other applications.
For use in deciding the pole parameters, a method is described in Chapter 7 of a book written by J. D. Markel and A. H. Gray, Jr., under the title of "Linear Prediction of Speech" and published in 1976 by Springer-Verlag. According to Markel et al, the pole parameters are extracted from a speech signal by solving by approximation, such as the Newton-Raphson approximation, a higher-order algebraic equation in which coefficients are given by linear predictive coding (LPC) of the speech signal.
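The pole extraction outlined above can be sketched as follows. This is a minimal illustration and not the procedure of the Markel et al book: it assumes the autocorrelation method with a Levinson-Durbin recursion for the LPC coefficients, and it substitutes NumPy's general polynomial root finder (`np.roots`) for the Newton-Raphson approximation. The function names and the sampling frequency argument `fs` are hypothetical.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Autocorrelation sequence r[0..max_lag] of one signal frame."""
    x = np.asarray(x, dtype=float)
    return np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(max_lag + 1)])

def levinson_durbin(r, order):
    """LPC coefficients a[0..order], a[0] = 1, for A(z) = sum_i a[i] z^-i."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient from the current prediction error.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a

def pole_parameters(a, fs):
    """Frequency and bandwidth (Hz) of each complex pole of 1/A(z)."""
    roots = np.roots(a)                 # roots of a[0] z^n + ... + a[n]
    roots = roots[np.imag(roots) > 0]   # one root per conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * fs / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bandwidths[idx]
```

For a frame `x` sampled at 8 kHz, `pole_parameters(levinson_durbin(autocorrelation(x, 10), 10), 8000.0)` would give candidate pole frequencies and bandwidths. Finding the roots is the costly step, and its cost grows with the order of the polynomial.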
This method gives an excellent result for formants of human voice. A great amount of calculation is, however, necessary in solving the algebraic equation by approximation. Furthermore, it is difficult to stably decide the frequency and the bandwidth of each pole parameter. In addition, no zero parameters are obtained. The pole parameters must therefore be calculated to a high order depending on the shape of the spectrum of the input signal. This results in an increased amount of calculation.
Another method is also for use in deciding the pole parameters for a speech signal and is disclosed in a paper released Oct. 26, 1981, by Katsunobu Fushikida under the title of "A Focusing Formant Extraction Method Using Autocorrelation Domain Inverse Filtering" in Japanese together with an abstract in English as "Nippon Onsei Gakukai Kenkyukai Siryo Bango S81-41" (Paper No. S81-41 of a Study Group of the Acoustical Society of Japan). According to the Fushikida paper, a pole parameter table is preliminarily formed to provide candidate pole parameters. Focussing is carried out towards each optimum pole parameter in a predetermined number of stages. In each stage, a preselected number of candidate pole parameters are selected as selected pole parameters. One of the selected pole parameters is used at first as a rough approximation of the optimum pole parameter. In the last stage, a close approximation gives the optimum pole parameter.
The method of the Fushikida paper is excellent in stably deciding each optimum pole parameter. A great deal of calculation is, however, necessary in carrying out the focussing. Moreover, only the pole parameters are obtained.
A method of deciding the pole parameters as well as the zero parameters is revealed in an article contributed by Clifford T. Mullis and Richard A. Roberts to IEEE Transactions on Acoustics, Speech, and Signal Processing, Volume ASSP-24, No. 3 (Jun. 1976), pages 226 through 238, under the title of "The Use of Second-Order Information in the Approximation of Discrete-Time Linear Systems." According to the Mullis et al article, the spectrum envelope of a speech signal is approximated by a pole-zero model or system which has a transfer function represented by: ##EQU1## where a.sub.0 is equal to unity. The pole-zero model comprises pole circuits of up to the (n-1)-th order and zero circuits of up to the (m-1)-th order to produce a model output signal.
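As an illustration of such a pole-zero model, the following sketch evaluates the frequency response of a rational transfer function whose numerator coefficients are q.sub.k and whose denominator coefficients are a.sub.i with a.sub.0 equal to unity. The usual form in negative powers of z is an assumption here, since the exact index ranges of the equation are not reproduced in this text. The function name is hypothetical.

```python
import numpy as np

def pole_zero_response(q, a, omega):
    """Evaluate Q(e^{jw}) / A(e^{jw}) at angular frequencies omega (radians/sample).

    Assumed form: Q(z) = sum_k q[k] z^-k and A(z) = sum_i a[i] z^-i, a[0] = 1.
    """
    q = np.asarray(q, dtype=float)
    a = np.asarray(a, dtype=float)
    omega = np.atleast_1d(np.asarray(omega, dtype=float))
    # Each row holds the powers e^{-jwk} for one frequency.
    zq = np.exp(-1j * np.outer(omega, np.arange(len(q))))
    za = np.exp(-1j * np.outer(omega, np.arange(len(a))))
    return (zq @ q) / (za @ a)
```

Taking the magnitude of the returned values over a grid of frequencies between 0 and pi gives the spectrum envelope that the model would produce for a given set of coefficients.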
The method of the Mullis et al article is based on the fact that the model output signal best approximates the input signal when coefficients a.sub.i of the denominator polynomial and coefficients q.sub.k of the numerator polynomial have best values which minimize the following quadratic form: ##EQU2## where H(exp[j.omega.]) represents the spectrum envelope of the speech signal. In Equation (2), the denominator of the transfer function is combined with the spectrum envelope in a first term of the integrand. The numerator of the transfer function is used as a second term of the integrand.
When viewed in a time domain, minimization of Equation (2) is equivalent to solving a set of simultaneous equations in which coefficients a.sub.i of the denominator polynomial and coefficients q.sub.k of the numerator polynomial are the unknowns and in which the known coefficients are given by an autocorrelation sequence of the speech signal and by an impulse response related to the speech signal. The autocorrelation sequence and the impulse response are readily calculated by application of the method described in Chapter 7 of the above-referenced book of Markel et al. It is unnecessary in this event to solve a higher-order algebraic equation.
According to Mullis et al, the spectrum of the input signal is excellently approximated by the parameters of up to a relatively low order because the zero parameters are obtained as well. It is, however, necessary in deciding the pole and the zero parameters from the best values of the coefficients a.sub.i and q.sub.k to solve an n-th order algebraic equation and an m-th order algebraic equation. For a speech signal, the number of zero parameters is small. No problem therefore arises in solving the m-th order algebraic equation. In contrast, the pole parameters are necessary up to about the fifteenth order. A considerable amount of calculation is necessary in solving the n-th order algebraic equation.
On the other hand, it is known to calculate an input cepstrum related to each time window, such as the Hamming window known in the art, of an input signal. The cepstrum represents a spectrum envelope of the input signal by input cepstrum components or data of up to an order of a few scores.
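The cepstrum calculation mentioned above can be sketched as follows. This is a minimal illustration that assumes the real cepstrum, namely the inverse Fourier transform of the logarithmic magnitude spectrum of a Hamming-windowed frame. The function names, the small constant guarding the logarithm, and the truncation order are hypothetical choices.

```python
import numpy as np

def real_cepstrum(frame, n_fft=None):
    """Real cepstrum of one frame after applying a Hamming window."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    n_fft = n_fft or len(frame)
    spectrum = np.fft.rfft(windowed, n_fft)
    # The small constant avoids log(0); it is an illustrative safeguard.
    log_magnitude = np.log(np.abs(spectrum) + 1e-12)
    return np.fft.irfft(log_magnitude, n_fft)

def cepstral_envelope(cepstrum, order):
    """Log spectrum envelope from cepstrum components of up to `order`."""
    n = len(cepstrum)
    lifter = np.zeros(n)
    lifter[:order + 1] = 1.0
    if order > 0:
        lifter[n - order:] = 1.0   # keep the mirrored low-order components
    return np.fft.rfft(cepstrum * lifter).real
```

Keeping only the low-order components, for example `cepstral_envelope(real_cepstrum(frame), 30)`, discards the fine structure of the spectrum and leaves a smooth logarithmic envelope.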