This invention relates to a method of extracting features from an input signal by linear predictive analysis.
Feature extraction methods are used to analyze acoustic signals for purposes ranging from speech recognition to the diagnosis of malfunctioning motors and engines. The acoustic signal is converted to an electrical input signal that is sampled, digitized, and divided into fixed-length frames of short duration. Each frame thus consists of N sample values $x_1, x_2, \ldots, x_N$. The sample values are mathematically analyzed to extract numerical quantities, called features, which characterize the frame. The features are provided as raw material to a higher-level process. In a speech recognition or engine diagnosis system, for example, the features may be compared with a standard library of features to identify phonemes of speech, or sounds symptomatic of specific engine problems.
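The framing step described above can be sketched as follows. This is a minimal illustration, not the procedure of any particular system: the function name `split_into_frames`, the 8 kHz sample rate, the 440 Hz test tone, and the frame length of 256 are all assumptions chosen for the example, and trailing samples that do not fill a frame are simply dropped.

```python
import math


def split_into_frames(samples, frame_length):
    """Divide a sampled signal into consecutive fixed-length frames.

    Trailing samples that do not fill a whole frame are dropped
    (an assumption of this sketch; other policies are possible).
    """
    n_frames = len(samples) // frame_length
    return [
        samples[i * frame_length:(i + 1) * frame_length]
        for i in range(n_frames)
    ]


# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * n / sample_rate)
          for n in range(sample_rate)]

# N = 256 samples per frame.
frames = split_into_frames(signal, 256)
print(len(frames), len(frames[0]))  # 31 256
```

Each element of `frames` then plays the role of one frame of N sample values to be analyzed.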
One group of mathematical techniques used for feature extraction can be represented by linear predictive analysis (LPA). Linear predictive analysis uses a model which assumes that each sample value can be predicted from the preceding p sample values by an equation of the form:

$$x_n = -\left(a_1 x_{n-1} + a_2 x_{n-2} + \cdots + a_p x_{n-p}\right)$$
The integer p is referred to as the order of the model. The analysis consists in finding the set of coefficients $a_1, a_2, \ldots, a_p$ that gives the best predictions over the entire frame. These coefficients are output as features of the frame. Other techniques in this general group include PARCOR (partial correlation) analysis, zero-crossing count analysis, energy analysis, and autocorrelation function analysis.
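One common way to find such coefficients is the autocorrelation method solved by the Levinson-Durbin recursion. The text does not say which solver is intended, so the sketch below is an assumption, and the names `autocorrelation` and `levinson_durbin` are hypothetical. The sign convention follows the prediction equation above, in which the prediction is the negative of the weighted sum of past samples. The reflection coefficient computed at each step corresponds, up to sign convention, to a PARCOR coefficient.

```python
import random


def autocorrelation(frame, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve for a_1..a_p from autocorrelations r[0..p].

    Returns (coeffs, errors): coeffs[k-1] is a_k for the final order,
    and errors[m] is the residual power of the order-m model.
    Assumes r[0] > 0 and a signal that is not perfectly predictable.
    """
    a = [1.0] + [0.0] * order
    error = r[0]
    errors = [error]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / error  # reflection coefficient at step m
        new_a = a[:]
        for k in range(1, m):
            new_a[k] = a[k] + k_m * a[m - k]
        new_a[m] = k_m
        a = new_a
        error *= 1.0 - k_m * k_m
        errors.append(error)
    return a[1:], errors


# Hypothetical test frame: x_n = 0.5 * x_{n-1} + noise, so a_1 should
# come out near -0.5 under the sign convention used here.
random.seed(0)
frame = [0.0]
for _ in range(1999):
    frame.append(0.5 * frame[-1] + random.gauss(0.0, 1.0))

coeffs, errors = levinson_durbin(autocorrelation(frame, 2), 2)
print(coeffs, errors)
```

The list `errors` holds the residual power for each order up to p, which is the quantity that order-selection criteria operate on.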
Another general group of techniques employs the order p of the above model as a feature. Models of increasing order are tested until a model that satisfies some criterion is found, and its order p is output as a feature of the frame. The models are generally tested using the maximum-likelihood estimator $\hat{\sigma}_p^2$ of their mean square residual error $\sigma_p^2$, also called the residual power or error power. Specific testing criteria that have been proposed include:
(1) Final prediction error (FPE): $\mathrm{FPE}(p) = \hat{\sigma}_p^2 \,(N+p+1)/(N-p-1)$
(2) Akaike information criterion (AIC): $\mathrm{AIC}(p) = \ln(\hat{\sigma}_p^2) + 2(p+1)/N$
(3) Criterion autoregressive transfer function (CAT) ##EQU1## where $\sigma_j^2 = [N/(N-j)]\,\hat{\sigma}_p^2$.
The order p found as a feature is related to the number of peaks in the power spectrum of the input signal.
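The FPE and AIC criteria listed above can be evaluated directly from the residual power of each candidate order. The sketch below assumes that the selected order is the one that minimizes the criterion, which is the usual convention but is not stated in the text. The residual powers in `sigma2` are made-up illustrative values, and `select_order` is a hypothetical helper name. CAT is omitted because its formula is not reproduced here.

```python
import math


def fpe(sigma2_p, n, p):
    """Final prediction error for an order-p model on an N-sample frame."""
    return sigma2_p * (n + p + 1) / (n - p - 1)


def aic(sigma2_p, n, p):
    """Akaike information criterion in the normalized form given above."""
    return math.log(sigma2_p) + 2 * (p + 1) / n


def select_order(sigma2, n, criterion):
    """Return (best_order, scores), where sigma2[p] is the estimated
    residual power of the order-p model and the best order is assumed
    to be the one with the smallest criterion value."""
    scores = [criterion(s, n, p) for p, s in enumerate(sigma2)]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Hypothetical residual powers for orders 0..5 on a frame of N = 128.
sigma2 = [1.00, 0.40, 0.20, 0.198, 0.197, 0.1965]
n = 128

order_fpe, _ = select_order(sigma2, n, fpe)
order_aic, _ = select_order(sigma2, n, aic)
print(order_fpe, order_aic)  # 2 2
```

In this example the residual power barely improves beyond order 2, so the penalty terms of both criteria outweigh the gain and order 2 is selected.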
A problem with all of these methods is that they do not provide useful feature information about short-duration input signals. The methods in the first group that use linear predictive coefficients, PARCOR coefficients, or the autocorrelation function require a stationary input signal: a signal long enough to exhibit constant properties over time. Short input signal frames are regarded as nonstationary random data, and correct features are not derived. The zero-crossing count and energy methods have large statistical variances and do not yield satisfactory features.
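For reference, the zero-crossing count and energy features mentioned above are simple per-frame statistics, which can be sketched as follows. The function names and sample values are hypothetical. Treating a zero sample as non-negative and defining energy as the unnormalized sum of squares are assumptions of this sketch.

```python
def zero_crossing_count(frame):
    """Count sign changes between consecutive samples.

    A sample equal to zero is treated as non-negative (an assumption).
    """
    return sum(
        1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
    )


def frame_energy(frame):
    """Sum of squared sample values over the frame (unnormalized)."""
    return sum(x * x for x in frame)


# Hypothetical short frame of eight samples.
frame = [0.5, 1.0, -0.5, -1.0, 0.25, 0.75, -0.25, 0.5]

print(zero_crossing_count(frame))  # 4
print(frame_energy(frame))         # 3.4375
```

With so few samples per frame, a single extra or missing sign change alters the count substantially, which gives some intuition for the large statistical variance noted above.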
In the second group of methods, there is a tendency for the order p to become larger than necessary, reflecting spurious peaks. The reason is that the prior-art methods are based on logarithm-average maximum-likelihood estimation techniques which assume the existence of a precise value to which the estimate can converge. In actual input signals there is no assurance that such a value exists. In the AIC formula, for example, the accuracy of the estimate is severely degraded because the second term, which is proportional to the order, is too large in relation to the first term, which corresponds to the likelihood.