Speech signal processing is well known in the art and is often utilized to compress an incoming speech signal for applications such as storage and transmission. The speech signal processing typically involves dividing the incoming speech signal into frames and then analyzing each frame to determine its representative components. The representative components are then stored or transmitted.
A frame analyzer is often used to determine the short-term and long-term characteristics of the speech signal. The frame analyzer can also determine one or both of the short- and long-term components, or contributions, of the speech signal. As an example, linear prediction coefficient (LPC) analysis provides the short-term characteristics and contribution, and pitch analysis and prediction provides the long-term characteristics as well as the long-term contribution.
Typically, one, both or neither of the long- and short-term predictor contributions are subtracted from the input frame, leaving a target vector whose shape has to be characterized. Such a characterization can be produced with multi-pulse analysis (MPA), which is described in detail in section 6.4.2 of the book Digital Speech Processing, Synthesis and Recognition by Sadaoki Furui, Marcel Dekker, Inc., New York, N.Y. 1989, incorporated herein by reference.
Conventionally, MPA operates on a target vector that is formed of a multiplicity of samples. The target vector is modeled by a plurality of pulses of equal amplitude that vary in location and in sign (positive or negative). To select each pulse, a trial pulse is placed at each sample location in turn and the effect of the pulse, obtained by passing the pulse through a filter defined by the LPC coefficients, is determined. The pulse whose filter output most closely matches the target vector is selected and its effect is removed from the target vector, thereby generating a new target vector. The process continues until a predetermined number of pulses have been found. For storage or transmission purposes, the result of the MPA is a collection of pulse locations, pulse signs (positive or negative), and a quantized value of the pulse amplitude.
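By way of illustration only, the conventional pulse search described above may be sketched as follows. The sketch is not taken from the cited reference. The function names, the convention A(z) = 1 + a_1 z^-1 + ... for the LPC filter, the squared-error matching criterion, and the supply of the common pulse amplitude as an input are all assumptions made for the example.

```python
def synthesis_impulse_response(lpc, length):
    """Impulse response of the all-pole filter 1/A(z).

    Assumed convention: A(z) = 1 + lpc[0]*z^-1 + lpc[1]*z^-2 + ...
    """
    h = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * h[n - k]
        h.append(acc)
    return h


def pulse_response(h, pos, length):
    """Filter output for a unit pulse placed at sample `pos` (zero filter state)."""
    return [h[n - pos] if n >= pos else 0.0 for n in range(length)]


def multi_pulse_search(target, lpc, num_pulses, amplitude):
    """Greedy search: select one pulse at a time, then update the target.

    `amplitude` is the single amplitude shared by all pulses; how it is
    chosen and quantized is outside this sketch (assumption).
    Returns the selected (location, sign) pairs and the final residual target.
    """
    n = len(target)
    h = synthesis_impulse_response(lpc, n)
    residual = list(target)
    pulses = []
    for _ in range(num_pulses):
        best = None
        # Try a pulse of each sign at every sample location.
        for pos in range(n):
            resp = pulse_response(h, pos, n)
            for sign in (1, -1):
                err = sum((r - sign * amplitude * y) ** 2
                          for r, y in zip(residual, resp))
                if best is None or err < best[0]:
                    best = (err, pos, sign, resp)
        _, pos, sign, resp = best
        # Remove the selected pulse's effect, generating the new target vector.
        residual = [r - sign * amplitude * y for r, y in zip(residual, resp)]
        pulses.append((pos, sign))
    return pulses, residual
```

Because each pulse is selected with the earlier selections held fixed, the search is greedy; it does not jointly optimize the complete pulse sequence.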
The MPA output typically specifies the resulting pulse locations, but not the order in which they were chosen. It also specifies only one gain parameter, so the decoder must reconstruct the pulse sequence using equal amplitudes for all the pulses. In addition, the MPA itself is sub-optimal, from a maximum-likelihood standpoint, with respect to determining the best possible pulse sequence to match the target.
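The decoder-side reconstruction implied by such an output may be sketched, again by way of illustration only, as follows. The function name, the argument layout, the zero initial filter state, and the LPC convention A(z) = 1 + a_1 z^-1 + ... are assumptions made for the example and are not drawn from the cited reference.

```python
def decode_multi_pulse(locations, signs, gain, lpc, frame_length):
    """Rebuild the pulse sequence from locations, signs and a single gain,
    then pass it through the all-pole filter 1/A(z).

    Assumptions: A(z) = 1 + lpc[0]*z^-1 + ..., and the filter starts from
    zero state (no memory carried over from a previous frame).
    """
    excitation = [0.0] * frame_length
    # Every pulse receives the same magnitude; only location and sign differ.
    for pos, sign in zip(locations, signs):
        excitation[pos] += sign * gain

    output = []
    for n in range(frame_length):
        acc = excitation[n]
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * output[n - k]
        output.append(acc)
    return excitation, output


# The listing order of the pulses does not affect the result.
exc_a, out_a = decode_multi_pulse([2, 5], [1, -1], 1.0, [-0.5], 8)
exc_b, out_b = decode_multi_pulse([5, 2], [-1, 1], 1.0, [-0.5], 8)
assert exc_a == exc_b and out_a == out_b
```

The sketch shows the two limitations noted above: the order in which the pulses were chosen plays no part in the reconstruction, and a single gain value sets the magnitude of every pulse.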
Accordingly, there is a need for a speech processor and method that improves the performance of the MPA process and the perceptual quality of the reconstructed speech, and that overcomes the above-mentioned deficiencies of the prior art.