In mobile communications, packet communications (e.g., internet communications) and speech storage, speech coding apparatuses are used to compress speech information by efficient encoding. This makes effective use of transmission resources such as radio frequencies and of the capacity of storage media. Among these, systems based on CELP (Code Excited Linear Prediction) are widely used in practice at medium and low bit rates. CELP techniques are described in M. R. Schroeder and B. S. Atal: “Code-Excited Linear Prediction (CELP): High-quality Speech at Very Low Bit Rates”, Proc. ICASSP-85, 25.1.1, pages 937-940, 1985.
According to the CELP speech coding system, speech is divided into frames of a certain length (about 5 ms to 50 ms), linear prediction analysis is performed for each frame, and the prediction residual (i.e. excitation signal) from the linear prediction analysis is encoded using an adaptive code vector and a fixed code vector having the shapes of prescribed waveforms. The adaptive code vector is selected from an adaptive codebook that stores excitation vectors produced earlier. The fixed code vector is selected from a fixed codebook that stores a prescribed number of vectors of prescribed shapes. The fixed code vectors stored in the fixed codebook include random vectors and vectors produced by combining several pulses.
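The per-frame linear prediction analysis described above can be illustrated with a minimal sketch. It assumes the autocorrelation method with the Levinson-Durbin recursion, which is one common way to obtain prediction coefficients and is not stated in the text. The function names, the frame length and the prediction order are hypothetical and chosen only for illustration.

```python
import math

def split_frames(signal, frame_len):
    """Divide the signal into consecutive non-overlapping frames (assumed framing)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def autocorrelation(frame, order):
    """Autocorrelation values r[0..order] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(order + 1)]

def levinson_durbin(r, order):
    """Return a[0..order] with a[0] = 1 for the inverse filter A(z) = sum a[k] z^-k."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop refining
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a

def prediction_residual(frame, a):
    """Residual e[n] = sum_k a[k] * s[n-k]; samples before the frame are taken as zero."""
    order = len(a) - 1
    return [sum(a[k] * frame[n - k] for k in range(min(order, n) + 1))
            for n in range(len(frame))]

# Usage: 20 ms frames at an assumed 8 kHz sampling rate, prediction order 2.
signal = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(320)]
frames = split_frames(signal, 160)
coeffs = levinson_durbin(autocorrelation(frames[0], 2), 2)
residual = prediction_residual(frames[0], coeffs)
```

In a CELP coder the residual computed here is not transmitted directly; it is the signal that the adaptive and fixed code vectors are chosen to approximate.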
A prior-art CELP coding apparatus performs LPC (Linear Prediction Coefficient) analysis and quantization, pitch search, fixed codebook search and gain codebook search on the input digital signal, and transmits the LPC code (L), pitch period (A), fixed codebook index (F) and gain codebook index (G) to the decoding apparatus.
The decoding apparatus decodes the LPC code (L), pitch period (A), fixed codebook index (F) and gain codebook index (G), and, based on the decoding results, applies an excitation signal to a synthesis filter and produces the decoded signal.
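The decoding step can be sketched as follows, assuming the parameters L, A, F and G have already been decoded into prediction coefficients, a pitch lag, a fixed code vector and two gains. The helper names and the all-pole filter form 1/A(z) are illustrative assumptions, not the apparatus's actual implementation.

```python
def adaptive_vector(past_excitation, lag, length):
    """Build the adaptive code vector by repeating the excitation from `lag` samples back."""
    buf = list(past_excitation)
    out = []
    for _ in range(length):
        sample = buf[-lag]  # for lag < length this reuses samples just generated
        out.append(sample)
        buf.append(sample)
    return out

def build_excitation(adaptive, fixed, gain_a, gain_f):
    """Excitation = gain-scaled adaptive vector plus gain-scaled fixed vector."""
    return [gain_a * a + gain_f * f for a, f in zip(adaptive, fixed)]

def synthesis_filter(excitation, a, state=None):
    """All-pole filter 1/A(z): s[n] = e[n] - sum_{k>=1} a[k] * s[n-k].

    `state` holds the previous outputs, most recent first, so that the
    filter memory can be carried across frames.
    """
    order = len(a) - 1
    mem = list(state) if state is not None else [0.0] * order
    out = []
    for e in excitation:
        s = e - sum(a[k] * mem[k - 1] for k in range(1, order + 1))
        out.append(s)
        mem = ([s] + mem)[:order]
    return out, mem

# Usage with made-up decoded values (all hypothetical).
past = [1.0, 2.0, 3.0, 4.0]                    # earlier excitation samples
adaptive = adaptive_vector(past, lag=2, length=5)
fixed = [1.0, 0.0, 0.0, -1.0, 0.0]             # sparse pulse-like fixed vector
excitation = build_excitation(adaptive, fixed, gain_a=0.5, gain_f=0.25)
decoded, memory = synthesis_filter(excitation, [1.0, -0.5])
```

Returning the filter memory alongside the output lets the caller pass it back in for the next frame, so the decoded signal stays continuous across frame boundaries.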
However, with the prior-art speech decoding apparatus, it is difficult to distinguish signals that are stationary but not noisy (e.g. stationary vowels) from stationary noise, and therefore difficult to identify a stationary noise period.