The present invention claims priority to Japanese Patent Application No. 9-072550 filed Mar. 26, 1997, which is incorporated herein by reference.
1. Field of the Invention
The present invention relates to a coding and decoding system for speech and musical sound and, particularly, to a coding and decoding system for speech and musical sound in a telephone bandwidth.
2. Description of Related Art
A coder which codes speech at a low bit rate while maintaining high sound quality by utilizing the Code Excited Linear Prediction (CELP) coding system has been known. The CELP system itself is described in detail in, for example, "Code-Excited Linear Prediction: High Quality Speech at Very Low Bit Rates", IEEE Proc. ICASSP-85, pp. 937-940, 1985.
In the CELP system, the coding is performed by using frame characteristic parameters obtained from every frame (for example, 40 msec) of a speech signal and sub-frame characteristic parameters obtained from every sub-frame (for example, 8 msec), the sub-frames being obtained by dividing the frame by 5 in this example. The frame characteristic parameters include coefficients of a linear prediction (LP) synthesis filter, indicative of a coarse spectrum. The sub-frame characteristic parameters include a lag of a pitch linear prediction synthesis filter indicative of a fine spectrum such as a pitch period, a code vector indicative of a residual signal of the pitch linear prediction filter, a gain of the code vector, etc. The code vector is preliminarily produced on the basis of a signal to be practically coded, a random number, etc.
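The frame and sub-frame division described above can be illustrated with a minimal Python sketch. The 8 kHz sampling rate and the function name `split_frames` are assumptions made for illustration only; the text gives only the 40 msec frame and 8 msec sub-frame durations.

```python
def split_frames(signal, frame_len, n_sub):
    """Split a sample list into frames, each divided into n_sub equal sub-frames."""
    if frame_len % n_sub != 0:
        raise ValueError("frame length must be divisible by the sub-frame count")
    sub_len = frame_len // n_sub
    frames = []
    # Only complete frames are kept; trailing samples are dropped in this sketch.
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        frames.append([frame[i:i + sub_len] for i in range(0, frame_len, sub_len)])
    return frames


SAMPLE_RATE = 8000                     # assumed sampling rate, not stated in the text
FRAME_LEN = SAMPLE_RATE * 40 // 1000   # 40 msec frame -> 320 samples
frames = split_frames(list(range(700)), FRAME_LEN, 5)  # 5 sub-frames of 8 msec each
```

Under these assumptions each frame holds 320 samples and each sub-frame holds 64 samples.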
On the other hand, in a case where musical sound is coded and decoded according to the CELP system, sound quality of a coded sound is degraded by the pitch linear prediction filter and the code vector which are indicative of a periodic structure of musical sound since the spectral structure of the musical sound is complex. In order to solve this problem, a coding and decoding system which uses a high order linear prediction filter in lieu of the pitch linear prediction filter has been proposed.
Linear prediction coefficients used in the high order linear prediction filter are calculated by using a reproduced signal decoded in the past sub-frames. Therefore, this filter is known as a backward linear prediction filter. In order to calculate the coefficients of the backward linear prediction filter, the reproduced signal decoded up to a sub-frame preceding a current sub-frame is first analyzed by linear prediction at a low order. Then, the residual signal of the reproduced signal is obtained by using an inverse filter constructed with the linear prediction coefficients obtained by this analysis to remove the coarse spectrum of the reproduced signal. Since the spectrum is flattened except for its fine configurations, the inverse filter and circuits subsequent thereto are called a flattening linear prediction filter.
The backward linear prediction coefficients are obtained by a linear prediction analysis of the residual signal at a high order. This coding and decoding system is disclosed in, for example, S. Sasaki et al., "Improved CELP Coding for Audio Signal," Proc. Acoustical Society of Japan, 1-4-23, pp. 263-264 (March 1996), and an example of the backward linear prediction is disclosed in "A Low-Delay CELP Coder for CCITT 16 kb/s Speech Coding Standard", IEEE Journal on Selected Areas in Communications, Vol. 10, No. 5, June, 1992.
An operation of the conventional coding and decoding system will be described with reference to FIGS. 1 to 3.
FIG. 1 is a block diagram showing an example of the conventional coding device. In FIG. 1, a signal to be coded is input to an input terminal 1. A frame division circuit (FD) 2 produces frame signals by dividing the input signal into frame signals having a predetermined frame length.
A signal processing in a frame unit will be described first. A sub-frame division circuit (SFD) 6 produces sub-frame signals by dividing a frame signal into sub-frames having a predetermined sub-frame length. A linear prediction analyzer (LPA) 3 produces linear prediction coefficients by a linear prediction analysis of the frame signal. A filter coefficient quantizer (FCQ) 4 produces quantized linear prediction coefficients and a filter coefficient quantizing index by quantizing the linear prediction coefficients.
A filter coefficient interpolation circuit (FCI) 5 produces interpolated quantized linear prediction coefficients a to be used in the respective sub-frames by interpolating the quantized linear prediction coefficients obtained from the past frames and the quantized linear prediction coefficients of the current frame. A filter coefficient interpolation circuit (FCI) 7 produces interpolated linear prediction coefficients w to be used in the respective sub-frames by interpolating the linear prediction coefficients obtained from the past frames and the linear prediction coefficients obtained for the current frame.
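The interpolation performed by the filter coefficient interpolation circuits can be sketched as follows. The text does not state the interpolation rule, so the linear weighting between the preceding frame and the current frame, and the function name `interpolate_coeffs`, are assumptions; practical coders often interpolate in another coefficient domain, which this sketch does not attempt.

```python
def interpolate_coeffs(prev, curr, n_sub):
    """Return one coefficient set per sub-frame, moving linearly from prev to curr."""
    out = []
    for k in range(1, n_sub + 1):
        t = k / n_sub  # weight of the current frame; reaches 1.0 in the last sub-frame
        out.append([(1.0 - t) * p + t * c for p, c in zip(prev, curr)])
    return out


# Hypothetical two-coefficient sets for the preceding and the current frame.
per_subframe = interpolate_coeffs([0.0, 1.0], [1.0, 0.0], 5)
```

With this weighting the last sub-frame uses the coefficients of the current frame unchanged, and earlier sub-frames lean progressively toward the preceding frame.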
Now, a signal processing in each sub-frame unit will be described. A backward analyzer (BWA) 34 accumulates the reproduced signals supplied from a synthesizing filter (SYNTH) 22 for the past sub-frames and calculates backward linear prediction coefficients b indicative of a fine spectral distribution from the accumulated, reproduced signal. A weighting filter (WEIGHT) 25 produces a weighted sub-frame signal without noise by filtering the sub-frame signal using a filter constructed with the interpolated linear prediction coefficients w.
An excitation code book circuit (ECB) 16 accumulates a plurality of code vectors each of sub-frame length, that is, waveform patterns, preliminarily produced from random numbers, etc., and outputs the code vectors (the waveform patterns) sequentially according to the index supplied from an error evaluation circuit (ERR) 35. A predetermined number of code vectors having corresponding indices are preliminarily prepared.
A gain code book circuit (GCB) 32 includes a table (not shown) containing gain values for regulating amplitudes of the code vectors and outputs the gain values according to the indices supplied from the error evaluation circuit 35. A predetermined number of the gain values are prepared and have the indices corresponding thereto, respectively. A multiplier 18 produces a code vector excitation candidate signal by multiplying the code vector output from the excitation code book circuit 16 with the gain value of the code vector output from the gain code book circuit (GCB) 32.
A backward filter (BWF) 10 obtains a reproduced excitation candidate signal by filtering the code vector excitation candidate signal using a filter constructed with the backward linear prediction coefficients b supplied from the backward analyzer 34. A synthesizing filter (SYNTH) 11 obtains a reproduced candidate signal by filtering the reproduced excitation candidate signal from the backward filter 10 using a filter constructed with the interpolated quantized linear prediction coefficients a indicative of the coarse spectral distribution. A weighting filter (WEIGHT) 12 obtains a weighted, reproduced candidate signal having no noise by filtering the reproduced candidate signal using a filter constructed with the interpolated linear prediction coefficients w.
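The cascade of the backward filter 10 and the synthesizing filter 11 can be sketched as two all-pole filters in series. The all-pole form, the sign convention of the coefficients, the zero initial filter state and the function names are assumptions; the text states only that each filter is constructed with the respective coefficients. The weighting filter 12 is omitted because its form is not given.

```python
def all_pole_filter(x, coeffs):
    """Compute y[n] = x[n] + sum_i coeffs[i-1] * y[n-i] with zero initial state."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for i, c in enumerate(coeffs, start=1):
            if n - i >= 0:
                acc += c * y[n - i]
        y.append(acc)
    return y


def reproduce_candidate(excitation, b, a):
    """Backward filter (coefficients b) followed by synthesizing filter (coefficients a)."""
    return all_pole_filter(all_pole_filter(excitation, b), a)


# Hypothetical first-order coefficients, used only to exercise the cascade.
candidate = reproduce_candidate([1.0, 0.0, 0.0], [0.5], [0.5])
```

A practical coder would carry the filter state from one sub-frame to the next rather than restart from zero as this sketch does.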
A difference circuit 13 subtracts the weighted reproduced candidate signal from the weighted sub-frame signal and obtains a difference signal. The error evaluation circuit 35 supplies the indices sequentially to the excitation code book circuit 16 and the gain code book circuit 32 and calculates a square sum of the difference signal calculated by the difference circuit 13 for every combination of the code vector and the gain value corresponding to the indices supplied thereto.
In performing this calculation sequentially, the error evaluation circuit 35 supplies an update flag to a gate circuit 33 when a smaller square sum is found. Further, after square sums for all combinations are calculated, the error evaluation circuit 35 selects an index corresponding to the code vector and the gain value whose square sum is minimum and sends it to a multiplexer (MUX) 36 as an excitation quantizing index.
The gate circuit 33 replaces the code vector excitation candidate signal stored therein with a code vector excitation candidate signal output from the multiplier 18 only when the error evaluation circuit 35 supplies the update flag thereto. Further, after the calculation of the square sums for all of the combinations is completed in the error evaluation circuit 35, the gate circuit 33 outputs the stored code vector excitation candidate signal as a reproduced excitation signal.
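The search carried out by the error evaluation circuit 35 together with the gate circuit 33 can be sketched as an exhaustive loop over all combinations of code vector and gain value. The function name `search_codebooks` and the `synthesize` callable, which stands in for the backward filter, the synthesizing filter and the weighting filter, are hypothetical; `target` plays the role of the weighted sub-frame signal.

```python
def search_codebooks(target, code_vectors, gains, synthesize):
    """Exhaustive search; returns (code index, gain index, stored excitation, error)."""
    best = None                # indices of the best combination found so far
    best_error = float("inf")
    stored = None              # candidate held by the gate circuit
    for ci, vec in enumerate(code_vectors):
        for gi, g in enumerate(gains):
            candidate = [g * v for v in vec]      # multiplier output
            reproduced = synthesize(candidate)    # filtered candidate signal
            error = sum((t - r) ** 2 for t, r in zip(target, reproduced))
            if error < best_error:                # corresponds to the update flag
                best_error, best, stored = error, (ci, gi), candidate
    return best[0], best[1], stored, best_error


# Hypothetical code books; an identity function replaces the filters for brevity.
code_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
gains = [0.5, 1.0, 2.0]
result = search_codebooks([2.0, 2.0, 2.0], code_vectors, gains, lambda e: e)
```

The stored candidate is overwritten only when a smaller square sum is found, so the value remaining after the loop is the reproduced excitation signal for the selected indices. The sketch assumes that both code books are non-empty.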
A backward filter (BWF) 21 produces a reproduced excitation signal by filtering the reproduced excitation signal output from the gate circuit 33 using a filter constructed with the backward linear prediction coefficients b. A synthesizing filter 22 produces a reproduced signal by filtering the reproduced excitation signal using a filter constructed with the interpolated quantized linear prediction coefficients a and supplies it to the backward analyzer 34. This reproduced signal is a decoded signal corresponding to the input signal.
The multiplexer 36 outputs, to an output terminal 24, transmission data obtained by multiplexing the filter coefficient quantizing index output from the filter coefficient quantizer 4 with the excitation quantizing index output from the error evaluation circuit 35.
FIG. 2 is a block diagram showing an example of a construction of the backward analyzer 34. In FIG. 2, a signal processing portion of the backward analyzer 34, which includes a window processing circuit (WIN) 34b, a correlation calculator (CORR) 34c and a Levinson-Durbin circuit (LD) 34d, and another signal processing portion thereof, which includes a window processing circuit (WIN) 34f, a correlation calculator (CORR) 34g and a Levinson-Durbin circuit (LD) 34h, each realize a linear prediction analysis method utilizing an auto-correlation method. Although only the auto-correlation method is described in this specification, such method may be replaced by another linear prediction analysis method.
The linear prediction analysis itself is described in detail in, for example, J. R. Deller, "Discrete-Time Processing of Speech Signals", Macmillan Pub., 1993.
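For reference, the auto-correlation method can be sketched in Python as an auto-correlation calculation followed by the Levinson-Durbin recursion. The function names and the predictor sign convention (x[n] approximated by the sum of a_i * x[n-i]) are assumptions made for illustration and are not taken from the conventional device.

```python
def autocorrelation(x, order):
    """Return r[0..order], where r[k] = sum_n x[n] * x[n-k]."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(order + 1)]


def levinson_durbin(r, order):
    """Return (predictor coefficients a[1..order], residual energy)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input; remaining coefficients stay zero
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]  # update lower-order coefficients
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


# Auto-correlation values of a first-order process with coefficient 0.5.
coeffs, energy = levinson_durbin([1.0, 0.5, 0.25], 2)
```

For these auto-correlation values the recursion recovers the first-order coefficient 0.5 and leaves the second coefficient at zero.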
The construction of the backward analyzer 34 will be described with reference to FIG. 2. The window processing circuit 34b performs an analysis windowing of the reproduced signal input to an input terminal 34a. The correlation calculator 34c calculates a first auto-correlation value from the windowed signal. The Levinson-Durbin circuit 34d calculates flattening linear prediction coefficients for flattening the spectrum from the first auto-correlation value. An inverse filter (INV) 34e produces a predicted residual signal of the reproduced signal by using a flattening linear prediction filter constructed with the flattening linear prediction coefficients.
The window processing circuit 34f performs an analysis windowing of the predicted residual signal. The auto-correlation calculator 34g calculates a second auto-correlation value from the windowed predicted residual signal. The Levinson Durbin circuit 34h calculates the backward linear prediction coefficients b from the second auto-correlation value and outputs them to an output terminal 34i.
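The two-stage processing of FIG. 2 can be sketched end to end as follows. The Hamming window, the analysis orders (2 and 10), the test signal and all function names are assumptions chosen for illustration; the text specifies neither the window shape nor the orders.

```python
import math
import random


def hamming(x):
    """Apply a Hamming analysis window (assumed window type; needs len(x) > 1)."""
    n = len(x)
    return [v * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
            for i, v in enumerate(x)]


def lpc(x, order):
    """Auto-correlation method plus Levinson-Durbin; returns a[1..order]."""
    r = [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(order + 1)]
    a, err = [0.0] * (order + 1), r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input; remaining coefficients stay zero
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        a = [a[j] - k * a[i - j] if 0 < j < i else a[j] for j in range(order + 1)]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:]


def inverse_filter(x, coeffs):
    """Prediction residual e[n] = x[n] - sum_i coeffs[i-1] * x[n-i]."""
    return [x[n] - sum(c * x[n - i] for i, c in enumerate(coeffs, 1) if n - i >= 0)
            for n in range(len(x))]


def backward_analysis(reproduced, low_order, high_order):
    flat = lpc(hamming(reproduced), low_order)    # WIN 34b, CORR 34c, LD 34d
    residual = inverse_filter(reproduced, flat)   # INV 34e
    return lpc(hamming(residual), high_order)     # WIN 34f, CORR 34g, LD 34h


# Synthetic stand-in for the accumulated reproduced signal.
rng = random.Random(0)
signal = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) + 0.1 * rng.gauss(0.0, 1.0)
          for n in range(200)]
b = backward_analysis(signal, 2, 10)
```

The sketch runs two full windowing, auto-correlation and Levinson-Durbin passes for every analysis, which mirrors the two signal processing portions of FIG. 2.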
FIG. 3 is a block diagram showing an example of the conventional decoder device. A demultiplexer (DEMUX) 37 produces indices corresponding to the linear prediction coefficients, a code vector and its gain value by using the transmission data input from the input terminal 26. A filter coefficient decoder (FCD) 38 decodes the quantized linear prediction coefficients from the index of the linear prediction coefficients. The filter coefficient interpolation circuit 5 produces the interpolated quantized linear prediction coefficients a to be used in the respective sub-frames, by interpolating the decoded quantized linear prediction coefficients and the quantized linear prediction coefficients decoded in a preceding frame.
The excitation code book circuit 16 outputs a code vector according to the index of the code vector. The gain code book circuit 32 outputs a gain value according to the index of gain value. The multiplier 18 produces a first reproduced excitation signal by multiplying the code vector with the gain value. The backward analyzer 34 accumulates the reproduced signals supplied from the synthesizing filter 11 in the past frames and calculates the backward linear prediction coefficients b from the stored, reproduced signals.
The backward filter 10 produces a second reproduced excitation signal by filtering the first reproduced excitation signal using a filter constructed with the backward linear prediction coefficients b. The synthesis filter 11 produces the reproduced signal by filtering the second reproduced excitation signal using a filter constructed with the interpolated quantized linear prediction coefficients a. The reproduced signal is output from an output terminal 29.
In the conventional speech coding and decoding device mentioned above, the periodic structure of the input speech signal is obtained by using only the backward linear prediction filter, which is not based on the speech signal producing model. Therefore, the coding performance thereof with respect to a speech signal is low.
Further, in the conventional speech coding and decoding device, the backward linear prediction coefficients are calculated by the linear prediction analysis of the reproduced signal whose spectrum is flattened. Therefore, a large amount of arithmetic operation is required.