This invention relates to an apparatus for encoding and an apparatus for decoding speech and musical signals. More particularly, the invention relates to a coding apparatus and a decoding apparatus for transmitting speech and musical signals at a low bit rate.
A method of encoding a speech signal by separating the speech signal into a linear prediction filter and its driving sound source signal is used widely as a method of encoding a speech signal efficiently at medium to low bit rates.
A typical example of such a method is CELP (Code-Excited Linear Prediction). With CELP, linear prediction coefficients are obtained by subjecting input speech to linear prediction analysis and are set in a linear prediction filter. This filter is driven by a sound source signal represented by the sum of a signal that represents the speech pitch period and a noise signal, whereby a synthesized speech signal (i.e., a reconstructed signal) is obtained. For a discussion of CELP, see the paper (referred to as “Reference 1”) “Code excited linear prediction: High quality speech at very low bit rates” by M. Schroeder et al. (Proc. ICASSP, pp. 937-940, 1985).
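The synthesis step described above can be sketched as follows. This is only a minimal illustration and not the method of Reference 1: the function name `synthesize`, the sign convention A(z) = 1 + sum of a_k z^-k, the zero initial filter state and the toy signal values are all assumptions made for the example.

```python
def synthesize(excitation, lpc):
    """Drive an all-pole synthesis filter 1/A(z) with a sound source vector.

    Assumed convention (not taken from the source text):
    A(z) = 1 + sum_k lpc[k-1] * z^-k, with the filter memory starting at zero.
    """
    output = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * output[n - k]
        output.append(acc)
    return output


# The sound source is the sum of a pitch-related signal and a noise signal.
pitch_part = [1.0, 0.0, 0.0, 0.0]   # toy values
noise_part = [0.0, 0.0, 0.0, 0.0]   # toy values
sound_source = [p + c for p, c in zip(pitch_part, noise_part)]
reconstructed = synthesize(sound_source, [-0.5])
```

With a single coefficient of -0.5, the unit pulse decays geometrically, which is the expected impulse response of a first-order all-pole filter.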
A method using a higher-order linear prediction filter representing the complicated spectrum of music is known as a method of improving music encoding performance by CELP. According to this method, the coefficients of a higher-order linear prediction filter are found by applying linear prediction analysis at a high order of from 50 to 100 to a signal obtained by inverse filtering a past reconstructed signal using a linear prediction filter. A signal obtained by inputting a musical signal to the higher-order linear prediction filter is applied to a linear prediction filter to obtain the reconstructed signal.
As an example of an apparatus for encoding speech and musical signals using a higher-order linear prediction filter, see the paper (referred to as “Reference 2”) “Improving the Quality of Musical Signals in CELP Coding”, by Sasaki et al. (Acoustical Society of Japan, Spring, 1996 Meeting for Reading Research Papers, Collected Papers, pp. 263-264, 1996) and the paper (referred to as “Reference 3”) “A 16 Kbit/s Wideband CELP Coder with a High-Order Backward Predictor and its Fast Coefficient Calculation” by M. Serizawa et al. (IEEE Workshop on Speech Coding for Telecommunications, pp. 107-108, 1997).
A known method of encoding a sound source signal in CELP involves expressing a sound source signal efficiently by a multipulse signal comprising a plurality of pulses and defined by the positions of the pulses and pulse amplitudes.
For a discussion of encoding of a sound source signal using a multipulse signal, see the paper (referred to as “Reference 4”) “MP-CELP Speech Coding based on Multi-Pulse Vector Quantization and Fast Search” by Ozawa et al. (Transaction A, Institute of Electronics, Information and Communication Engineers of Japan (Trans. IEICEJ), pp. 1655-1663, 1996). Further, by adopting a band splitting arrangement using a sound source signal found for each band and a higher-order backward linear prediction filter in an apparatus for encoding speech and musical signals based upon CELP, the ability to encode music is improved.
With regard to CELP using band splitting, see the paper (referred to as “Reference 5”) “Multi-band CELP Coding of Speech and Music” by A. Ubale et al. (IEEE Workshop on Speech Coding for Telecommunications, pp. 101-102, 1997).
FIG. 10 is a block diagram showing an example of the construction of an apparatus for encoding speech and music according to the prior art. For the sake of simplicity, it is assumed here that the number of bands is two.
As shown in FIG. 10, an input signal (input vector) enters from an input terminal 10. The input signal is generated by sampling a speech or musical signal and gathering a plurality of the samples into a single vector as one frame.
A first linear prediction coefficient calculation circuit 140 receives the input vector as an input from the input terminal 10. This circuit subjects the input vector to linear prediction analysis, obtains a linear prediction coefficient and quantizes the coefficient. The first linear prediction coefficient calculation circuit 140 outputs the linear prediction coefficient to a weighting filter 160 and outputs an index, which corresponds to a quantized value of the linear prediction coefficient, to a linear prediction filter 150 and to a code output circuit 690.
A known method of quantizing a linear prediction coefficient involves converting the coefficient to a line spectrum pair (referred to as an “LSP”) to effect quantization. For a discussion of the conversion of a linear prediction coefficient to an LSP, see the paper (referred to as “Reference 6”) “Speech Data Compression by LSP Speech Analysis-Synthesis Technique” by Sugamura et al. (Transaction A, Institute of Electronics, Information and Communication Engineers of Japan (Trans. IEICEJ), Vol. J64-A, No. 8, pp. 599-606, 1981). In regard to quantization of an LSP, see the paper (referred to as “Reference 7”) “Vector Quantization of LSP Parameters Using Moving Average Interframe Prediction” by Omuro et al. (Transaction A, Institute of Electronics, Information and Communication Engineers of Japan (Trans. IEICEJ), Vol. J77-A, No. 3, pp. 303-312, 1994).
A first pulse position generating circuit 610 receives as an input an index that is output by a minimizing circuit 670, generates a first pulse position vector using the position of each pulse specified by the index and outputs this vector to a first sound source generating circuit 20.
Let M represent the number of pulses and let $P_1, P_2, \ldots, P_M$ represent the positions of the pulses. The vector $\bar{P}$, therefore, is written as follows:
$\bar{P} = (P_1, P_2, \ldots, P_M)$
(It should be noted that the bar over P indicates that P is a vector.)
A first pulse amplitude generating circuit 120 has a table in which M-dimensional vectors $\bar{A}_j$, $j = 1, \ldots, N_A$ have been stored, where $N_A$ represents the size of the table. The index output by the minimizing circuit 670 enters the first pulse amplitude generating circuit 120, which proceeds to read an M-dimensional vector $\bar{A}_i$ corresponding to this index out of the above-mentioned table and outputs this vector to the first sound source generating circuit 20 as a first pulse amplitude vector.
Letting $A_{i1}, A_{i2}, \ldots, A_{iM}$ represent the amplitude values of the pulses, we have
$\bar{A}_i = (A_{i1}, A_{i2}, \ldots, A_{iM})$
A second pulse position generating circuit 611 receives as an input the index that is output by the minimizing circuit 670, generates a second pulse position vector using the position of each pulse specified by the index and outputs this vector to a second sound source generating circuit 21.
A second pulse amplitude generating circuit 121 has a table in which M-dimensional vectors $\bar{B}_j$, $j = 1, \ldots, N_B$ have been stored, where $N_B$ represents the size of the table.
The index output by the minimizing circuit 670 enters the second pulse amplitude generating circuit 121, which proceeds to read an M-dimensional vector $\bar{B}_i$ corresponding to this index out of the above-mentioned table and outputs this vector to the second sound source generating circuit 21 as a second pulse amplitude vector.
The first pulse position vector $\bar{P} = (P_1, P_2, \ldots, P_M)$ output by the first pulse position generating circuit 610 and the first pulse amplitude vector $\bar{A}_i = (A_{i1}, A_{i2}, \ldots, A_{iM})$ output by the first pulse amplitude generating circuit 120 enter the first sound source generating circuit 20. The first sound source generating circuit 20 outputs, to a first gain circuit 30 as a first sound source signal (sound source vector), an N-dimensional vector in which the $P_1$-th, $P_2$-th, ..., $P_M$-th elements have the values $A_{i1}, A_{i2}, \ldots, A_{iM}$, respectively, and all other elements are zero.
A second pulse position vector $\bar{Q} = (Q_1, Q_2, \ldots, Q_M)$ output by the second pulse position generating circuit 611 and a second pulse amplitude vector $\bar{B}_i = (B_{i1}, B_{i2}, \ldots, B_{iM})$ output by the second pulse amplitude generating circuit 121 enter the second sound source generating circuit 21. The second sound source generating circuit 21 outputs, to a second gain circuit 31 as a second sound source signal, an N-dimensional vector in which the $Q_1$-th, $Q_2$-th, ..., $Q_M$-th elements have the values $B_{i1}, B_{i2}, \ldots, B_{iM}$, respectively, and all other elements are zero.
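The construction of a sound source vector from a pulse position vector and a pulse amplitude vector can be illustrated by the short sketch below. The function name `make_sound_source` is hypothetical, and the assumption that pulse positions are counted from 1 is made here for illustration; the text does not state the indexing base.

```python
def make_sound_source(positions, amplitudes, frame_length):
    """Build an N-dimensional sound source vector from M pulses.

    Assumption: positions are 1-based, so position p maps to list index p - 1.
    """
    if len(positions) != len(amplitudes):
        raise ValueError("one amplitude is required per pulse position")
    source = [0.0] * frame_length
    for p, a in zip(positions, amplitudes):
        if not 1 <= p <= frame_length:
            raise ValueError("pulse position outside the frame")
        source[p - 1] = a
    return source


# Two pulses (M = 2) in a frame of N = 6 samples; values are illustrative.
first_source = make_sound_source([2, 5], [1.0, -0.5], 6)
```

The same routine would serve for the second sound source generating circuit 21, with the second position and amplitude vectors supplied instead.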
The first gain circuit 30 has a table in which gain values have been stored. The index output by the minimizing circuit 670 and the first sound source vector output by the first sound source generating circuit 20 enter the first gain circuit 30, which proceeds to read a first gain corresponding to the index out of the table, multiply the first gain by the first sound source vector to thereby generate a third sound source vector, and output the generated third sound source vector to a first higher-order linear prediction filter 130.
The second gain circuit 31 has a table in which gain values have been stored. The index output by the minimizing circuit 670 and the second sound source vector output by the second sound source generating circuit 21 enter the second gain circuit 31, which proceeds to read a second gain corresponding to the index out of the table, multiply the second gain by the second sound source vector to thereby generate a fourth sound source vector, and output the generated fourth sound source vector to a second higher-order linear prediction filter 131.
A third higher-order linear prediction coefficient output by a higher-order linear prediction coefficient calculation circuit 180 and a third sound source vector output by the first gain circuit 30 enter the first higher-order linear prediction filter 130. The filter thus set to the third higher-order linear prediction coefficient is driven by the third sound source vector, whereby a first excitation vector is obtained. The first excitation vector is output to a first band-pass filter 135.
A fourth higher-order linear prediction coefficient output by the higher-order linear prediction coefficient calculation circuit 180 and a fourth sound source vector output by the second gain circuit 31 enter the second higher-order linear prediction filter 131. The filter thus set to the fourth higher-order linear prediction coefficient is driven by the fourth sound source vector, whereby a second excitation vector is obtained. The second excitation vector is output to a second band-pass filter 136.
The first excitation vector output by the first higher-order linear prediction filter 130 enters the first band-pass filter 135. The first excitation vector has its band limited by the filter 135, whereby a third excitation vector is obtained. The first band-pass filter 135 outputs the third excitation vector to an adder 40.
The second excitation vector output by the second higher-order linear prediction filter 131 enters the second band-pass filter 136. The second excitation vector has its band limited by the filter 136, whereby a fourth excitation vector is obtained. The fourth excitation vector is output to the adder 40.
The adder 40 adds the inputs applied thereto, namely the third excitation vector output by the first band-pass filter 135 and the fourth excitation vector output by the second band-pass filter 136, and outputs a fifth excitation vector, which is the sum of the third and fourth excitation vectors, to the linear prediction filter 150.
The linear prediction filter 150 has a table in which quantized values of linear prediction coefficients have been stored. The fifth excitation vector output by the adder 40 and an index corresponding to a quantized value of a linear prediction coefficient output by the first linear prediction coefficient calculation circuit 140 enter the linear prediction filter 150. The quantized value of the linear prediction coefficient corresponding to this index is read out of this table and the filter thus set to this quantized linear prediction coefficient is driven by the fifth excitation vector, whereby a reconstructed signal (reconstructed vector) is obtained. This vector is output to a subtractor 50 and to the higher-order linear prediction coefficient calculation circuit 180.
The reconstructed vector output by the linear prediction filter 150 enters the higher-order linear prediction coefficient calculation circuit 180, which proceeds to calculate the third higher-order linear prediction coefficient and the fourth higher-order linear prediction coefficient. The third higher-order linear prediction coefficient is output to the first higher-order linear prediction filter 130, and the fourth higher-order linear prediction coefficient is output to the second higher-order linear prediction filter 131. The details of construction of the higher-order linear prediction coefficient calculation circuit 180 will be described later.
The input vector enters the subtractor 50 via the input terminal 10, and the reconstructed vector output by the linear prediction filter 150 also enters the subtractor 50. The subtractor 50 calculates the difference between these two inputs. The subtractor 50 outputs a difference vector, which is the difference between the input vector and the reconstructed vector, to the weighting filter 160.
The difference vector output by the subtractor 50 and the linear prediction coefficient output by the first linear prediction coefficient calculation circuit 140 enter the weighting filter 160. The latter uses this linear prediction coefficient to produce a weighting filter corresponding to the characteristic of the human sense of hearing and drives this weighting filter by the difference vector, whereby there is obtained a weighted difference vector. The weighted difference vector is output to the minimizing circuit 670. For a discussion of a weighting filter, see Reference 1.
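A weighting filter of this kind can be sketched as shown below. The text defers the details to Reference 1, so the specific form used here, W(z) = A(z) / A(z/gamma) with A(z) = 1 + sum of a_k z^-k, is an assumption (it is one commonly used form), as are the function name `perceptual_weight` and the default value gamma = 0.8.

```python
def perceptual_weight(difference, lpc, gamma=0.8):
    """Filter a difference vector with W(z) = A(z) / A(z/gamma).

    Assumptions: A(z) = 1 + sum_k lpc[k-1] * z^-k, zero initial state,
    and gamma chosen by the caller (0.8 is only an illustrative default).
    """
    # Denominator coefficients of A(z/gamma): a_k * gamma**k.
    denominator = [a * gamma ** k for k, a in enumerate(lpc, start=1)]
    weighted = []
    for n, x in enumerate(difference):
        acc = x
        # Numerator A(z): feed-forward taps on the input.
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * difference[n - k]
        # Denominator A(z/gamma): feedback taps on the output.
        for k, d in enumerate(denominator, start=1):
            if n - k >= 0:
                acc -= d * weighted[n - k]
        weighted.append(acc)
    return weighted


weighted_difference = perceptual_weight([1.0, 0.0, 0.0], [-0.5])
```

With gamma equal to 1 the numerator and denominator cancel and the filter passes the difference vector unchanged; smaller values of gamma de-emphasize the error near spectral peaks.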
Weighted difference vectors output by the weighting filter 160 successively enter the minimizing circuit 670, which proceeds to calculate the norms.
Indices corresponding to all values of the elements of the first pulse position vector in the first pulse position generating circuit 610 are output successively from the minimizing circuit 670 to the first pulse position generating circuit 610. Indices corresponding to all values of the elements of the second pulse position vector in the second pulse position generating circuit 611 are output successively from the minimizing circuit 670 to the second pulse position generating circuit 611. Indices corresponding to all first pulse amplitude vectors that have been stored in the first pulse amplitude generating circuit 120 are output successively from the minimizing circuit 670 to the first pulse amplitude generating circuit 120. Indices corresponding to all second pulse amplitude vectors that have been stored in the second pulse amplitude generating circuit 121 are output successively from the minimizing circuit 670 to the second pulse amplitude generating circuit 121. Indices corresponding to all first gains that have been stored in the first gain circuit 30 are output successively from the minimizing circuit 670 to the first gain circuit 30. Indices corresponding to all second gains that have been stored in the second gain circuit 31 are output successively from the minimizing circuit 670 to the second gain circuit 31. Further, the minimizing circuit 670 selects the value of each element in the first pulse position vector, the value of each element in the second pulse position vector, the first pulse amplitude vector, the second pulse amplitude vector, and the first gain and second gain that result in the minimum norm, and outputs the indices corresponding to these to the code output circuit 690.
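The search performed by the minimizing circuit can be pictured as an exhaustive loop over index combinations, as in the sketch below. This is a simplified illustration only: the function `find_best_indices` and the callback `weighted_difference` are hypothetical names, the toy callback omits the higher-order filters, band-pass filters, linear prediction filter and weighting filter, and practical coders use faster search methods such as those discussed in Reference 4.

```python
from itertools import product


def find_best_indices(index_ranges, weighted_difference):
    """Try every index combination and keep the one with the smallest norm.

    weighted_difference(indices) must return the weighted difference vector
    for that combination. The squared norm is compared, which selects the
    same combination as the norm itself.
    """
    best_indices, best_norm = None, float("inf")
    for indices in product(*index_ranges):
        diff = weighted_difference(indices)
        norm = sum(d * d for d in diff)
        if norm < best_norm:
            best_indices, best_norm = indices, norm
    return best_indices, best_norm


# Toy stand-in for the full signal path (all values are illustrative).
target = [0.0, 2.0, 0.0]
amplitude_table = [[1.0], [-1.0]]
gain_table = [1.0, 2.0]


def toy_difference(indices):
    amp_index, gain_index = indices
    source = [0.0, amplitude_table[amp_index][0], 0.0]
    gain = gain_table[gain_index]
    return [t - gain * s for t, s in zip(target, source)]


best, norm = find_best_indices([range(2), range(2)], toy_difference)
```

Because every table is searched in combination, the cost of such a loop grows with the product of the table sizes, which is one reason fast search techniques are of interest.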
With regard to a method of obtaining the position of each pulse that is an element of a pulse position vector as well as the amplitude value of each pulse that is an element of a pulse amplitude vector, see Reference 4, by way of example.
The index corresponding to the quantized value of the linear prediction coefficient output by the first linear prediction coefficient calculation circuit 140 enters the code output circuit 690 and so do the indices corresponding to the value of each element in the first pulse position vector, the value of each element in the second pulse position vector, the first pulse amplitude vector, the second pulse amplitude vector and the first gain and second gain. The code output circuit 690 converts these indices to a bit-sequence code and outputs the code via an output terminal 60.
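The conversion between indices and a bit-sequence code can be sketched as follows. The fixed bit width per index and the function names `pack_indices` and `unpack_indices` are assumptions made for illustration; the text does not specify the code format.

```python
def pack_indices(indices, widths):
    """Concatenate indices into a bit string, each with a fixed bit width."""
    bits = ""
    for index, width in zip(indices, widths):
        if not 0 <= index < (1 << width):
            raise ValueError("index does not fit in its bit width")
        bits += format(index, "0{}b".format(width))
    return bits


def unpack_indices(bits, widths):
    """Recover the indices from a bit string produced by pack_indices."""
    indices, start = [], 0
    for width in widths:
        indices.append(int(bits[start:start + width], 2))
        start += width
    return indices


# Illustrative widths: 4 bits, 2 bits and 3 bits for three indices.
code = pack_indices([5, 1, 3], [4, 2, 3])
decoded = unpack_indices(code, [4, 2, 3])
```

The total length of the code is the sum of the bit widths, so any index that need not be transmitted shortens the code directly.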
The higher-order linear prediction coefficient calculation circuit 180 will now be described with reference to FIG. 11.
As shown in FIG. 11, the reconstructed vector output by the linear prediction filter 150 enters a second linear prediction coefficient calculation circuit 910 via an input terminal 900. The second linear prediction coefficient calculation circuit 910 subjects this reconstructed vector to linear prediction analysis, obtains a linear prediction coefficient and outputs this coefficient to a residual signal calculation circuit 920 as a second linear prediction coefficient.
The second linear prediction coefficient output by the second linear prediction coefficient calculation circuit 910 and the reconstructed vector output by the linear prediction filter 150 enter the residual signal calculation circuit 920, which proceeds to use a filter, in which the second linear prediction coefficient has been set, to subject the reconstructed vector to inverse filtering, whereby a first residual vector is obtained. The first residual vector is output to an FFT (Fast-Fourier Transform) circuit 930.
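Inverse filtering of this kind can be sketched as shown below. The function name `inverse_filter`, the convention A(z) = 1 + sum of a_k z^-k and the zero initial state are assumptions made for the example.

```python
def inverse_filter(signal, lpc):
    """Apply the FIR filter A(z) to a signal to obtain its residual.

    Assumed convention: A(z) = 1 + sum_k lpc[k-1] * z^-k, and samples
    before the start of the vector are treated as zero.
    """
    residual = []
    for n, x in enumerate(signal):
        acc = x
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * signal[n - k]
        residual.append(acc)
    return residual


# A geometrically decaying signal is flattened to a single pulse.
first_residual = inverse_filter([1.0, 0.5, 0.25, 0.125], [-0.5])
```

The example shows the whitening effect: a vector whose envelope matches the filter is reduced to a single pulse, leaving only the fine structure for the later higher-order analysis.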
The FFT circuit 930, to which the first residual vector output by the residual signal calculation circuit 920 is applied, subjects this vector to a Fourier transform and outputs the Fourier coefficients thus obtained to a band splitting circuit 940.
The band splitting circuit 940, to which the Fourier coefficients output by the FFT circuit 930 are applied, equally partitions these Fourier coefficients into high- and low-frequency regions, thereby obtaining low-frequency Fourier coefficients and high-frequency Fourier coefficients. The low-frequency coefficients are output to a first downsampling circuit 950 and the high-frequency coefficients are output to a second downsampling circuit 951.
The first downsampling circuit 950 downsamples the low-frequency Fourier coefficients output by the band splitting circuit 940. Specifically, the first downsampling circuit 950 removes bands corresponding to high frequency in the low-frequency Fourier coefficients and generates first Fourier coefficients the band whereof is half the full band. The first Fourier coefficients are output to a first inverse FFT circuit 960.
The second downsampling circuit 951 downsamples the high-frequency Fourier coefficients output by the band splitting circuit 940. Specifically, the second downsampling circuit 951 removes bands corresponding to low frequency in the high-frequency Fourier coefficients and loops back the high-frequency coefficients to the low-frequency side, thereby generating second Fourier coefficients the band whereof is half the full band. The second Fourier coefficients are output to a second inverse FFT circuit 961.
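The band splitting and downsampling of the Fourier coefficients can be sketched as follows. Several details are assumptions made here: the sketch works on the one-sided spectrum (bins 0 to N/2) of a real vector whose length N is divisible by 4, it uses a plain DFT instead of a fast algorithm, and it interprets "looping back" as mirroring the upper half-band about N/4 with complex conjugation, which is what decimation by two of a high-band real signal produces. The function names are hypothetical.

```python
import cmath
import math


def one_sided_dft(x):
    """Return DFT bins 0..N/2 of a real vector (naive O(N^2) transform)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def split_and_downsample(spectrum):
    """Split a one-sided spectrum into two half-band spectra.

    low  keeps bins 0..N/4 unchanged.
    high maps bin N/2 to bin 0 and bin N/4 to bin N/4 (assumed fold-back).
    """
    if (len(spectrum) - 1) % 2 != 0:
        raise ValueError("N must be divisible by 4")
    quarter = (len(spectrum) - 1) // 2          # index of bin N/4
    low = spectrum[:quarter + 1]
    high = [z.conjugate() for z in reversed(spectrum[quarter:])]
    return low, high


# A tone at bin 7 of a 16-sample vector lies in the high-frequency region.
tone = [math.cos(2 * math.pi * 7 * t / 16) for t in range(16)]
low, high = split_and_downsample(one_sided_dft(tone))
```

In this example the tone at bin 7 appears at bin 1 of the folded high-band spectrum, while the low-band spectrum is essentially empty.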
The first Fourier coefficients output by the first downsampling circuit 950 enter the first inverse FFT circuit 960, which proceeds to subject these coefficients to an inverse FFT, thereby obtaining a second residual vector that is output to a first higher-order linear prediction coefficient calculation circuit 970.
The second Fourier coefficients output by the second downsampling circuit 951 enter the second inverse FFT circuit 961, which proceeds to subject these coefficients to an inverse FFT, thereby obtaining a third residual vector that is output to a second higher-order linear prediction coefficient calculation circuit 971.
The second residual vector output by the first inverse FFT circuit 960 enters the first higher-order linear prediction coefficient calculation circuit 970, which proceeds to subject the second residual vector to higher-order linear prediction analysis, thereby obtaining the first higher-order linear prediction coefficient. This is output to a first upsampling circuit 980.
The third residual vector output by the second inverse FFT circuit 961 enters the second higher-order linear prediction coefficient calculation circuit 971, which proceeds to subject the third residual vector to higher-order linear prediction analysis, thereby obtaining the second higher-order linear prediction coefficient. This is output to a second upsampling circuit 981.
The first higher-order linear prediction coefficient output by the first higher-order linear prediction coefficient calculation circuit 970 enters the first upsampling circuit 980. By inserting zeros in alternation with the first higher-order linear prediction coefficient, the first upsampling circuit 980 obtains an upsampled prediction coefficient. This is output as the third higher-order linear prediction coefficient to the first higher-order linear prediction filter 130 via an output terminal 901.
The second higher-order linear prediction coefficient output by the second higher-order linear prediction coefficient calculation circuit 971 enters the second upsampling circuit 981. By inserting zeros in alternation with the second higher-order linear prediction coefficient, the second upsampling circuit 981 obtains an upsampled prediction coefficient. This is output as the fourth higher-order linear prediction coefficient to the second higher-order linear prediction filter 131 via an output terminal 902.
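The zero insertion performed by the upsampling circuits can be sketched as below. The function name `upsample_coefficients` is hypothetical, and the placement of each zero before its coefficient, so that the k-th coefficient lands on delay 2k, is an assumption corresponding to replacing z by z squared in the half-rate predictor.

```python
def upsample_coefficients(coefficients):
    """Interleave zeros with prediction coefficients.

    Input element k (1-based) is the tap for delay k at the half rate.
    Output element 2k is that same tap at the full rate; odd delays are
    zero (assumed ordering, corresponding to A(z) -> A(z^2)).
    """
    upsampled = []
    for a in coefficients:
        upsampled.extend([0.0, a])
    return upsampled


third_coefficient = upsample_coefficients([0.5, -0.25])
```

The resulting filter has twice the order of the half-rate predictor but the same number of nonzero taps.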
FIG. 12 is a block diagram showing an example of the construction of an apparatus for decoding speech and music according to the prior art. Components in FIG. 12 identical with or equivalent to those of FIG. 10 are designated by like reference characters.
As shown in FIG. 12, a code in the form of a bit sequence enters from an input terminal 200. A code input circuit 720 converts the bit-sequence code that has entered from the input terminal 200 to an index.
The code input circuit 720 outputs an index corresponding to each element in the first pulse position vector to a first pulse position generating circuit 710, outputs an index corresponding to each element in the second pulse position vector to a second pulse position generating circuit 711, outputs an index corresponding to the first pulse amplitude vector to the first pulse amplitude generating circuit 120, outputs an index corresponding to the second pulse amplitude vector to the second pulse amplitude generating circuit 121, outputs an index corresponding to the first gain to the first gain circuit 30, outputs an index corresponding to the second gain to the second gain circuit 31, and outputs an index corresponding to the quantized value of a linear prediction coefficient to the linear prediction filter 150.
The index output by the code input circuit 720 enters the first pulse position generating circuit 710, which proceeds to generate the first pulse position vector using the position of each pulse specified by the index and output the vector to the first sound source generating circuit 20.
The first pulse amplitude generating circuit 120 has a table in which M-dimensional vectors $\bar{A}_j$, $j = 1, \ldots, N_A$ have been stored. The index output by the code input circuit 720 enters the first pulse amplitude generating circuit 120, which proceeds to read an M-dimensional vector $\bar{A}_i$ corresponding to this index out of the above-mentioned table and to output this vector to the first sound source generating circuit 20 as a first pulse amplitude vector.
The index output by the code input circuit 720 enters the second pulse position generating circuit 711, which proceeds to generate the second pulse position vector using the position of each pulse specified by the index and output the vector to the second sound source generating circuit 21.
The second pulse amplitude generating circuit 121 has a table in which M-dimensional vectors $\bar{B}_j$, $j = 1, \ldots, N_B$ have been stored. The index output by the code input circuit 720 enters the second pulse amplitude generating circuit 121, which proceeds to read an M-dimensional vector $\bar{B}_i$ corresponding to this index out of the above-mentioned table and to output this vector to the second sound source generating circuit 21 as a second pulse amplitude vector.
The first pulse position vector $\bar{P} = (P_1, P_2, \ldots, P_M)$ output by the first pulse position generating circuit 710 and the first pulse amplitude vector $\bar{A}_i = (A_{i1}, A_{i2}, \ldots, A_{iM})$ output by the first pulse amplitude generating circuit 120 enter the first sound source generating circuit 20. The first sound source generating circuit 20 outputs, to the first gain circuit 30 as a first sound source signal vector, an N-dimensional vector in which the $P_1$-th, $P_2$-th, ..., $P_M$-th elements have the values $A_{i1}, A_{i2}, \ldots, A_{iM}$, respectively, and all other elements are zero.
The second pulse position vector $\bar{Q} = (Q_1, Q_2, \ldots, Q_M)$ output by the second pulse position generating circuit 711 and the second pulse amplitude vector $\bar{B}_i = (B_{i1}, B_{i2}, \ldots, B_{iM})$ output by the second pulse amplitude generating circuit 121 enter the second sound source generating circuit 21. The second sound source generating circuit 21 outputs, to the second gain circuit 31 as a second sound source signal, an N-dimensional vector in which the $Q_1$-th, $Q_2$-th, ..., $Q_M$-th elements have the values $B_{i1}, B_{i2}, \ldots, B_{iM}$, respectively, and all other elements are zero.
The first gain circuit 30 has a table in which gain values have been stored. The index output by the code input circuit 720 and the first sound source vector output by the first sound source generating circuit 20 enter the first gain circuit 30, which proceeds to read a first gain corresponding to the index out of the table, multiply the first gain by the first sound source vector to thereby generate a third sound source vector and output the generated third sound source vector to the first higher-order linear prediction filter 130.
The second gain circuit 31 has a table in which gain values have been stored. The index output by the code input circuit 720 and the second sound source vector output by the second sound source generating circuit 21 enter the second gain circuit 31, which proceeds to read a second gain corresponding to the index out of the table, multiply the second gain by the second sound source vector to thereby generate a fourth sound source vector and output the generated fourth sound source vector to the second higher-order linear prediction filter 131.
The third higher-order linear prediction coefficient output by the higher-order linear prediction coefficient calculation circuit 180 and the third sound source vector output by the first gain circuit 30 enter the first higher-order linear prediction filter 130. The filter thus set to the third higher-order linear prediction coefficient is driven by the third sound source vector, whereby a first excitation vector is obtained. The first excitation vector is output to the first band-pass filter 135.
The fourth higher-order linear prediction coefficient output by the higher-order linear prediction coefficient calculation circuit 180 and the fourth sound source vector output by the second gain circuit 31 enter the second higher-order linear prediction filter 131. The filter thus set to the fourth higher-order linear prediction coefficient is driven by the fourth sound source vector, whereby a second excitation vector is obtained. The second excitation vector is output to the second band-pass filter 136.
The first excitation vector output by the first higher-order linear prediction filter 130 enters the first band-pass filter 135. The first excitation vector has its band limited by the filter 135, whereby a third excitation vector is obtained. The first band-pass filter 135 outputs the third excitation vector to the adder 40.
The second excitation vector output by the second higher-order linear prediction filter 131 enters the second band-pass filter 136. The second excitation vector has its band limited by the filter 136, whereby a fourth excitation vector is obtained. The fourth excitation vector is output to the adder 40.
The adder 40 adds the inputs applied thereto, namely the third excitation vector output by the first band-pass filter 135 and the fourth excitation vector output by the second band-pass filter 136, and outputs a fifth excitation vector, which is the sum of the third and fourth excitation vectors, to the linear prediction filter 150.
The linear prediction filter 150 has a table in which quantized values of linear prediction coefficients have been stored. The fifth excitation vector output by the adder 40 and an index corresponding to a quantized value of a linear prediction coefficient output by the code input circuit 720 enter the linear prediction filter 150. The latter reads the quantized value of the linear prediction coefficient corresponding to this index out of the table and drives the filter thus set to this quantized linear prediction coefficient by the fifth excitation vector, whereby a reconstructed vector is obtained.
The reconstructed vector obtained is output to an output terminal 201 and to the higher-order linear prediction coefficient calculation circuit 180.
The reconstructed vector output by the linear prediction filter 150 enters the higher-order linear prediction coefficient calculation circuit 180, which proceeds to calculate the third higher-order linear prediction coefficient and the fourth higher-order linear prediction coefficient. The third higher-order linear prediction coefficient is output to the first higher-order linear prediction filter 130, and the fourth higher-order linear prediction coefficient is output to the second higher-order linear prediction filter 131.
The reconstructed vector calculated by the linear prediction filter 150 is output via the output terminal 201.
In the course of investigations toward the present invention, the following problem has been encountered. Namely, a problem with the conventional apparatus for encoding and decoding speech and musical signals by the above-described band splitting technique is that a large number of bits is required to encode the sound source signals.
The reason for this is that the sound source signals are encoded independently in each band without taking into consideration the correlation between bands of the input signals.
Accordingly, an object of the present invention is to provide an apparatus for encoding and decoding speech and musical signals, wherein the sound source signal of each band can be encoded using a small number of bits.
Another object of the present invention is to provide an apparatus for encoding or decoding speech and musical (i.e., sound) signals with simplified structure and/or high efficiency. Further objects of the present invention will become apparent in the entire disclosure. Generally, the present invention contemplates utilizing the correlation between bands of the input signals upon encoding/decoding in such a fashion as to reduce the total number of bits.
According to a first aspect of the present invention, the foregoing object is attained by providing a speech and musical signal encoding apparatus which, when encoding an input signal upon splitting the input signal into a plurality of bands, generates a reconstructed signal using a multipulse sound source signal that corresponds to each band, wherein a position obtained by shifting the position of each pulse which defines the multipulse signal in the band(s) is used when defining a multipulse signal in the other band(s).
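The idea of reusing shifted pulse positions across bands can be illustrated by the sketch below. It is not the claimed apparatus: the function name `shifted_positions`, the use of a single common shift for all pulses, the 1-based positions and the rejection of shifts that leave the frame are all assumptions made for illustration.

```python
def shifted_positions(base_positions, shift, frame_length):
    """Derive pulse positions for another band from those of one band.

    Assumptions: positions are 1-based, one common shift is applied to
    every pulse, and a shift that moves a pulse outside the frame is
    rejected rather than wrapped or clipped.
    """
    shifted = [p + shift for p in base_positions]
    if any(not 1 <= p <= frame_length for p in shifted):
        raise ValueError("shifted pulse falls outside the frame")
    return shifted


# Positions found for one band are reused, shifted by one sample.
second_band_positions = shifted_positions([2, 5, 9], 1, 10)
```

Under these assumptions, only the shift would have to be represented for the other band, rather than a separate position for each of its pulses.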
According to a second aspect of the present invention, the foregoing object is attained by providing a speech and musical signal decoding apparatus for generating a reconstructed signal using a multipulse sound source signal corresponding to each of a plurality of bands, wherein a position obtained by shifting the position of each pulse which defines the multipulse signal in the band(s) is used when defining a multipulse signal in the other band(s).
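The position-sharing idea of the first and second aspects can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function names, the frame length of 40 samples, the set of four candidate shifts, and the modulo wrap at the frame boundary are all assumptions made for illustration. The sketch derives the pulse positions of a second band by shifting those of a first band, and compares the number of position bits with that of independent encoding.

```python
import math

def shift_positions(base_positions, shift, frame_len):
    """Derive pulse positions for another band by shifting the base band's
    positions. Wrapping at the frame boundary is an assumption of this sketch."""
    return [(p + shift) % frame_len for p in base_positions]

def build_multipulse(positions, amplitudes, frame_len):
    """Place one pulse of the given amplitude at each position."""
    signal = [0.0] * frame_len
    for pos, amp in zip(positions, amplitudes):
        signal[pos] += amp
    return signal

def position_bits_independent(num_pulses, frame_len):
    """Bits needed if every pulse position in a band is coded on its own."""
    return num_pulses * math.ceil(math.log2(frame_len))

def position_bits_shared(num_shift_candidates):
    """Bits needed if only the index of the shift is coded for the band."""
    return math.ceil(math.log2(num_shift_candidates))

# Hypothetical values: 40-sample frame, 4 pulses, 4 candidate shifts.
FRAME_LEN = 40
band1_positions = [3, 12, 25, 37]
band2_positions = shift_positions(band1_positions, shift=2, frame_len=FRAME_LEN)
band2_signal = build_multipulse(band2_positions, [1.0, -1.0, 1.0, -1.0], FRAME_LEN)

print(band2_positions)                          # [5, 14, 27, 39]
print(position_bits_independent(4, FRAME_LEN))  # 24
print(position_bits_shared(4))                  # 2
```

Under these assumed values, the second band needs 2 bits for the shift index rather than 24 bits for four independently coded positions; pulse amplitudes would still have to be encoded separately.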
According to a third aspect of the present invention, the foregoing object is attained by providing a speech and musical signal encoding apparatus which, when encoding an input signal upon splitting the input signal into a plurality of bands, generates a reconstructed signal by exciting a synthesis filter by a full-band sound source signal, which is obtained by summing, over all bands, multipulse sound source signals corresponding to respective ones of the plurality of bands, wherein a position obtained by shifting the position of each pulse which defines the multipulse signal in the band(s) is used when defining a multipulse signal in the other band(s).
According to a fourth aspect of the present invention, the foregoing object is attained by providing a speech and musical signal decoding apparatus for generating a reconstructed signal by exciting a synthesis filter by a full-band sound source signal, which is obtained by summing, over all bands, multipulse sound source signals corresponding to respective ones of a plurality of bands, wherein a position obtained by shifting the position of each pulse which defines the multipulse signal in the band(s) is used when defining a multipulse signal in the other band(s).
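The signal flow of the third and fourth aspects (summing the band excitations into a full-band sound source signal, then exciting a synthesis filter) can be sketched as follows. This is an illustrative sketch only. The function names, the first-order filter, and the sign convention y[n] = x[n] + sum over k of a[k] * y[n-k] are assumptions, not details taken from the disclosure.

```python
def sum_bands(band_signals):
    """Sum the per-band sound source signals sample by sample."""
    return [sum(samples) for samples in zip(*band_signals)]

def synthesis_filter(excitation, lpc):
    """All-pole synthesis filter with zero initial state.

    Assumed convention: y[n] = x[n] + sum_{k=1..p} lpc[k-1] * y[n-k].
    """
    out = []
    for n, x in enumerate(excitation):
        feedback = sum(a * out[n - k]
                       for k, a in enumerate(lpc, start=1) if n - k >= 0)
        out.append(x + feedback)
    return out

# Hypothetical two-band example with one pulse per band.
band1 = [1.0, 0.0, 0.0, 0.0]
band2 = [0.0, 0.0, 2.0, 0.0]
full_band = sum_bands([band1, band2])
reconstructed = synthesis_filter(full_band, lpc=[0.5])

print(full_band)      # [1.0, 0.0, 2.0, 0.0]
print(reconstructed)  # [1.0, 0.5, 2.25, 1.125]
```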
According to a fifth aspect of the present invention, the foregoing object is attained by providing a speech and musical signal encoding apparatus which, when encoding an input signal upon splitting the input signal into a plurality of bands, generates a reconstructed signal by exciting a synthesis filter by a full-band sound source signal, which is obtained by summing, over all bands, signals obtained by exciting a higher-order linear prediction filter, which represents a microspectrum relating to the input signal of each band, by a multipulse sound source signal corresponding to each band, wherein a position obtained by shifting the position of each pulse which defines the multipulse signal in the band(s) is used when defining a multipulse signal in the other band(s).
According to a sixth aspect of the present invention, the foregoing object is attained by providing a speech and musical signal decoding apparatus for generating a reconstructed signal by exciting a synthesis filter by a full-band sound source signal, which is obtained by summing, over all bands, signals obtained by exciting a higher-order linear prediction filter, which represents a microspectrum relating to an input signal of each of a plurality of bands, by a multipulse sound source signal corresponding to each band, wherein a position obtained by shifting the position of each pulse which defines the multipulse signal in the band(s) is used when defining a multipulse signal in the other band(s).
According to a seventh aspect of the present invention, the foregoing object is attained by providing a speech and musical signal encoding apparatus which, when encoding an input signal upon splitting the input signal into a plurality of bands, generates a reconstructed signal by exciting a synthesis filter by a full-band sound source signal, which is obtained by summing, over all bands, signals obtained by exciting a higher-order linear prediction filter, which represents a microspectrum relating to the input signal of each band, by a multipulse sound source signal corresponding to each band, wherein a residual signal is found by inverse filtering of the reconstructed signal using a linear prediction filter for which linear prediction coefficients obtained from the reconstructed signal have been decided, conversion coefficients obtained by converting the residual signal are split into bands, and the higher-order linear prediction filter uses coefficients obtained from a residual signal of each band generated in each band by back-converting the conversion coefficients that have been split into the bands.
According to an eighth aspect of the present invention, the foregoing object is attained by providing a speech and musical signal decoding apparatus for generating a reconstructed signal by exciting a synthesis filter by a full-band sound source signal, which is obtained by summing, over all bands, signals obtained by exciting a higher-order linear prediction filter, which represents a microspectrum relating to an input signal of each of a plurality of bands, by a multipulse sound source signal corresponding to each band, wherein a residual signal is found by inverse filtering of the reconstructed signal using a linear prediction filter for which linear prediction coefficients obtained from the reconstructed signal have been decided, conversion coefficients obtained by converting the residual signal are split into bands, and the higher-order linear prediction filter uses coefficients obtained from a residual signal of each band generated in each band by back-converting the conversion coefficients that have been split into the bands.
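The residual processing named in the seventh and eighth aspects (inverse filtering, conversion, band splitting of the conversion coefficients, and back-conversion per band) can be sketched as below. The sketch rests on stated assumptions: an orthonormal DCT stands in for the unspecified "conversion", the band edges and the first-order inverse filter are hypothetical, and the final step of computing higher-order linear prediction coefficients from each band residual is omitted.

```python
import math

def inverse_filter(signal, lpc):
    """Residual e[n] = y[n] - sum_{k=1..p} lpc[k-1] * y[n-k] (assumed convention)."""
    out = []
    for n, s in enumerate(signal):
        pred = sum(a * signal[n - k]
                   for k, a in enumerate(lpc, start=1) if n - k >= 0)
        out.append(s - pred)
    return out

def _scale(k, n_len):
    """Orthonormal DCT scaling factor for coefficient index k."""
    return math.sqrt((1.0 if k == 0 else 2.0) / n_len)

def dct(x):
    """Orthonormal DCT-II, used here as an assumed stand-in for the conversion."""
    n_len = len(x)
    return [_scale(k, n_len) *
            sum(x[n] * math.cos(math.pi * (n + 0.5) * k / n_len)
                for n in range(n_len))
            for k in range(n_len)]

def idct(coeffs):
    """Inverse of dct() above (the back-conversion)."""
    n_len = len(coeffs)
    return [sum(_scale(k, n_len) * coeffs[k] *
                math.cos(math.pi * (n + 0.5) * k / n_len)
                for k in range(n_len))
            for n in range(n_len)]

def band_residuals(residual, band_edges):
    """Split the conversion coefficients into bands and back-convert each band."""
    coeffs = dct(residual)
    bands = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        masked = [c if lo <= k < hi else 0.0 for k, c in enumerate(coeffs)]
        bands.append(idct(masked))
    return bands

# Hypothetical reconstructed signal and first-order coefficient.
reconstructed = [0.5 ** n for n in range(8)]
residual = inverse_filter(reconstructed, lpc=[0.5])
low_band, high_band = band_residuals(residual, band_edges=[0, 4, 8])
```

Because the transform is linear, the band residuals sum back to the full-band residual, which gives a simple consistency check on the band split.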
According to a ninth aspect of the present invention, in the fifth aspect of the invention a residual signal is found by inverse filtering of the reconstructed signal using a linear prediction filter for which linear prediction coefficients obtained from the reconstructed signal have been decided, conversion coefficients obtained by converting the residual signal are split into bands, and the higher-order linear prediction filter uses coefficients obtained from a residual signal of each band generated in each band by back-converting the conversion coefficients that have been split into the bands.
According to a tenth aspect of the present invention, in the sixth aspect of the invention a residual signal is found by inverse filtering of the reconstructed signal using a linear prediction filter for which linear prediction coefficients obtained from the reconstructed signal have been decided, conversion coefficients obtained by converting the residual signal are split into bands, and the higher-order linear prediction filter uses coefficients obtained from a residual signal of each band generated in each band by back-converting the conversion coefficients that have been split into the bands.
Other features and advantages of the present invention will be apparent from the following description taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the figures thereof.