1. Field of the Invention
The present invention relates to a coding and decoding method for transmitting or storing a speech signal at a low bit rate, and more particularly, to a code conversion method which, in speech communication using different coding and decoding systems, converts a code obtained by coding speech by one system into a code decodable by another system with high sound quality and a small amount of computation, and to a device and a program therefor.
2. Description of the Related Art
A widely used method of coding a speech signal with high efficiency at medium or low bit rates is to code the speech signal separately as a linear prediction (LP) filter and an excitation signal which drives the filter. One representative of such methods is code excited linear prediction (CELP). In CELP, a synthesized speech signal is obtained by driving a linear prediction filter, whose linear prediction coefficient indicates the frequency characteristics of input speech, with an excitation signal represented by the sum of an adaptive codebook (ACB) component indicative of the pitch cycle of the input speech and a fixed codebook (FCB) component composed of random numbers and pulses. Before being summed, the ACB component and the FCB component are multiplied by their respective gains (ACB gain and FCB gain). Concerning CELP, reference is made to M. Schroeder, “Code Excited Linear Prediction: High Quality Speech at Very Low Bit Rates” (Proc. of IEEE Int. Conf. on Acoust., Speech and Signal Processing, pp. 937-940, 1985) (referred to as Literature 1).
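The CELP synthesis described above can be sketched as follows. This is a minimal illustration only, not the procedure of any particular standard: the function name `celp_synthesize`, its parameters, and the sign convention of the synthesis filter (s[n] = e[n] − Σ a_i·s[n−i]) are assumptions introduced here.

```python
def celp_synthesize(acb, fcb, g_acb, g_fcb, a, history=None):
    """Synthesize one block of speech from CELP parameters (illustrative sketch).

    acb, fcb : ACB and FCB component vectors of equal length
    g_acb, g_fcb : ACB gain and FCB gain
    a : LP coefficients a_1..a_P (assumed convention: A(z) = 1 + sum a_i z^-i)
    history : past output samples [s[n-1], s[n-2], ...]; zeros if omitted
    """
    p = len(a)
    past = list(history) if history is not None else [0.0] * p
    out = []
    for v_acb, v_fcb in zip(acb, fcb):
        # Excitation: gain-scaled sum of the ACB and FCB components.
        e = g_acb * v_acb + g_fcb * v_fcb
        # All-pole LP synthesis filter driven by the excitation.
        s = e - sum(a_i * s_i for a_i, s_i in zip(a, past))
        out.append(s)
        # Shift the filter memory, keeping only the last P samples.
        past = ([s] + past)[:p]
    return out
```

For example, with a first-order filter `a=[-0.5]`, the call `celp_synthesize([1, 0, 0], [0, 1, 0], 0.5, 2.0, [-0.5])` yields the excitation 0.5, 2.0, 0.0 filtered through the recursion.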
Assuming, for example, interconnection between a 3G mobile network and a wired packet network, a problem arises in some cases in interconnecting the different standard speech coding systems used in the respective networks. One of the simplest solutions to the problem is tandem connection. In tandem connection, however, a speech signal is first decoded, using one standard system, from the code string obtained by coding speech by that system, and the decoded speech signal is then coded again using the other standard system. This solution therefore generally invites degradation of speech quality and increases in delay and in the amount of computation.
On the other hand, a code conversion system which converts a code obtained by coding speech using one standard system into a code decodable by the other standard system has the possibility of solving the above-described problem. Regarding the method of converting a code, reference is made to Hong-Goo Kang et al., “Improving Transcoding Capability of Speech Coders in Clean and Frame Erasured Channel Environments” (Proc. of IEEE Workshop on Speech Coding 2000, pp. 78-80, 2000) (referred to as Literature 2).
FIG. 12 is a diagram showing one example of a structure of a code conversion device for converting a code obtained by coding speech using a first speech coding system (system A) into a code decodable by a second system (system B). In the system A, it is assumed that coding of a linear prediction coefficient is conducted at every Tfr(A) msec cycle (frame) and coding of such components of an excitation signal as ACB, FCB and a gain is conducted at every Tsfr(A)=Tfr(A)/Nsfr(A) msec cycle (sub-frame). In the system B, it is assumed that coding of a linear prediction coefficient is conducted at every Tfr(B) msec cycle (frame) and coding of components of an excitation signal is conducted at every Tsfr(B)=Tfr(B)/Nsfr(B) msec cycle (sub-frame). The code conversion device described in Literature 2 conducts, for example, code conversion between ITU-T Standard G.729 and North American TDMA System Standard IS-641. Assuming the former to be the system A and the latter to be the system B, Tfr(A) will be 10 msec and Tfr(B) will be 20 msec, and Tsfr(A) and Tsfr(B) will be 5 msec.
In the following description, assume that the relationship Lfr(B)=2·Lfr(A) holds between the frame length Lfr(A) of the system A and the frame length Lfr(B) of the system B, and that the numbers of sub-frames are Nsfr(A)=2 and Nsfr(B)=4. With a sampling frequency of 8000 Hz, in the above-described example Lfr(A) will be 80 samples (10 msec), Lfr(B) will be 160 samples (20 msec), and the sub-frame lengths Lsfr(A) and Lsfr(B) will be 40 samples (5 msec).
With reference to FIG. 12, each component of the conventional code conversion device will be described.
A code string obtained by coding speech by the first system (system A) is input through an input terminal 10.
A code separation circuit 1010 separates, from the code string applied through the input terminal 10, codes corresponding to a linear prediction coefficient (LP coefficient), ACB, FCB, an ACB gain and an FCB gain, that is, an LP coefficient code, an ACB code, an FCB code and a gain code. Here, assuming that the ACB gain and the FCB gain are coded and decoded collectively, they will be referred to as a gain and their code as a gain code for the purpose of simplification. The code separation circuit 1010 then outputs the LP coefficient code to an LP coefficient code conversion circuit 100, the ACB code to an ACB code conversion circuit 200, the FCB code to an FCB code conversion circuit 300 and the gain code to a gain code conversion circuit 400.
The LP coefficient code conversion circuit 100 receives input of the LP coefficient code output from the code separation circuit 1010 to convert the LP coefficient code into a code decodable by the second system (system B). The converted LP coefficient code is output to a code multiplexing circuit 1020.
The ACB code conversion circuit 200 receives input of the ACB code output from the code separation circuit 1010 to convert the ACB code into a code decodable by the system B. The converted ACB code is output to the code multiplexing circuit 1020.
The FCB code conversion circuit 300 receives input of the FCB code output from the code separation circuit 1010 to convert the FCB code into a code decodable by the system B. The converted FCB code is output to the code multiplexing circuit 1020.
The gain code conversion circuit 400 receives input of the gain code output from the code separation circuit 1010 to convert the gain code into a code decodable by the system B. The converted gain code is output to the code multiplexing circuit 1020.
More specific operation of each conversion circuit will be described in the following.
The LP coefficient code conversion circuit 100 decodes a first LP coefficient code applied from the code separation circuit 1010 by the LP coefficient decoding method of the first system (system A) to obtain a first LP coefficient. Next, the circuit 100 quantizes and codes the first LP coefficient by the LP coefficient quantization method and coding method of the second system (system B) to obtain a second LP coefficient code. Then, the circuit 100 outputs the obtained code, as a code decodable by the LP coefficient decoding method of the second system (system B), to the code multiplexing circuit 1020.
The ACB code conversion circuit 200 re-reads a first ACB code applied from the code separation circuit 1010 in terms of a corresponding relationship between the codes in the first system (system A) and the codes in the second system (system B) to obtain a second ACB code. Then, the circuit 200 outputs the obtained code as a code decodable by an ACB decoding method of the second system (system B) to the code multiplexing circuit 1020.
Here, with reference to FIG. 13, description will be made of re-reading of a code. Assume, for example, that when the ACB code iT(A) in the system A is 56, its corresponding ACB delay T(A) is 76, and that when the ACB code iT(B) in the system B is 53, its corresponding ACB delay T(B) is 76. In order to convert the ACB code from the system A to the system B such that the value of the ACB delay remains the same (76 in this case), it is only necessary to make the ACB code 56 in the system A correspond to the ACB code 53 in the system B. The description of re-reading of a code is completed here, and the description returns to FIG. 12.
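Re-reading of a code as described above amounts to a table lookup keyed on the shared parameter value. The following is a minimal sketch under stated assumptions: only the pairs (code 56, delay 76) for the system A and (code 53, delay 76) for the system B come from the description above; the other table entries and the names `DELAY_A`, `DELAY_B` and `build_reread_table` are hypothetical.

```python
# Code -> ACB delay tables. Only 56 -> 76 (system A) and 53 -> 76 (system B)
# are taken from the description; the remaining entries are made up.
DELAY_A = {55: 75, 56: 76, 57: 77}
DELAY_B = {52: 75, 53: 76, 54: 77}


def build_reread_table(code_to_delay_a, code_to_delay_b):
    """Map each system A code to the system B code with the same ACB delay."""
    # Invert the system B table so that a delay value selects a system B code.
    delay_to_code_b = {delay: code for code, delay in code_to_delay_b.items()}
    return {
        code_a: delay_to_code_b[delay]
        for code_a, delay in code_to_delay_a.items()
        if delay in delay_to_code_b  # skip delays system B cannot represent
    }


REREAD_ACB = build_reread_table(DELAY_A, DELAY_B)
```

With these tables, `REREAD_ACB[56]` gives the system B code 53. Delays that have no exact counterpart in the system B are simply omitted in this sketch; how such cases are handled is not addressed here.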
The FCB code conversion circuit 300 obtains a second FCB code by re-reading a first FCB code applied from the code separation circuit 1010 in terms of the corresponding relationship between codes in the first system (system A) and codes in the second system (system B). Then, the circuit 300 outputs the obtained code as a code decodable by an FCB decoding method of the second system (system B) to the code multiplexing circuit 1020. Here, re-reading of a code can be realized by the same method as that described above for the conversion of the ACB code or by the same method as that for the conversion of the LP coefficient code which will be described later.
The gain code conversion circuit 400 decodes a first gain code applied from the code separation circuit 1010 by the gain decoding method of the first system (system A) to obtain a first gain. Next, the circuit 400 quantizes and codes the first gain by the gain quantization method and coding method of the second system (system B) to obtain a second gain code. Then, the circuit 400 outputs the obtained gain code, as a code decodable by the gain decoding method of the second system (system B), to the code multiplexing circuit 1020.
Since conversion of the gain code can be realized by the same method as that for the conversion of the LP coefficient code, only the conversion of the LP coefficient code will be described in detail in the following for the purpose of simplification.
With reference to FIG. 14, each component of the LP coefficient code conversion circuit 100 will be described.
Since in many of the standard systems, including the above-described ITU-T Standard G.729, the LP coefficient is coded and decoded after being expressed as a line spectral pair (LSP), it is assumed in the following that the LP coefficient is expressed by the LSP. As to conversion from the LP coefficient to the LSP and conversion from the LSP to the LP coefficient, reference is made to a well-known method, for example, the description in sections 3.2.3 and 3.2.6 of “Coding of Speech at 8 kbit/s Using Conjugate-Structure Algebraic-Code-Excited Linear-Prediction (CS-ACELP)” (ITU-T Recommendation G.729) (referred to as Literature 3).
An LP coefficient decoding circuit 110 decodes the LP coefficient code to obtain the corresponding LSP. The LP coefficient decoding circuit 110, which includes a first LSP codebook 111 in which a plurality of sets of LSP are stored, receives input of the LP coefficient code output from the code separation circuit 1010 through an input terminal 31 and reads an LSP corresponding to the LP coefficient code from the first LSP codebook 111 to output the read LSP to an LP coefficient modification circuit 120. Here, decoding the LSP from the LP coefficient code is conducted according to the LP coefficient (represented by LSP here) decoding method of the system A using an LSP codebook of the system A.
The LP coefficient modification circuit 120 receives input of the LSP output from the LP coefficient decoding circuit 110 and modifies the LSP to output the modified LSP to an LP coefficient coding circuit 130. Here, assuming that the relationship between the frame length in the system A and the frame length in the system B is expressed as Lfr(B)=2·Lfr(A), two frames in the system A (a (2n−1)th frame and a 2n-th frame) correspond to one frame (an n-th frame) in the system B, as shown in FIG. 15. The LSP can therefore be modified based on, for example, the following expression:
$$\tilde{q}^{(A)}(n) = 0.5\cdot\left(q^{(A)}(2n-1) + q^{(A)}(2n)\right)$$
wherein $\tilde{q}^{(A)}(n)$ represents the modified LSP (i.e. the output of the LP coefficient modification circuit 120) which is obtained from the system A and used in the n-th frame in the system B, and $q^{(A)}(m)$ denotes the LSP output from the LP coefficient decoding circuit 110 in the m-th frame of the system A. Both $q^{(A)}(m)$ and $\tilde{q}^{(A)}(n)$ are P-dimensional vectors (P: linear prediction degree).
For the modification of the LSP, a simpler method based on the following expression can also be used:
$$\tilde{q}^{(A)}(n) = q^{(A)}(2n)$$
As to a more complicated modification method, recitation in the third section of the Literature 2 will be referred to.
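The two simple modification methods given above, averaging the LSP of the two corresponding frames of the system A or taking only that of the 2n-th frame, can be sketched as follows. The function name `modify_lsp` and its `mode` parameter are assumptions introduced for illustration; the more complicated method of Literature 2 is not covered.

```python
def modify_lsp(q_a_odd, q_a_even, mode="average"):
    """Return the modified LSP for the n-th frame of the system B.

    q_a_odd  : LSP vector of the (2n-1)th frame of the system A
    q_a_even : LSP vector of the 2n-th frame of the system A
    mode     : "average" -> 0.5 * (q(2n-1) + q(2n)); "latest" -> q(2n)
    """
    if len(q_a_odd) != len(q_a_even):
        raise ValueError("both LSP vectors must have dimension P")
    if mode == "average":
        # Element-wise mean of the two system A frames.
        return [0.5 * (x + y) for x, y in zip(q_a_odd, q_a_even)]
    if mode == "latest":
        # Simpler method: reuse the LSP of the 2n-th frame unchanged.
        return list(q_a_even)
    raise ValueError("unknown mode: " + repr(mode))
```

For example, `modify_lsp([0.25, 0.5], [0.75, 1.0])` averages the two vectors element by element, while `mode="latest"` returns a copy of the second vector.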
The LP coefficient coding circuit 130 receives input of the modified LSP output from the LP coefficient modification circuit 120, reads an LSP and its corresponding code from a second LSP codebook 131 in which a plurality of sets of LSP are stored and quantizes and codes the modified LSP to output the obtained code, that is, the LP coefficient code, to the code multiplexing circuit 1020 through an output terminal 32. Here, quantization and coding of the modified LSP are conducted according to the LP coefficient quantization method and coding method in the system B using an LSP codebook of the system B.
With reference to FIG. 16, each component of the LP coefficient coding circuit 130 will be described.
The second LSP codebook 131, which stores a plurality of sets of LSP, outputs the LSP and its corresponding code to an evaluation value calculation circuit 132.
The evaluation value calculation circuit 132 receives input of the modified LSP output from the LP coefficient modification circuit 120 through an input terminal 33, reads an LSP and its corresponding code from the second LSP codebook 131 in which a plurality of sets of LSP are stored, and calculates an evaluation value from them to output the evaluation value and the code to an evaluation value minimizing circuit 133. Calculation of the evaluation value is conducted for all the LSP stored in the LSP codebook. The evaluation value is defined as the squared error between the modified LSP, taken as a target, and the LSP stored in the LSP codebook, and is expressed by the following expression:
$$D_k(n) = \sum_{i=1}^{P}\left(\tilde{q}_i(n) - \hat{q}_{k,i}(n)\right)^2$$
wherein $D_k(n)$ denotes the evaluation value in the n-th frame; $\tilde{q}_i(n)$ and $\hat{q}_{k,i}(n)$ represent the i-th elements of the P-dimensional vectors (P: linear prediction degree) $\tilde{q}(n)=\tilde{q}^{(A)}(n)$ and $\hat{q}_k(n)$, respectively; $\tilde{q}(n)=\tilde{q}^{(A)}(n)$ represents the modified LSP in the n-th frame; $\hat{q}_k(n)$ represents the LSP read from the LSP codebook in the n-th frame, with $k = 1, \ldots, N_{qcb}$; and $N_{qcb}$ represents the size of the LSP codebook (the number of LSP sets stored).
The evaluation value minimizing circuit 133 receives input of the evaluation value output from the evaluation value calculation circuit 132 and the code corresponding to the LSP used in the calculation of the evaluation value, selects the code with which the evaluation value is the minimum to output the selected code as the LP coefficient code to the code multiplexing circuit 1020 through the output terminal 32.
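The operation of the evaluation value calculation circuit 132 and the evaluation value minimizing circuit 133 corresponds to an exhaustive search of the codebook for the entry with the minimum squared error. A minimal sketch follows, assuming that the codebook is given as a list of LSP vectors and that the code is simply the zero-based index of the entry (the description above numbers the entries from 1); the name `quantize_lsp` is hypothetical.

```python
def quantize_lsp(target, codebook):
    """Return (code, evaluation value) of the codebook entry closest to target.

    target   : modified LSP vector of dimension P
    codebook : list of LSP vectors of dimension P (stand-in for the
               LSP codebook of the system B)
    """
    best_code, best_value = None, float("inf")
    for code, entry in enumerate(codebook):
        # Evaluation value: squared error between the target and this entry.
        value = sum((t - c) ** 2 for t, c in zip(target, entry))
        if value < best_value:
            best_code, best_value = code, value
    return best_code, best_value
```

Actual standard systems often use predictive or multi-stage LSP quantizers rather than a single flat codebook; this sketch covers only the flat search described above.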
The description of the LP coefficient coding circuit 130 and the LP coefficient code conversion circuit 100 including the same is completed here to return again to the description of FIG. 12.
The code multiplexing circuit 1020 receives input of the LP coefficient code output from the LP coefficient code conversion circuit 100, the ACB code output from the ACB code conversion circuit 200, the FCB code output from the FCB code conversion circuit 300 and the gain code output from the gain code conversion circuit 400 to output a code string obtained by multiplexing these codes through an output terminal 20.
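The overall flow of FIG. 12 (separation, per-parameter conversion, multiplexing) can be summarized by the following sketch. All names are hypothetical, the separation and multiplexing functions are supplied by the caller, and the correspondence of two frames of the system A to one frame of the system B is ignored for brevity.

```python
def convert_code_string(code_string, separate, converters, multiplex):
    """Convert a system A code string into a system B code string.

    separate   : function splitting the code string into a dict with the
                 keys "lp", "acb", "fcb" and "gain" (code separation circuit)
    converters : dict mapping each key to its code conversion function
                 (conversion circuits 100, 200, 300 and 400)
    multiplex  : function combining the converted codes into one code
                 string (code multiplexing circuit)
    """
    codes = separate(code_string)
    # Each parameter code is converted independently of the others.
    converted = {name: converters[name](code) for name, code in codes.items()}
    return multiplex(converted)
```

Because each parameter is converted on its own path, an individual converter (for example, a table lookup for the ACB code or a decode-and-requantize step for the LP coefficient code) can be replaced without changing the rest of the flow.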
The above-described conventional code conversion device, however, has a problem that, in the conversion of a code corresponding to such a parameter as a linear prediction coefficient or a gain, an abnormal sound might be generated in the decoded speech which is generated from the converted code.
The reason is that the desired temporal variation of the parameter, that is, the variation obtained from the speech applied to the coder of the first system, and the temporal variation of the parameter obtained by decoding the converted code by a decoder of the second system differ largely from each other.
This derives from the fact that the temporal variation of the parameter obtained by decoding the code output from the first system by the parameter decoding method of the first system already differs from the desired temporal variation of the parameter obtained from the input speech because of quantization in the first system, and that the decoded parameter is then quantized a second time by the parameter quantization method of the second system.
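The effect of quantizing twice can be illustrated with a toy scalar example. The quantization levels and the input value below are hypothetical and are chosen only so that the effect is visible; they do not correspond to any standard system, and a double quantization does not increase the error for every input.

```python
def nearest_level(x, levels):
    """Scalar quantization: return the level closest to x."""
    return min(levels, key=lambda v: abs(v - x))


# Hypothetical quantization levels of the two systems.
LEVELS_A = [0.0, 0.4, 0.8]
LEVELS_B = [0.0, 0.3, 0.6, 0.9]


def direct_and_converted(x):
    """Compare direct system B quantization with conversion from system A."""
    # Value system B would choose if it saw the original parameter.
    direct = nearest_level(x, LEVELS_B)
    # Code conversion: system B only sees the value decoded by system A.
    decoded_a = nearest_level(x, LEVELS_A)
    converted = nearest_level(decoded_a, LEVELS_B)
    return direct, converted
```

For the input 0.46, direct quantization by the system B selects 0.6, whereas the conversion path first yields 0.4 in the system A and then 0.3 in the system B, so the converted value lies farther from the original parameter than the directly quantized one.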