One widely used method of efficiently encoding speech signals at medium or low bit rates separates a speech signal into an LP (Linear Prediction) filter and an excitation signal that drives the filter, and then encodes each of them. A representative example is CELP (Code Excited Linear Prediction). CELP generates a synthetic speech signal by driving an LP filter, in which LP coefficients representing the frequency characteristics of the input speech are set, with an excitation signal. The excitation signal is represented by the sum of an adaptive codebook (ACB) component, which represents the pitch period of the input speech, and a fixed codebook (FCB) component, which is made up of random numbers and pulses. The ACB component and the FCB component are multiplied by respective gains (the ACB gain and the FCB gain). For CELP, see, for example, M. Schroeder, “Code excited linear prediction: High quality speech at very low bit rates,” Proc. of IEEE Int. Conf. on Acoust., Speech and Signal Processing, pp. 937-940, 1985.
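The CELP synthesis step described above can be sketched as follows. This is a minimal illustration and not the procedure of any particular standard: the function name `celp_synthesize`, its parameters, and the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p for the LP filter are assumptions made for the example. It forms the excitation as the gain-weighted sum of the ACB and FCB components and passes it through the all-pole synthesis filter 1/A(z).

```python
def celp_synthesize(acb_vec, fcb_vec, acb_gain, fcb_gain, lp_coeffs, state=None):
    """Synthesize one block of speech from CELP parameters (illustrative sketch).

    lp_coeffs holds a_1..a_p under the assumed convention
    A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, so that s[n] = e[n] - sum_k a_k s[n-k].
    state holds the last p output samples, most recent first (zeros if omitted).
    """
    order = len(lp_coeffs)
    memory = list(state) if state is not None else [0.0] * order

    # Excitation: ACB and FCB components, each scaled by its gain, then summed.
    excitation = [acb_gain * a + fcb_gain * f for a, f in zip(acb_vec, fcb_vec)]

    speech = []
    for e in excitation:
        # All-pole LP synthesis filter 1/A(z) driven by the excitation.
        s = e - sum(a * m for a, m in zip(lp_coeffs, memory))
        speech.append(s)
        # Shift the new sample into the filter memory, keeping p samples.
        memory = ([s] + memory)[:order]
    return excitation, speech, memory


# Usage with toy values: a first-order filter and three-sample vectors.
excitation, speech, memory = celp_synthesize(
    acb_vec=[1.0, 0.0, 0.0],
    fcb_vec=[0.0, 1.0, 0.0],
    acb_gain=0.5,
    fcb_gain=2.0,
    lp_coeffs=[-0.5],
)
```

In a real codec the ACB vector is itself derived from past excitation at the pitch lag, and the returned filter memory would be carried over to the next block; both are omitted here for brevity.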
Consider, for example, interconnecting a 3G (Third Generation) mobile network and a wired packet network. A problem arises in that these networks cannot be directly connected, because they employ different standard speech encoding schemes. One possible solution is a tandem connection.
FIG. 1 illustrates an example of a conventional code conversion apparatus based on the tandem connection, in which codes generated by encoding speech using a first speech coding scheme are converted into codes that can be decoded in accordance with a second speech coding scheme. The second speech coding scheme is generally different from the first speech coding scheme. In the following, for simplicity of description, the first speech coding scheme is simply called “Scheme 1,” and codes generated by encoding speech using the first speech coding scheme are called “first code string data.” Likewise, the second speech coding scheme is simply called “Scheme 2,” and codes generated by encoding speech using the second speech coding scheme are called “second code string data.” Assume that code string data is communicated once per frame period (for example, every 20 milliseconds), the frame being the processing unit of speech encoding and decoding. For speech encoding and decoding methods, see the aforementioned article by Schroeder, or the 3GPP standard “AMR Speech codec: Transcoding functions” (3GPP TS 26.090).
Referring to FIG. 1, the following description will be given of a conventional code conversion apparatus based on the tandem connection.
In the code conversion apparatus, input terminal 10, speech decoding circuit 1050, speech encoding circuit 1060, and output terminal 20 are connected in series in this order. Speech decoding circuit 1050 decodes speech from the first code string data applied to it through input terminal 10, using a decoding method conforming to Scheme 1, and supplies the decoded speech to speech encoding circuit 1060 as a first decoded speech. Speech encoding circuit 1060 receives the first decoded speech from speech decoding circuit 1050, encodes it using an encoding method conforming to Scheme 2, and delivers the resulting code string data through output terminal 20 as second code string data.
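The series connection of FIG. 1 can be sketched as a composition of a Scheme 1 decoder and a Scheme 2 encoder. The sketch below is illustrative only: `make_tandem_converter`, `toy_decode_scheme1`, and `toy_encode_scheme2` are hypothetical names, and the two toy codecs are plain uniform quantizers with different step sizes standing in for real speech codecs, which are far more complex.

```python
from typing import Callable, List

CodeFrame = List[int]      # one frame of code string data (assumed representation)
SpeechFrame = List[float]  # one frame of decoded speech samples


def make_tandem_converter(
    decode_scheme1: Callable[[CodeFrame], SpeechFrame],
    encode_scheme2: Callable[[SpeechFrame], CodeFrame],
) -> Callable[[CodeFrame], CodeFrame]:
    """Connect a Scheme 1 decoder and a Scheme 2 encoder in series."""

    def convert(first_code_frame: CodeFrame) -> CodeFrame:
        # Role of speech decoding circuit 1050: first code string data -> speech.
        first_decoded_speech = decode_scheme1(first_code_frame)
        # Role of speech encoding circuit 1060: speech -> second code string data.
        return encode_scheme2(first_decoded_speech)

    return convert


# Toy stand-ins (assumptions for illustration, not real codecs).
def toy_decode_scheme1(codes: CodeFrame) -> SpeechFrame:
    return [c * 0.5 for c in codes]  # dequantize with step size 0.5


def toy_encode_scheme2(speech: SpeechFrame) -> CodeFrame:
    return [round(x / 0.25) for x in speech]  # quantize with step size 0.25


convert = make_tandem_converter(toy_decode_scheme1, toy_encode_scheme2)
second_code_frame = convert([1, -2, 3])
```

Note that the encoder sees only the decoded speech and nothing of the original Scheme 1 parameters, which is the defining property of the tandem connection.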
However, the foregoing conventional code conversion apparatus based on the tandem connection first decodes the applied first code string data with the speech decoding circuit of Scheme 1, and then re-encodes the resulting decoded speech signal as it is with the speech encoding circuit of Scheme 2. Because of the deterioration introduced by the first coding, the signal characteristics of the decoded speech signal are not suitable for re-encoding. The apparatus therefore has the problem that, when the second code string data generated by this code conversion is decoded in accordance with Scheme 2, the speech quality of the finally decoded speech deteriorates.