The present invention relates to a speech transmission method which is applied to, for example, a mobile radio communication system in which channel errors occur frequently.
With a speech transmission method for use in a mobile radio communication system of the type wherein channel errors occur frequently, an error correction coding technique is used to suppress the deterioration of transmitted speech quality which is caused by the channel errors. In a mobile radio communication system of the type wherein burst errors occur frequently, however, the deterioration of speech quality cannot sufficiently be suppressed at present even by powerful error correcting codes. The reason for this is that the density of errors in a burst is so high that it is difficult to completely remove the errors even if powerful error correcting codes are used. On this account, the situation quite often arises where not all errors are corrected even by the use of the error correction coding technique.
Speech decoded from coded information that contains uncorrected errors is seriously distorted. To suppress the distortion, it is conventional to utilize a system configuration in which the decoder is equipped with an error detecting function; when an error is detected in the code after error correction processing, the code is subjected not to the ordinary decoding process but to waveform recovery of the missing speech segment (which process will hereinafter be referred to as interpolation), thereby suppressing the influence of the channel error.
Referring now to FIG. 1, the effect of interpolation will be described. In FIG. 1 the abscissa represents time and Row A shows partitioning of an input speech signal into speech coding frames (hereinafter referred to as speech frames) and Row B an original speech signal waveform. Row C shows a speech signal waveform decoded when a channel error remained uncorrected in the speech code of an ith speech frame, and in this case the decoded waveform of the ith speech frame is unnatural. Row D shows a speech signal waveform decoded using the above-mentioned interpolation for the channel error left uncorrected; in this instance, the decoded speech signal waveform of the ith speech frame is closer to the original speech signal waveform.
The interpolation processing mentioned herein is to decode the speech waveform signal by continuously repeating a periodic portion of the immediately preceding speech frame. With the use of such interpolation processing, it is possible to suppress the distortion of the decoded waveform which is caused by channel errors. In conventional speech code transmission systems, however, no particular consideration has been given to the implementation of an efficient interpolation method.
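The interpolation described above can be illustrated by the following minimal sketch. It is not taken from any conventional system; the function names, the normalized-correlation period estimate and the lag limits (20 to 147 samples, roughly the pitch range at an assumed 8 kHz sampling rate) are hypothetical choices made only for illustration.

```python
import math


def estimate_period(frame, min_lag, max_lag):
    """Return the lag (in samples) at which the frame best matches itself.

    Assumption: normalized correlation is used as the periodicity measure;
    the text does not prescribe how the periodic portion is found.
    """
    max_lag = min(max_lag, len(frame) - 1)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        cur = frame[lag:]                  # samples n = lag .. end
        past = frame[:len(frame) - lag]    # the same samples delayed by lag
        energy = math.sqrt(sum(a * a for a in cur) * sum(b * b for b in past))
        if energy == 0.0:
            continue  # silent segment: no usable periodicity at this lag
        score = sum(a * b for a, b in zip(cur, past)) / energy
        # Small tolerance so that the shortest period wins over its multiples.
        if score > best_score + 1e-9:
            best_lag, best_score = lag, score
    return best_lag


def interpolate_frame(prev_frame, frame_len, min_lag=20, max_lag=147):
    """Fill a missing frame by repeating the last period of prev_frame."""
    period = estimate_period(prev_frame, min_lag, max_lag)
    last_cycle = prev_frame[-period:]
    return [last_cycle[n % period] for n in range(frame_len)]
```

When the preceding frame is strictly periodic and contains a whole number of periods, the interpolated frame continues the waveform without a phase jump. For real speech the match is only approximate, and it degrades as the frame to be filled becomes longer.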
A conventional speech code transmission system will be described below as being applied to a 6-channel TDM (Time Division Multiplexing) transmission system shown in FIG. 2. In FIG. 2 input speech signals Sa to Sf are respectively subjected to speech/channel coding by speech/channel coding units 11a to 11f for each speech frame and then TDM multiplexed by a TDM multiplexer 12 for transmission. At the receiving end the multiplexed code sequence is TDM demultiplexed by a TDM demultiplexer 13 and the demultiplexed codes are respectively decoded by speech/channel decoding units 14a to 14f into decoded speech waveforms Sa' to Sf'. FIG. 3 shows in more detail the relationships between speech coding, channel coding and TDM multiplexing; in the interest of brevity, only the speech signal Sa is shown.
In FIG. 3, Row A shows partitioning of the input speech signal waveform Sa into speech frames 1, 2, . . . . The speech signal is coded for each speech frame of a length equal to one TDM period (which is L sec and is called a TDM frame as well) to obtain speech codes F11, F12, F13, . . . depicted on Row B. Incidentally, numerals in rectangular boxes represent corresponding input speech frame numbers. Several speech coding methods are known wherein the speech signal is divided or partitioned into fixed frames and coded into a fixed number of bits for each frame, such as CELP (Code Excited Linear Predictive) coding, LD-CELP (Low Delay CELP) coding, TC-WVQ (Transform Coding With Weighted Vector Quantization) and VSELP (Vector Sum Excited Linear Predictive) coding. The present invention can be used with those conventional systems as long as the speech signal is partitioned at regular time intervals and then coded into a fixed number of bits for each frame.
The speech codes are subjected to error correction/detection coding (hereinafter referred to as channel coding) to provide a code train or channel codes F21, F22, F23, . . . shown on Row C. Compared with the speech codes F11, F12, . . . , the channel codes F21, F22, . . . each contain more bits, the increase corresponding to the redundancy bits of the error correction/detection code. As shown on Row D, the channel codes F21, F22, . . . are each inserted in, for example, a time slot #1 in each TDM frame and TDM multiplexed with channel codes in other time slots, thereafter being transmitted. At the receiving end the speech/channel decoding unit 14a, which is to receive the speech signal Sa, decodes the TDM demultiplexed channel code of each time slot #1 to obtain the decoded speech signal Sa' of one speech frame length shown on Row E.
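The flow from speech code to channel code to time slot, and back, can be sketched as follows. This is only an illustrative model under stated assumptions: a CRC-32 checksum stands in for the error correction/detection code (it detects errors but corrects none), each code is modeled as a byte string, and all function names are hypothetical.

```python
import zlib

NUM_SLOTS = 6  # 6-channel TDM, as in FIG. 2


def channel_encode(speech_code: bytes) -> bytes:
    """Append redundancy to a speech code.

    Assumption: a 4-byte CRC-32 is used as a detection-only stand-in for
    the error correction/detection code of the text.
    """
    return speech_code + zlib.crc32(speech_code).to_bytes(4, "big")


def channel_decode(channel_code: bytes):
    """Return (speech_code, ok); ok is False when an error is detected."""
    payload, check = channel_code[:-4], channel_code[-4:]
    ok = zlib.crc32(payload).to_bytes(4, "big") == check
    return payload, ok


def tdm_multiplex(channel_codes):
    """Build one TDM frame; channel_codes[k] goes into time slot #(k + 1)."""
    if len(channel_codes) != NUM_SLOTS:
        raise ValueError("one channel code per time slot is required")
    return list(channel_codes)


def tdm_demultiplex(tdm_frame, slot):
    """Extract the channel code carried in time slot #slot (1-based)."""
    return tdm_frame[slot - 1]
```

In this model a receiver for the speech signal Sa would call `tdm_demultiplex(frame, 1)` once per TDM frame and pass the result to `channel_decode`; a returned `ok` of `False` marks the speech frame as one that has to be interpolated instead of decoded in the ordinary way.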
Now, let it be assumed that a channel error in the channel code F22 corresponding to the second speech frame could not be corrected at the receiving end even by the error correcting code. In this instance, it is necessary to interpolate the erroneous speech code or speech waveform with the error-free speech code F11 of the first speech frame or its decoded speech waveform. The length of the speech frame to be interpolated is L sec, which is equal to the TDM period (i.e. the time length from the time slot #1 to #6). In the method depicted in FIG. 3, therefore, the interpolation period is fixed by the predetermined TDM period, and when the TDM period is long, the interpolation period L sec is correspondingly long. In general, the speech waveform in conversations can be regarded as substantially steady-state when the speech frame length is 20 to 50 ms or so, but when the speech frame is longer, the speech waveform is considered to undergo variations. Thus, when the speech frame length L is in excess of 50 ms, the speech frame containing a channel error cannot always be decoded into speech of good quality, even if it is interpolated with the immediately preceding frame.
N. Jayant et al. have proposed a DPCM packet transmission method wherein a series of quantized error samples of each speech frame are arranged into an odd-sample group and an even-sample group and are transmitted in two adjacent packets, and if one of the packets is lost due to a channel error, the required number of samples are derived from the samples of the other packet by means of interpolation (IEEE Trans. on Commun., vol. COM-29, no. 2, Feb. 1981, pp. 101-109). This method is defective in that, since peaks of samples are flattened by interpolation, speech decoded from the interpolated samples will be distorted.
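The flattening of peaks mentioned above can be seen in a small numerical sketch. The code below does not reproduce the cited method; it assumes, for illustration only, that a lost sample is replaced by the average of its two received neighbors, and it uses 0-based indexing, so that one packet carries samples 0, 2, 4, . . . and the other carries samples 1, 3, 5, . . . . The function names are hypothetical.

```python
def split_odd_even(samples):
    """Split a frame into two sample groups, one per packet."""
    return samples[0::2], samples[1::2]


def reconstruct_from_even(even, total_len):
    """Rebuild a frame when the packet with samples 1, 3, 5, ... is lost.

    Assumption: each missing sample is the mean of its two received
    neighbors; the last sample is held when there is no right neighbor.
    """
    out = [0.0] * total_len
    out[0::2] = even
    for i in range(1, total_len, 2):
        left = out[i - 1]
        right = out[i + 1] if i + 1 < total_len else left
        out[i] = (left + right) / 2
    return out


samples = [0, 10, 0, 2, 4, 6]
even, odd = split_odd_even(samples)
rebuilt = reconstruct_from_even(even, len(samples))
```

In this example the peak of 10 at index 1 sits between two received samples of 0, so it is rebuilt as 0; the smoothly varying samples at the other lost positions are recovered with little or no error.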
In the case where a certain speech frame needs to be interpolated at the receiving end because a channel error still remains in the speech code decoded from the corresponding channel code, the decoded speech of the interpolated speech frame will be less distorted if the interpolation can use the speech waveform information that follows the speech frame in time sequence as well as the information that precedes it. In this case, however, the waveform information of the next speech frame must be available before the speech frame can be interpolated, and accordingly the required transmission delay time increases by the waiting time therefor. In a duplex communication system such as the telephone, the transmission delay time must be kept short, because an increase of the delay in both directions hinders the parties' conversation.
Moreover, in the case where channel errors are left uncorrected over several speech frames before and after the speech frame to be interpolated, the information of the preceding and following speech frames necessary for the interpolation is lost, and consequently, it is difficult to obtain interpolated speech of good quality. Therefore, it is desirable to keep low the interpolation probability (the probability of channel errors remaining uncorrected) of the speech frames before and after the speech frame to be interpolated.