Many communication systems, such as cellular telephone and personal communications systems, rely on wireless channels to communicate information. In the course of communicating such information, wireless communication channels can suffer from several sources of error, such as multipath fading. These error sources can cause, among other things, the problem of frame erasure. An erasure refers to the total loss or substantial corruption of a set of bits communicated to a receiver. A frame is a predetermined fixed number of bits which the communication system treats as a single entity for purposes of communication.
If a frame of bits is totally lost, then the receiver has no bits to interpret. Under such circumstances, the receiver may produce a meaningless result. If a frame of received bits is corrupted and therefore unreliable, the receiver may produce a severely distorted result.
As the demand for wireless system capacity has increased, a need has arisen to make the best use of available wireless system bandwidth. One way to enhance the efficient use of system bandwidth is to employ a signal compression technique. For wireless systems which carry speech signals, speech compression (or speech coding) techniques may be employed for this purpose. Such speech coding techniques include analysis-by-synthesis speech coders, such as the well-known code-excited linear prediction (or CELP) speech coder.
The problem of packet loss in packet-switched networks employing speech coding arrangements is very similar to frame erasure in the wireless context. That is, due to packet loss, a speech decoder may either fail to receive a frame or receive a frame having a significant number of missing bits. In either case, the speech decoder is presented with the same essential problem-- the need to synthesize speech despite the loss of compressed speech information. Both "frame erasure" and "packet loss" concern a communication channel (or network) problem which causes the loss of transmitted bits. For purposes of this description, therefore, the term "frame erasure" may be deemed synonymous with packet loss.
CELP speech coders employ a codebook of excitation signals to encode an original speech signal. These excitation signals are used to "excite" a linear predictive (LPC) filter which synthesizes a speech signal (or some precursor to a speech signal) in response to the excitation. The synthesized speech signal is compared to the signal to be coded. The codebook excitation signal which most closely matches the original signal is identified. The identified excitation signal's codebook index is then communicated to a CELP decoder. (Depending upon the type of CELP system, other types of information may be communicated as well.) The decoder contains a codebook identical to that of the CELP encoder. The decoder uses the transmitted index to select an excitation signal from its own codebook. This selected excitation signal is used to excite the decoder's LPC filter. Thus excited, the LPC filter of the decoder generates a decoded (or quantized) speech signal (referred to herein as the "reconstructed speech signal")-- the same speech signal which was previously determined to be closest to the original speech signal.
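The encoder search and decoder lookup described above can be illustrated with a minimal sketch. It is not the procedure of any particular CELP standard: the 16-entry random codebook, the two filter coefficients, and the function names `synthesize`, `encode`, and `decode` are assumptions chosen for illustration. The sketch also omits filter memory between vectors, gain scaling, and perceptual weighting, all of which a practical coder would include.

```python
import random

def synthesize(excitation, lpc_coeffs):
    """All-pole synthesis filter: y[n] = x[n] + sum_k a[k] * y[n-1-k].

    Filter state starts at zero for every call (a simplification)."""
    out = []
    for n, x in enumerate(excitation):
        y = x
        for k, a in enumerate(lpc_coeffs):
            if n - 1 - k >= 0:
                y += a * out[n - 1 - k]
        out.append(y)
    return out

def squared_error(a, b):
    """Sum of squared sample differences between two signals."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def encode(target, codebook, lpc_coeffs):
    """Analysis-by-synthesis: try every excitation, keep the closest match."""
    return min(
        range(len(codebook)),
        key=lambda i: squared_error(synthesize(codebook[i], lpc_coeffs), target),
    )

def decode(index, codebook, lpc_coeffs):
    """The decoder holds an identical codebook and repeats the synthesis."""
    return synthesize(codebook[index], lpc_coeffs)

# Hypothetical codebook and filter, for illustration only.
rng = random.Random(0)
codebook = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(16)]
lpc_coeffs = [0.5, -0.2]

# Build a target that entry 11 reproduces exactly, then encode and decode it.
target = synthesize(codebook[11], lpc_coeffs)
index = encode(target, codebook, lpc_coeffs)
reconstructed = decode(index, codebook, lpc_coeffs)
```

Only `index` crosses the channel in this sketch. If it is lost or corrupted, the decoder has no basis for choosing an entry from its codebook.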
One particular CELP coding system is the well-known 16 kbit/s low-delay CELP (LD-CELP) speech coding system adopted by the CCITT as its international standard known as "Recommendation G.728." In this system, for example, the 1024-entry (i.e., 10-bit) codebook is decomposed into two smaller codebooks-- a 7-bit "shape codebook" containing 128 independent codevectors and a 3-bit "gain codebook" containing 8 scalar values. The former codebook's codevectors represent the shape of the excitation signal whereas the latter codebook's values represent a gain factor which is to be applied to these codevectors. Thus, the excitation signal index which is transmitted to the decoder comprises two parts-- one which identifies the codevector to be retrieved from the corresponding shape codebook found in the decoder (a 7-bit index), and one which identifies a gain factor to be applied thereto (a 3-bit index). In a G.728 CELP coding system, such a (10-bit) excitation signal index is transmitted for each set of five contiguous speech samples, the speech samples having been sampled at a rate of 8 kHz. This set of five samples is known as a "vector." Each frame comprises a fixed number of such "vectors" (e.g., 16).
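The arithmetic behind these figures can be checked with a short sketch. The bit widths, sampling rate, vector size, and the example of 16 vectors per frame come from the description above. The function names and the packing order (shape index in the upper bits, gain index in the lower bits) are assumptions made for illustration, not the bit layout defined by the Recommendation.

```python
SHAPE_BITS = 7          # 128 shape codevectors
GAIN_BITS = 3           # 8 gain values
SAMPLE_RATE_HZ = 8000   # speech sampling rate
SAMPLES_PER_VECTOR = 5  # one 10-bit index per five samples

def pack_index(shape_index, gain_index):
    """Combine the two sub-indices into one 10-bit excitation index.

    Assumed layout: shape index in the high bits, gain index in the low bits."""
    if not 0 <= shape_index < (1 << SHAPE_BITS):
        raise ValueError("shape index out of range")
    if not 0 <= gain_index < (1 << GAIN_BITS):
        raise ValueError("gain index out of range")
    return (shape_index << GAIN_BITS) | gain_index

def unpack_index(index):
    """Split a 10-bit excitation index back into (shape_index, gain_index)."""
    return index >> GAIN_BITS, index & ((1 << GAIN_BITS) - 1)

def bit_rate():
    """Bits per second: vectors per second times bits per vector."""
    vectors_per_second = SAMPLE_RATE_HZ // SAMPLES_PER_VECTOR
    return vectors_per_second * (SHAPE_BITS + GAIN_BITS)

def frame_bits(vectors_per_frame=16):
    """Bits in one frame for a given number of vectors per frame."""
    return vectors_per_frame * (SHAPE_BITS + GAIN_BITS)
```

With these values, 8000 / 5 = 1600 vectors per second at 10 bits each gives the 16 kbit/s rate, and a 16-vector frame carries 160 bits covering 80 samples, or 10 ms of speech.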
Systems which employ speech coders may be more sensitive to the problem of frame erasure than those systems which do not compress speech. This sensitivity is due to the reduced redundancy of coded speech (compared to uncoded speech), which makes the loss of each communicated bit more significant. In the context of a CELP speech coder experiencing frame erasure, excitation signal codebook indices may be either lost or substantially corrupted. Because of erased frames, the decoder will not be able to reliably identify which entries in its codebook should be used to synthesize speech. As a result, speech coding system performance may degrade significantly.
Most prior attempts to rectify the problem of frame erasure have required that either the speech decoder or both the speech decoder and the speech encoder be modified to improve the performance of the system in the presence of such erasures. However, when a standardized coding system such as G.728 is employed, it may not be possible or desirable to modify these components. This is particularly true in the case where standard "off-the-shelf" components are used to implement the encoder and decoder. For example, if a standard coding system such as G.728 is implemented with VLSI (Very Large Scale Integration) ASIC (Application-Specific Integrated Circuit) chips, it is not possible to modify the decoder or the encoder and yet still make use of these chips. Alternatively, if the coding system is implemented with a general purpose processor such as a DSP (digital signal processor), but the decoder and encoder program code consists of vendor-supplied software provided only in object code (as opposed to source code) form, it may not be possible to modify the program code to alter the behavior of the decoder or the encoder.