A microfiche appendix is included in this disclosure. Specifically, Appendix B is a plurality of tables utilized by the computer source code listing. The total number of microfiche is 1. The total number of frames is 23.
1. Technical Field
The present invention relates generally to speech encoding and decoding in voice communication systems; and, more particularly, it relates to various techniques used with code-excited linear prediction coding to obtain high quality speech reproduction through a limited bit rate communication channel.
2. Related Art
Signal modeling and parameter estimation play significant roles in communicating voice information with limited bandwidth constraints. To model basic speech sounds, speech signals are sampled as a discrete waveform to be digitally processed. In one type of signal coding technique called LPC (linear predictive coding), the signal value at any particular time index is modeled as a linear function of previous values. A subsequent signal is thus linearly predictable according to an earlier value. As a result, efficient signal representations can be determined by estimating and applying certain prediction parameters to represent the signal.
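By way of illustration only, the linear prediction described above may be sketched as follows. The function names `lpc_predict` and `prediction_residual`, the coefficient ordering, and the example coefficient values are assumptions made for this sketch and are not taken from the disclosure.

```python
def lpc_predict(history, coeffs):
    """Predict the next sample as a weighted sum of the most recent samples.

    history: past samples, oldest first.
    coeffs:  prediction coefficients; coeffs[0] weights the most recent
             sample, coeffs[1] the one before it, and so on (assumed order).
    """
    p = len(coeffs)
    if len(history) < p:
        raise ValueError("need at least as many past samples as coefficients")
    # Take the last p samples and reverse them so the newest comes first.
    recent = history[len(history) - p:][::-1]
    return sum(a * s for a, s in zip(coeffs, recent))


def prediction_residual(signal, coeffs):
    """Return the prediction error for every sample that has p predecessors."""
    p = len(coeffs)
    return [signal[n] - lpc_predict(signal[:n], coeffs)
            for n in range(p, len(signal))]


if __name__ == "__main__":
    # A linear ramp satisfies s[n] = 2*s[n-1] - s[n-2], so these illustrative
    # coefficients predict it exactly and the residual is all zeros.
    ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
    print(prediction_residual(ramp, [2.0, -1.0]))
```

When the prediction parameters fit the signal well, the residual is small, which is why the parameters together with a compact description of the residual can represent the signal efficiently.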
Applying LPC techniques, a conventional source encoder operates on speech signals to extract modeling and parameter information for communication to a conventional source decoder via a communication channel. Once received, the decoder attempts to reconstruct a counterpart signal for playback that sounds to a human ear like the original speech.
A certain amount of communication channel bandwidth is required to communicate the modeling and parameter information to the decoder. In embodiments, for example where the channel bandwidth is shared and real-time reconstruction is necessary, a reduction in the required bandwidth proves beneficial. However, using conventional modeling techniques, the quality requirements in the reproduced speech limit the reduction of such bandwidth below certain levels.
Speech encoding becomes increasingly difficult as transmission bit rates decrease. Particularly for noise encoding, perceptual quality diminishes significantly at lower bit rates. Straightforward code-excited linear prediction (CELP) is used in many speech codecs, and it can be a very effective method of encoding speech at relatively high transmission rates. However, even this method may fail to provide perceptually accurate signal reproduction at lower bit rates. One reason is that the pulse-like excitation for noise signals becomes sparser at these lower bit rates, because fewer bits are available for coding and transmission, thereby resulting in annoying distortion of the noise signal upon reproduction.
Many communication systems operate at bit rates that vary with any number of factors including total traffic on the communication system. For such variable rate communication systems, the inability to detect low bit rates and to handle the coding of noise at those lower bit rates in an effective manner often can result in perceptually inaccurate reproduction of the speech signal. This inaccurate reproduction could be avoided if a more effective method for encoding noise at those low bit rates were identified.
Additionally, the inability to determine the optimal encoding mode for a given noise signal at a given bit rate also results in an inefficient use of encoding resources. For a given speech signal having a particular noise component, the ability to selectively apply an optimal coding scheme at a given bit rate would provide more efficient use of an encoder processing circuit. Moreover, the ability to select the optimal encoding mode for a given type of noise signal would make fuller use of the available encoding resources while providing a more perceptually accurate reproduction of the noise signal.
A random codebook is implemented utilizing overlap in order to reduce storage space. This arrangement necessitates reference to a table or other index that lists the energies for each codebook vector. Accordingly, the table or other index, and the respective energy values, must be stored, thereby adding computational and storage complexity to such a system.
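As a rough illustration of why an overlapped codebook requires stored energies, consider the following sketch. The layout assumed here (one long table from which codevector k is the window of N samples starting at k times a fixed shift) and the names `overlapped_codevector` and `build_energy_table` are assumptions for illustration and are not specified in the disclosure.

```python
def overlapped_codevector(table, index, n, shift=1):
    """Return codevector `index` as a window of n samples from a shared table.

    Adjacent codevectors overlap by n - shift samples (assumed layout).
    """
    start = index * shift
    if index < 0 or start + n > len(table):
        raise IndexError("codevector index out of range")
    return table[start:start + n]


def build_energy_table(table, n, shift=1):
    """Precompute the energy (sum of squares) of every overlapped codevector."""
    count = (len(table) - n) // shift + 1
    return [sum(x * x for x in overlapped_codevector(table, k, n, shift))
            for k in range(count)]


if __name__ == "__main__":
    table = [1.0, 2.0, 3.0, 4.0]
    # Each window contains different samples, so each has a different energy
    # that must be stored or recomputed for every codevector.
    print(build_energy_table(table, n=2))
```

Because each window drops one sample and gains another, the energies generally differ from one codevector to the next, which is the source of the additional storage described above.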
The present invention re-uses each table codevector entry in a random table with “L” codevectors, each of dimension “N.” That is, for example, an exemplary codebook contains codevectors V0, V1, . . . , VL, with each codevector Vx being of dimension N, and having elements C0, C1, . . . CN−1, CN. Each codevector of dimension N is normalized to an energy value of unity, thereby reducing computational complexity to a minimum.
Each codebook entry essentially acts as a circular buffer, whereby N different random codebook vectors are generated by specifying a starting point at each different element of a given codevector. Each of the N different codevectors then has unity energy.
The dimension of each table entry is identical to the dimension of the required random codevector and every element in a particular table entry will be in any codevector derived from this table entry. This arrangement dramatically reduces the necessary storage capacity of a given system, while maintaining minimal computational complexity.
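The circular-buffer arrangement described above may be sketched as follows, purely as an illustration. The helper names `normalize`, `derive_codevector`, and `energy`, and the example entry values, are hypothetical and not taken from the disclosure.

```python
import math


def energy(vector):
    """Energy of a vector, taken here as the sum of squared elements."""
    return sum(x * x for x in vector)


def normalize(entry):
    """Scale a table entry so that its energy is unity."""
    scale = 1.0 / math.sqrt(energy(entry))
    return [x * scale for x in entry]


def derive_codevector(entry, start):
    """Read the entry as a circular buffer beginning at element `start`.

    The result has the same dimension as the entry and contains every
    element of the entry exactly once, only in rotated order.
    """
    n = len(entry)
    return [entry[(start + i) % n] for i in range(n)]


if __name__ == "__main__":
    entry = normalize([3.0, 4.0, 0.0, 0.0])  # illustrative values only
    for start in range(len(entry)):
        vector = derive_codevector(entry, start)
        # Rotation does not change the sum of squares, so no per-vector
        # energy table is needed.
        print(start, vector, round(energy(vector), 12))
```

Since every derived codevector is a rotation of the same normalized entry, its energy equals that of the entry, so a single normalization per table entry suffices for all N derived codevectors.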