Voice coders (vo-coders) are known in the art. In communication systems, the goal of any vo-coder is to encode a speech signal for transmission over a channel. Since communication channels are often quite limited in information carrying capacity (bandwidth), the amount of encoded information required for transmission is preferably minimized. Thus, the vo-coding process usually entails compressing the information signal by discarding redundant spectral elements (or other unnecessary information), while retaining only the information that, when transmitted to a receiver, allows the necessary components to be regenerated (or inferred), thereby permitting the synthesis of a perceptually acceptable recreation of the original speech input. Those skilled in the art will appreciate that a speech signal contains a large amount of redundant or unnecessary information.
Speech production can be modeled as an excitation signal (e.g., sound impulses generated by the vocal cords) driving a filter (e.g., the vocal tract), which possesses a certain resonant structure. The spoken sound changes with time since the excitation signal, the filter, or both vary with time. The excitation is noise-like for unvoiced sounds (e.g., certain consonants), and appears periodic for voiced sounds (e.g., vowels). Predominantly, and especially for voiced sounds, most of the essential speech energy is concentrated in only a few frequency sub-bands, and the particular sub-bands containing the most energy generally vary slowly over time. It has been found that transmitting only the information about these spectral peaks is normally sufficient to provide a reasonable reconstruction of the input speech. This approach forms the basis for the well-known digital adaptive sub-band vo-coder, which attempts to allocate a fixed number of bits among a plurality of spectral sub-bands, such that the accuracy of the reproduction of the speech signal component in the highest energy sub-bands is maximized.
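The bit allocation described above can be illustrated with a minimal sketch. It assumes a simple greedy rule that is not taken from any particular vo-coder: each bit goes to the sub-band with the largest remaining quantization-noise estimate, and each bit granted to a band is assumed to cut that estimate by a factor of four (roughly 6 dB per bit). The function name `allocate_bits` and the energy values are hypothetical.

```python
def allocate_bits(band_energies, total_bits):
    """Greedily distribute a fixed bit budget across spectral sub-bands.

    Assumption: each bit given to a band reduces its quantization-noise
    estimate by a factor of 4 (about 6 dB), a common rule of thumb.
    """
    bits = [0] * len(band_energies)
    # The noise estimate for each band starts at the band's energy.
    noise = list(band_energies)
    for _ in range(total_bits):
        # Give the next bit to the band with the largest remaining noise.
        k = max(range(len(noise)), key=lambda i: noise[i])
        bits[k] += 1
        noise[k] /= 4.0
    return bits


# Hypothetical sub-band energies: two strong spectral peaks, two weak bands.
energies = [100.0, 1.0, 30.0, 0.5]
allocation = allocate_bits(energies, total_bits=8)
print(allocation)  # [4, 1, 3, 0]
```

With these illustrative energies, the two high-energy sub-bands receive seven of the eight bits, and the weakest sub-band receives none.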
Because they are designed to minimize the data rate of the vo-coder, systems employing conventional low bit-rate digitally coded speech typically exhibit substantial degradation in audio quality relative to the original speech signal. The users of a radio communication system employing such a coder typically experience this degradation regardless of the receiver's distance from the transmitting antenna. Thus, whether the receiver is 25 miles or 25 yards from the transmitter antenna, the achieved audio quality remains essentially fixed so long as there are no bit errors; once bit errors occur, further degradation results.
Generally, contemporary vo-coder designers exploit the advantages of digital signal processing, such as, for example, the operational repeatability of digital filters, the immunity of digital circuits to variations due to aging, and the natural invariance of digital circuits to temperature, humidity, vibration, and other adverse conditions. However, contemporary methods for transmitting information from digital speech coders produce spectral inefficiencies, which can compromise the benefits achieved in removing the redundant speech information. For example, it is known that a high quality analog unprocessed speech signal occupies approximately 4 kHz of bandwidth. After digitization (via pulse code modulation (PCM)), the digital representation of this signal has a data rate of 64 kb/s, which occupies approximately 30 kHz of bandwidth (assuming the use of conventional binary channel modulation techniques). Even after considerable processing in a conventional sub-band coder to remove the less important spectral elements (thereby providing moderate audio quality at a rate as low as 10 kb/s), transmission of the speech signal using binary modulation still requires more bandwidth than the original analog signal.
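The figures above can be checked with a short calculation. The sketch below assumes conventional telephony PCM (sampling at twice the 4 kHz speech bandwidth, 8 bits per sample) and assumes that the occupied bandwidth of a binary-modulated signal scales linearly with bit rate, using the 64 kb/s to 30 kHz relationship stated in the text. All names are illustrative.

```python
ANALOG_BANDWIDTH_HZ = 4_000               # speech bandwidth, from the text
SAMPLE_RATE_HZ = 2 * ANALOG_BANDWIDTH_HZ  # Nyquist rate: 8000 samples/s
BITS_PER_SAMPLE = 8                       # assumption: conventional telephony PCM

pcm_rate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE  # 64000 b/s

PCM_CHANNEL_BANDWIDTH_HZ = 30_000  # from the text, binary modulation
# Assumption: bandwidth is proportional to bit rate for a fixed modulation.
hz_per_bps = PCM_CHANNEL_BANDWIDTH_HZ / pcm_rate_bps


def binary_bandwidth_hz(bit_rate_bps):
    """Estimate occupied bandwidth for binary modulation at a given bit rate."""
    return bit_rate_bps * hz_per_bps


subband_bandwidth_hz = binary_bandwidth_hz(10_000)
print(pcm_rate_bps)          # 64000
print(subband_bandwidth_hz)  # 4687.5
```

Under these assumptions, the 10 kb/s sub-band coded signal occupies roughly 4.7 kHz, which still exceeds the 4 kHz of the original analog signal.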
While traditional multi-level modulation techniques (where a channel symbol is used to code more than a single bit) can be utilized to reduce the necessary transmission bandwidth, this is done at the expense of robustness to channel impairments. It is essential that the reduced (minimized) number of transmitted bits be correctly received. For contemporary low bit rate speech coders, error rates even below one percent (due to noise or channel fading) may render an unprotected signal unintelligible. In conventional land mobile channels, achieving a sufficiently low error rate is especially difficult due to multipath fading. Thus, it is common practice to add error coding to the transmitted signal to permit error detection or correction of the bits representing the speech signal. However, the additional coding increases the number of transmitted bits, and therefore further reduces the spectral efficiency of the system. Some designers have attempted to compensate for this detriment by selectively encoding only a subset of the transmitted bits.
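The trade-offs described above can be made concrete with a small sketch. It relies on the standard relation that an M-level symbol carries log2(M) bits, and it models error coding as an (n, k) block code that sends n bits for every k information bits. The 200-bit frame, the 60 protected bits, and the rate-1/2 code are hypothetical values chosen only for illustration.

```python
import math


def symbol_rate(bit_rate_bps, levels):
    """Symbols per second needed when each symbol takes one of `levels` values."""
    return bit_rate_bps / math.log2(levels)


def frame_bits_after_coding(frame_bits, protected_bits, n, k):
    """Bits sent per frame when only `protected_bits` pass through an (n, k) code.

    The unprotected bits are sent as-is; each group of k protected bits
    expands to n transmitted bits.
    """
    return (frame_bits - protected_bits) + protected_bits * n / k


# Multi-level modulation lowers the symbol rate (and so the bandwidth).
binary_rate = symbol_rate(10_000, levels=2)      # 10000.0 symbols/s
four_level_rate = symbol_rate(10_000, levels=4)  # 5000.0 symbols/s

# Error coding raises the bit count; protecting a subset limits the overhead.
fully_coded = frame_bits_after_coding(200, 200, n=2, k=1)     # 400.0 bits
partially_coded = frame_bits_after_coding(200, 60, n=2, k=1)  # 260.0 bits
print(binary_rate, four_level_rate, fully_coded, partially_coded)
```

In this illustration, protecting every bit doubles the frame size, whereas protecting only the 60 selected bits adds 30 percent, at the cost of leaving the remaining bits exposed to channel errors.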
Therefore, a need exists in the art for a method of transmitting information that has been processed in a voice coder, which method meets the combined communication goals of reliability and maximum spectral efficiency while providing superior audio quality in the recovered signal.