Human speech contains frequencies up to 20 kHz, but current analog and digital communications systems that carry telephone traffic, as well as devices that store and play back speech, typically support only band-limited speech signals. In the case of telephony, the supported speech bandwidth, known as the voice-band, extends from 300 Hz to 3.4 kHz. This limited support of the voice spectrum degrades speech quality in a number of ways. Unvoiced sounds such as /s/ and /f/ have energies mostly above 4 kHz and therefore are highly attenuated. This leads to a significant loss of intelligibility, since unvoiced sounds are central to highly intelligible speech. The loss of intelligibility is even more pronounced if the listening environment itself is noisy. Speech signals that are limited to 4 kHz are often perceived as muffled and monotonous. Narrowband voice coders that are widely used in wireless networks, such as CELP (Code Excited Linear Prediction) and its derivatives, cause a further loss of brightness due to the noisy excitation signals kept in codebooks.
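The band-limiting effect described above can be sketched with a short example. This is an illustrative sketch only (it assumes Python with NumPy and SciPy, which are not part of the original disclosure): a Butterworth band-pass filter approximating the 300 Hz to 3.4 kHz voice-band strongly attenuates a test tone placed above the voice-band edge, just as telephony band-limiting attenuates high-frequency unvoiced sounds.

```python
import numpy as np
from scipy import signal

fs = 16000  # assumed wideband sampling rate, Hz
t = np.arange(0, 0.1, 1 / fs)
# Synthetic test tone at 5 kHz, above the 3.4 kHz voice-band edge,
# standing in for high-frequency unvoiced energy such as /s/ or /f/
x = np.sin(2 * np.pi * 5000 * t)

# 4th-order Butterworth band-pass approximating the 300 Hz - 3.4 kHz voice-band
sos = signal.butter(4, [300, 3400], btype="bandpass", fs=fs, output="sos")
y = signal.sosfilt(sos, x)

# Energy of the 5 kHz tone is strongly attenuated by the voice-band filter
ratio = np.sum(y**2) / np.sum(x**2)
print(ratio)  # well below 1
```

The filter order, sampling rate, and tone frequency here are arbitrary choices for illustration; any component outside the pass-band would be attenuated similarly.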
In the area of speech coding, many advances have been made in compressing and decompressing human speech, owing to the high degree of redundancy in a speech signal. The majority of the speech converters (such as, for example, encoders and decoders) developed to date (such as the ITU G. series) are designed to operate on 8 kHz sampled digital speech signals, implying a 4 kHz bandwidth. Some wideband coders, such as G.722, operate on 16 kHz sampled digital signals, where the supported bandwidth is 8 kHz wide.
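The relationship between sampling rate and bandwidth noted above (8 kHz sampling implies a 4 kHz bandwidth, by the Nyquist criterion) can be illustrated with a brief sketch. This is a hypothetical example, assuming Python with NumPy and SciPy: decimating a 16 kHz wideband signal to 8 kHz discards any component above 4 kHz, because such a component cannot be represented at the lower rate.

```python
import numpy as np
from scipy import signal

fs_wide = 16000   # wideband: 16 kHz sampling, up to 8 kHz bandwidth
fs_narrow = 8000  # narrowband: 8 kHz sampling, up to 4 kHz bandwidth

t = np.arange(0, 0.1, 1 / fs_wide)
# Two tones: one inside the narrowband range (1 kHz), one above it (6 kHz)
x = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 6000 * t)

# Decimating by 2 (resample_poly applies an anti-alias filter) removes
# the 6 kHz component: it cannot be represented at an 8 kHz sampling rate
y = signal.resample_poly(x, up=1, down=2)

spectrum = np.abs(np.fft.rfft(y))
freqs = np.fft.rfftfreq(len(y), d=1 / fs_narrow)
print(freqs[np.argmax(spectrum)])  # dominant component near 1000 Hz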
The quality difference between 8 KHz bandwidth, referred to here as wideband, and the 4 KHz bandwidth speech, referred to here as narrowband, is significant. A wideband speech communication typically is of higher quality than a narrowband speech communication, as a result of the increased bandwidth of the wideband communication. Similarly, a broadband speech communication typically is of higher quality than a wideband speech communication. Such a quality difference between narrowband speech signals, on one hand, and either wideband or broadband speech signals, on the other hand, becomes significant in circumstances where, for example, a communications device that is capable of communicating a higher-quality wider bandwidth speech communication receives as an input a lower-quality narrower bandwidth speech communication. Such narrower bandwidth speech communication may be band limited as a result of upstream voice coders or other band-limiting influences. Ordinarily in circumstances of this sort, when a wider bandwidth device receives as an input only a narrower bandwidth speech communication, the higher quality speech communication capabilities of the wider bandwidth device are not utilized. The inventor of the present invention has recognized the opportunities presented by this underutilization of wider bandwidth device capabilities.
Various methods have been described in the past in an effort to help address the issue of quality disparity between narrower bandwidth speech communications and wider bandwidth devices. These methods include, for instance, linear predictive coding (LPC), auto-regressive modeling, spectral analysis, and Gaussian Mixture Model (GMM) modeling. These methodologies, however, each have one or more shortcomings or other drawbacks, and certain of the shortcomings or drawbacks may be common to more than one methodology. Examples of such shortcomings or other drawbacks include, without limitation: the methodology introduces objectionable artifacts into the signal; the methodology in the past has failed to adequately account for noise that is present in the communication in combination with the desired speech; the methodology, at least if it is a statistical methodology, may require training on a corpus of speech vectors leading to statistical models with language dependency problems; the methodology makes use of highly complex algorithmic solutions which, because of associated increased power requirements, are not well-suited for battery-powered devices such as a cellular handset; and/or the methodology uses large codebooks and feature vectors (such as, for example, those that may be extracted from a narrowband speech signal), thereby requiring significant memory utilization. As a result, the communications industry still lacks a compelling solution.
Furthermore, quality issues related to speech communications are not confined to the afore-mentioned distinction between the amount of bandwidth that narrower bandwidth speech communications support as compared to the higher bandwidth capabilities of wider bandwidth devices. In other words, aside from whether there is any increased bandwidth opportunity for a given bandwidth-limited speech signal, a speech communication of a given bandwidth can be or become degraded or otherwise lacking in quality. Indeed, one or more components of the supported speech communication frequency spectrum of a given speech communication may be, for example, missing, degraded or otherwise subject to unwanted artifacts. Such a condition is not necessarily limited to narrowband speech communications, but rather might also be found to occur in wideband or even broadband speech communications. The result may be a speech communication of diminished quality as compared against the quality potential that the bandwidth of the given speech communication is otherwise capable of supporting.