The state of the art is believed to be represented by the following publications:
1. “Speech enhancement via frequency bandwidth extension using line spectral frequencies”, Chennoukh, S.; Gerrits, A.; Miet, G.; Sluijter, R.; IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001. Proceedings (ICASSP '01). Volume 1, 7-11 May 2001.
The abstract of the above publication states that it “contributes to narrowband speech enhancement by means of frequency bandwidth extension. A new algorithm is proposed for generating synthetic frequency components in the high-band (i.e., 4-8 kHz) given the low-band ones (i.e., 0-4 kHz) for wide-band speech synthesis. It is based on linear prediction (LPC) analysis-synthesis. It consists of a spectral envelope extension using efficiently line spectral frequencies (LSF) and a bandwidth extension of the LPC analysis residual using a spectral folding. The low-band LSF of the synthesis signal are obtained from the input speech signal and the high-band LSF are estimated from the low-band ones using statistical models. This estimation is achieved by means of four models that are distinguished by means of the first two reflection coefficients obtained from the input signal linear prediction analysis.”
2. “HMM-based frequency bandwidth extension for speech enhancement using line spectral frequencies”, Chen, G.; Parsa, V.; IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004. Proceedings (ICASSP '04).
The abstract of the above publication states: “A new hidden Markov model (HMM) based frequency bandwidth extension algorithm using line spectral frequencies (HMM-LSF-FBE) is proposed. The proposed algorithm improves the performance of the traditional LSF-based extension algorithm by exploiting an HMM to indicate the proper representatives of different speech frames, and by applying a minimum mean square-criterion to estimate the high-band LSF values. The proposed algorithm has been tested and compared to the traditional LSF-based algorithm in terms of the perceptual evaluation of speech quality (PESQ) objective measure and speech spectrograms. Simulation results show that the proposed algorithm outperforms the traditional method by eliminating undesired whistling sounds completely. In addition, the bandwidth extended speech signals created by the proposed algorithm are significantly more pleasant to the human ear than the original narrowband speech signals from which they are derived.”
3. “Bandwidth extension of narrowband speech using cepstral analysis”, Soon, I. Y.; Yeo, C. K.; Proceedings of Intelligent Multimedia, Video and Speech Processing, 2004. 20-22 Oct. 2004 Page(s): 242-245.
The abstract of the above publication states: “This paper describes a vector quantization based algorithm that extends the bandwidth of narrowband speech into wideband speech. Cepstral analysis is used to represent the spectral envelope information and the wideband excitation is generated using fullwave rectification with spectral whitening. Objective and subjective tests conducted show great improvement in speech quality over the original narrowband speech. The algorithm can be implemented as a postprocessor without the need for any side information.”
4. “Feature selection for improved bandwidth extension of speech signals”, Jax, P.; Vary, P.; IEEE International Conference on Acoustics, Speech, and Signal Processing, 2004. (ICASSP '04). Volume 1, 17-21 May 2004 Page(s): I-697-700 vol. 1.
The abstract of the above publication states: “The aim of artificial bandwidth extension (BWE) is to convert speech signals with “standard telephone” quality (frequencies up to 3.4 kHz) into 7 kHz wideband speech. The principal key to high quality BWE is the estimation of the spectral envelope of the wideband speech. In general, this estimation of the wideband spectral envelope is based on a number of features that are extracted from the narrowband input speech signal. We investigate potential features and evaluate their suitability for the BWE application. The quality of each feature is quantified in terms of the statistical measures of mutual information and separability. It turns out that the best BWE results are obtained by using a large feature “super-vector” which is subsequently reduced in dimension by a linear discriminant analysis. This solution also helps to reduce the computational complexity of the estimation of the wideband spectral envelope.”
5. “Artificial bandwidth extension of speech signals using MMSE estimation based on a hidden Markov model”, Jax, P.; Vary, P.; IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. (ICASSP '03). Volume 1, 6-10 Apr. 2003 Page(s): I-680-I-683 vol. 1.
The abstract of the above publication states: “We present an algorithm to derive 7 kHz wideband speech from narrowband “telephone speech”. A statistical approach is used that is based on a hidden Markov model (HMM) of the speech production process. A new method for the estimation of the wideband spectral envelope is proposed, using nonlinear state-specific techniques to minimize a mean square error criterion. In contrast to common memoryless estimation methods, additional information from adjacent signal frames can be exploited by utilizing the HMM. A consistent advantage of the new estimation rule is obtained compared to previously published HMM-based hard or soft classification.”
6. “Transformation of narrowband speech into wideband speech with aid of zero crossings rate”, Soon, I. Y.; Koh, S. N.; Yeo, C. K.; Ngo, W. H.; Electronics Letters, Volume 38, Issue 24, 21 Nov. 2002 Page(s): 1607-1608.
The abstract of the above publication states: “An innovative technique, for narrowband to wideband transformation of speech signals, is proposed. The zero crossings rate is used to adaptively control the gain of the synthesised upper band speech leading to significant performance improvement over an existing technique. Results are in fact comparable to more complex techniques. The technique can be implemented at the receiving end alone as it does not require any side information to be transmitted and can be easily implemented using finite impulse response digital filters.”
7. “Narrowband to wideband conversion of speech using GMM based transformation”, Kun-Youl Park; Hyung Soon Kim; IEEE International Conference on Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Volume 3, 5-9 Jun. 2000, Page(s): 1843-1846.
The abstract of the above publication states: “Reconstruction of wideband speech from its narrowband version is an attractive issue, since it can enhance the speech quality without modifying the existing communication networks. This paper proposes a new recovery method of wideband speech from narrowband speech. In the proposed method, the narrowband spectral envelope of input speech is transformed to a wideband spectral envelope based on the Gaussian mixture model (GMM), whose parameters are calculated by a joint density estimation technique. Then the lowband and highband speech signal is reconstructed by the LPC synthesizer using the reconstructed spectral envelope. This paper also proposes a codeword-dependent power estimation method. Both the objective and subjective test results shows that the proposed algorithm outperforms the conventional codebook mapping method.”
8. “Avoiding over-estimation in bandwidth extension of telephony speech”, Nilsson, M.; Kleijn, W. B.; IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001. (ICASSP '01). Volume 2, 7-11 May 2001 Page(s): 869-872.
The abstract of the above publication states: “We present a new way of treating the problem of extending a narrow-band signal to a wide-band signal. For many cases of bandwidth extension, the high-band energy is overestimated, leading to undesirable audible artifacts. To overcome these problems we introduce an asymmetric cost-function in the estimation process of the high-band that penalizes over-estimates more than under-estimates of the energy in the high-band. We show that the resulting attenuation of the estimated high-band energy depends on the broadness of the a-posteriori distribution of the energy given the extracted information about the narrow-band. Thus, the uncertainty about how to extend the signal at the high-band influences the level of extension. Results from a listening test show that the proposed algorithm produces less artifacts.”
9. “A new technique for wideband enhancement of coded narrowband speech”, Epps, J.; Holmes, W. H.; IEEE Workshop on Speech Coding Proceedings. 20-23 Jun. 1999, Page(s): 174-176.
The abstract of the above publication states: “Telephone speech is typically bandlimited to 4 kHz, resulting in a ‘muffled’ quality. Coding speech with a bandwidth greater than 4 kHz reduces this distortion, but requires a higher bit rate to avoid other types of distortion. An alternative to coding wider bandwidth speech is to exploit correlations between the 0-4 kHz and 4-8 kHz speech bands to re-synthesize wideband speech from decoded narrowband speech. This paper proposes a new technique for highband spectral envelope prediction, based upon codebook mapping with codebooks split by voicing. An objective comparison with several existing methods reveals that this new technique produces the smallest highband spectral distortion. Combined with a suitable highband excitation synthesis scheme, this envelope prediction scheme produces a significant quality improvement in speech that has been coded using narrowband standards.”
10. “Wideband speech recovery from bandlimited speech in telephone communications”, Yasukawa, H.; IEEE International Symposium on Circuits and Systems, 1998. ISCAS '98. Volume 4, 31 May-3 Jun. 1998 Page(s): 202-205, vol. 4.
The abstract of the above publication states: “This paper describes methods that can enhance the quality of speech signals that are severely band limited during regular telephone speech transmission. We have already proposed a spectrum widening method that utilizes aliasing in sampling rate conversion and digital filtering for spectrum shaping. This paper discusses the method using linear prediction. Speech components of the outbands of the received signal are basically generated by LPC (linear predictive coding) synthesis by analysis. Furthermore, we discuss a new spectrum widening method using a multilayer backpropagation neural network. It is shown that the proposed method has a good performance of recovering the wideband speech.”
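By way of illustration only, the spectral folding referred to in the abstract of publication 1 may be sketched as follows. The sketch assumes one common realization of spectral folding, namely upsampling by a factor of two through zero insertion without a subsequent anti-imaging filter, so that the 0-4 kHz spectrum is mirrored about 4 kHz into the 4-8 kHz band. The cited abstract does not specify this realization, and the function names `fold_spectrum` and `dft_magnitude`, the 1 kHz test tone, and the frame length are hypothetical choices made here. In the cited publication the folding is applied to the LPC analysis residual; a pure tone is used below only so that the mirrored component is easy to locate.

```python
import cmath
import math


def fold_spectrum(x):
    """Upsample by 2 via zero insertion (assumed realization of folding).

    No anti-imaging filter is applied, so the low-band spectrum reappears
    mirrored about the original Nyquist frequency.
    """
    y = [0.0] * (2 * len(x))
    y[::2] = x  # original samples at even indices, zeros in between
    return y


def dft_magnitude(x):
    """Magnitudes of DFT bins 0..N/2 (direct O(N^2) evaluation)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


# Illustrative parameters (assumptions, not taken from the cited work).
fs_narrow = 8000   # narrowband sampling rate in Hz
fs_wide = 16000    # wideband sampling rate in Hz
n = 64             # frame length in samples
tone_hz = 1000     # test tone inside the 0-4 kHz low band

narrow = [math.sin(2 * math.pi * tone_hz * t / fs_narrow) for t in range(n)]
wide = fold_spectrum(narrow)

mags = dft_magnitude(wide)
bin_hz = fs_wide / len(wide)

# The two strongest bins are the original tone and its folded image.
peaks = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2]
peak_hz = sorted(k * bin_hz for k in peaks)
print(peak_hz)
```

For the 1 kHz tone, the two strongest components of the folded signal lie at 1 kHz and at its mirror image about 4 kHz, namely 7 kHz. A practical system would then shape the high band with an estimated spectral envelope, as described in the cited abstracts.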
The disclosures of all publications and patent documents mentioned in the specification, and of the publications and patent documents cited therein directly or indirectly, are hereby incorporated by reference.