Digital telephone systems have traditionally relied on standardized speech encoding and decoding procedures with fixed sampling rates in order to ensure compatibility between arbitrarily selected transmitter-receiver pairs. The evolution of second generation digital cellular networks and their functionally enhanced terminals has resulted in a situation where full one-to-one compatibility regarding sampling rates cannot be guaranteed, i.e. the speech encoder in the transmitting terminal may use an input sampling rate which is different from the output sampling rate of the speech decoder in the receiving terminal. Also, the linear prediction or LP analysis of the original speech signal may be performed on a signal that has a narrower frequency band than the actual input signal because of complexity restrictions. The speech decoder of an advanced receiving terminal must be able to generate an LP filter with a wider frequency band than that used in the analysis, and to produce a wideband output signal from narrowband input parameters. The generation of a wideband LP filter from existing narrowband information also has wider applicability.
FIG. 1 illustrates a known principle for converting a narrowband encoded speech signal into a wideband decoded sample stream that can be used in speech synthesis with a high sampling rate. At the transmitting end an original speech signal has been subjected to low-pass filtering (LPF) in block 101. The resulting signal on a low frequency sub-band has been encoded in a narrowband encoder 102. At the receiving end the encoded signal is fed into a narrowband decoder 103, the output of which is a sample stream representing the low frequency sub-band with a relatively low sampling rate. In order to increase the sampling rate, the signal is taken into a sampling rate interpolator 104.
The higher frequencies that are missing from the signal are estimated by taking the LP filter (not separately shown) from block 103 and using it to implement an LP filter as a part of a vocoder 105 which uses a white noise signal as its input. In other words, the frequency response curve of the LP filter in the low frequency sub-band is stretched in the direction of the frequency axis to cover a wider frequency band in the generation of a synthetically produced high frequency sub-band. The power of the white noise is adjusted so that the power of the vocoder output is appropriate. The output of the vocoder 105 is high-pass filtered (HPF) in block 106 in order to prevent excessive overlapping with the actual speech signal on the low frequency sub-band. The low and high frequency sub-bands are combined in the summing block 107 and the combination is taken to a speech synthesizer (not shown) for generating the final acoustic output signal.
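The noise-excited synthesis of the upper branch can be illustrated with a minimal sketch. It assumes an all-pole LP filter of the form 1/A(z) with A(z) = 1 + a1 z^-1 + ... + ap z^-p, and it adjusts the power by scaling the filtered output, which for a linear filter is equivalent to scaling the white noise input. The function names, the coefficient value and the target power are hypothetical and chosen for illustration only; the high-pass filter 106 and the summing block 107 are not modelled.

```python
import random


def all_pole_filter(excitation, lp_coeffs):
    """Run excitation through 1/A(z), where A(z) = 1 + sum_k a_k z^-k."""
    out = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out


def mean_power(signal):
    """Average of the squared samples."""
    return sum(s * s for s in signal) / len(signal)


def noise_vocoder_frame(lp_coeffs, n_samples, target_power, seed=0):
    """Shape white noise with the LP filter and scale it to target_power."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    shaped = all_pole_filter(noise, lp_coeffs)
    gain = (target_power / mean_power(shaped)) ** 0.5
    return [gain * s for s in shaped]


# Hypothetical first-order filter with a single stable pole at z = 0.9.
frame = noise_vocoder_frame([-0.9], n_samples=256, target_power=2.0)
```

How the appropriate target power is determined is not specified here; in the sketch it is simply passed in as a parameter.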
We may consider an exemplary situation where the original sampling rate of the speech signal was 12.8 kHz and the sampling rate at the output of the decoder should be 16 kHz. The LP analysis has been performed for frequencies from 0 to 6400 Hz, i.e. from zero to the Nyquist frequency which is one half of the original sampling rate. Consequently the narrowband decoder 103 implements an LP filter the frequency response of which spans from 0 to 6400 Hz. In order to generate the high frequency sub-band, the frequency response of the LP filter is stretched in the vocoder 105 to cover a frequency band from 0 to 8000 Hz, where the upper limit is now the Nyquist frequency regarding the desired higher sampling rate.
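The stretching described above can be checked numerically with a small sketch. It assumes that the stretching results from evaluating the same all-pole LP coefficients at the higher sampling rate, so that the response at a wideband frequency equals the narrowband response at that frequency scaled by 12.8/16. The coefficient values and function names are hypothetical.

```python
import cmath
import math


def lp_magnitude(lp_coeffs, freq_hz, fs_hz):
    """Magnitude of 1/A(z) at freq_hz, with A(z) = 1 + sum_k a_k z^-k."""
    omega = 2.0 * math.pi * freq_hz / fs_hz
    a_of_z = 1 + sum(
        a * cmath.exp(-1j * omega * k) for k, a in enumerate(lp_coeffs, start=1)
    )
    return 1.0 / abs(a_of_z)


def narrowband_source_freq(freq_wide_hz, fs_narrow_hz, fs_wide_hz):
    """Narrowband frequency whose response lands at freq_wide_hz after stretching."""
    return freq_wide_hz * fs_narrow_hz / fs_wide_hz


coeffs = [-1.2, 0.8]  # hypothetical stable second-order LP filter
fs_narrow, fs_wide = 12800.0, 16000.0

for f_wide in (0.0, 4000.0, 8000.0):
    f_narrow = narrowband_source_freq(f_wide, fs_narrow, fs_wide)
    # Same coefficients, different sampling rate: the two magnitudes coincide.
    print(f_wide, f_narrow,
          lp_magnitude(coeffs, f_wide, fs_wide),
          lp_magnitude(coeffs, f_narrow, fs_narrow))
```

Under this assumption the upper band edge of 8000 Hz at the 16 kHz rate corresponds to 6400 Hz at the 12.8 kHz rate, in agreement with the Nyquist frequencies given above.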
A certain degree of overlap is usually desirable, although not necessary, between the low and high frequency sub-bands; the overlap may help to achieve optimal subjective audio quality. Let us assume that an overlap of 10% (i.e. 800 Hz) is aimed at. This means that in the narrowband decoder 103 the whole frequency response of 0 to 6400 Hz (i.e. 0 to 0.5 Fs with the sampling rate Fs=12.8 kHz) of the LP filter is used, and in the vocoder 105 effectively only the frequency response of 5600 to 8000 Hz (i.e. 0.35 Fs to 0.5 Fs with the sampling rate Fs=16 kHz) of the LP filter is used. Here “effectively” means that because of the high pass filter 106, the lower end of the frequency response does not have an effect on the output of the upper signal processing branch. The frequency response of the wideband LP filter in the range of 5600 to 8000 Hz is a stretched copy of the frequency response of the narrowband LP filter in the range of 4480 to 6400 Hz, each wideband frequency corresponding to a narrowband frequency scaled by the factor 6400/8000 = 0.8.
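The band limits of the example can be reproduced with a few lines of arithmetic. The following sketch uses only the figures given above (sampling rates of 12.8 kHz and 16 kHz, overlap of 800 Hz); the function and key names are hypothetical.

```python
def sub_band_edges(fs_narrow_hz, fs_wide_hz, overlap_hz):
    """Band limits of the high frequency sub-band and of its narrowband source."""
    nyquist_narrow = fs_narrow_hz / 2.0
    nyquist_wide = fs_wide_hz / 2.0
    stretch = nyquist_wide / nyquist_narrow      # factor along the frequency axis
    high_start = nyquist_narrow - overlap_hz     # lower limit of the high sub-band
    return {
        "high_band_hz": (high_start, nyquist_wide),
        "high_band_fraction_of_fs": (high_start / fs_wide_hz, 0.5),
        "narrowband_source_hz": (high_start / stretch, nyquist_narrow),
    }


edges = sub_band_edges(12800.0, 16000.0, 800.0)
print(edges["high_band_hz"])          # limits of the synthetic high sub-band
print(edges["narrowband_source_hz"])  # narrowband portion that gets stretched
```

With these inputs the high frequency sub-band spans 5600 to 8000 Hz and its narrowband source spans 4480 to 6400 Hz, as stated above.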
The drawbacks of the prior art arrangement become noticeable in a situation where the frequency response of the narrowband LP filter has a peak in its upper region, close to the original Nyquist frequency. FIG. 2 illustrates such a situation. The thin curve 201 represents the frequency response of a 0 to 8000 Hz LP filter which would be used in the analysis of a speech signal with a sampling rate of 16 kHz. The thick curve 202 represents the combined frequency response that the arrangement of FIG. 1 would produce. The dashed lines 203 and 204 at 4480 Hz and 6400 Hz respectively delimit the portion of the frequency response of a narrowband LP filter that gets copied and stretched into the 5600 Hz to 8000 Hz interval in the wideband LP filter implemented in the vocoder. A peak at approximately 4400 Hz in the narrowband frequency response and the continuous downhill therefrom towards the upper limit of the frequency band cause the combined frequency response curve 202 to differ remarkably from the frequency response 201 of an ideal wideband LP filter.
Various prior art arrangements are known for complementing the principle of FIG. 1 to overcome the above-presented drawback. The patent publication U.S. Pat. No. 5,978,759 discloses an apparatus for expanding narrowband speech to wideband speech by using a codebook or look-up table. A set of parameters characteristic of the narrowband LP filter is extracted and taken as a search key to a look-up table so that the characteristic parameters of the corresponding wideband LP filter can be read from a matching or nearly matching entry in the look-up table. A similar solution is known from the patent publication JP 10124089A. A slightly different approach is known from the patent publication U.S. Pat. No. 5,455,888, where the higher frequencies are generated by using a filter bank which, however, is selected by using a kind of look-up table. The patent publication U.S. Pat. No. 5,581,652 proposes the reconstruction of wideband speech from narrowband speech by using codebooks so that the waveform nature of the signals is exploited. Further, the published international patent application WO 99/49454A1 discloses a method where a speech signal is transformed into the frequency domain, the characteristic peaks of the frequency domain signal are identified and a set of wideband filter parameters is selected on the basis of a conversion table.
The use of a look-up table in searching for the characteristics of a suitable wideband filter may help to avoid failures of the kind shown in FIG. 2, but simultaneously it involves a considerable degree of inflexibility. Either only a limited number of possible wideband filters may be implemented or a very large memory must be allocated solely for this purpose. Increasing the number of stored wideband filter configurations to choose from also increases the time that must be allocated for searching for and setting up the right one of them, which is not desirable in real-time operation such as speech telephony.
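The trade-off can be made concrete with a minimal sketch of a look-up of the "matching or nearly matching entry" kind. It is not the method of any of the cited publications; it assumes an exhaustive nearest-neighbour search with a squared Euclidean distance, and the parameter vectors and entry labels are hypothetical. Both the memory consumption and the number of distance computations grow in proportion to the number of stored entries.

```python
def nearest_wideband_entry(narrow_params, codebook):
    """Return the wideband entry whose narrowband key is closest to narrow_params.

    codebook is a list of (narrowband_key, wideband_params) pairs; every call
    visits all entries, so the search time grows with the table size.
    """
    best_entry = None
    best_distance = float("inf")
    for key, wide_params in codebook:
        distance = sum((p - q) ** 2 for p, q in zip(narrow_params, key))
        if distance < best_distance:
            best_distance = distance
            best_entry = wide_params
    return best_entry


# Hypothetical two-entry table; a practical table would be far larger.
codebook = [
    ((0.1, 0.2), "wideband filter A"),
    ((0.8, 0.9), "wideband filter B"),
]
print(nearest_wideband_entry((0.75, 0.95), codebook))
```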