This invention relates to postfilters for postfiltering audio signals, especially speech signals, and to methods of postfiltering these signals. More specifically, it relates to short-delay postfilters for postfiltering audio signals, especially speech signals, and to methods of postfiltering these signals with a short-delay postfilter.
Postfilters are generally used to mask noise in speech signals by enhancing strong spectral parts and/or by suppressing weak regions in the signals. For example, such noise may arise where analogue speech signals are sampled for encoding into a digital representation, as happens before transmission of the speech signals in a mobile telecommunications system, or during subsequent decoding of a previously encoded signal. Very often, such encoding or decoding will also involve compression of the signal data during the encoding procedure, with subsequent decompression, as appropriate, during decoding. The loss of some information contained in the original analogue audio signal is therefore inevitable in the case of compression and decompression, and the application of a postfilter to improve the perceived quality of the decoded signals is desirable. A postfilter may be applied to the encoded audio signals, to the decoded audio signals, or to both to achieve this improvement.
Three main types of postfilter may be distinguished. These are known respectively as short-delay (or short-term) postfilters, long-delay (or long-term) postfilters and high-frequency emphasis (or high-pass) postfilters. Short-delay postfilters generally work by enhancing regions of the frequency spectrum of an audio signal in which there is much energy, in order to decrease distortion in the valleys of the frequency spectrum. Long-delay postfilters generally work by enhancing regions of the frequency spectrum showing long-term periodicity corresponding to the pitch or audio frequency of the original signal. High-frequency emphasis postfilters are used to enhance high frequency regions of a signal frequency spectrum, and hence to restore brightness to the signal, since low frequency regions are generally amplified more in relation to high frequency regions during coding and decoding. A high-frequency emphasis postfilter may also be used to compensate for high-frequency losses created by the application of a short-delay postfilter. The three main types of postfilter just described may be applied individually to audio signals, or in a combination of two of the three types of postfilter, or in a combination of all three types together for optimal improvement in the perceived quality of the audio signals.
As mentioned above, the present invention relates specifically to short-delay postfilters and to methods of postfiltering audio signals, especially speech signals, with short-delay postfilters. The effect of a short-delay postfilter upon an audio signal may be represented by a transfer function P(z) expressed in terms of filter coefficients and the variable z, where z is the inverse of the unit delay operator $z^{-1}$ used in the z-transform representation of transfer functions. Furthermore, a production filter for generating coded audio signals may be represented by a transfer function H(z) also expressed in terms of filter coefficients and the variable z. As shown in the accompanying figure (FIGURE), in generating a coded audio signal, an excitation generator 11 is used to provide an excitation signal E(z) to a production filter 12. The production filter 12 transforms the excitation signal E(z) into a synthetic audio signal S(z) according to the transfer function H(z) of the production filter. As also shown in the FIGURE, the synthetic audio signal S(z) thus produced may subsequently be supplied, either immediately or following transmission and decoding, to a postfilter 13, which transforms the synthetic audio signal S(z) according to the transfer function P(z) of the postfilter to generate a postfiltered audio signal $S_p(z)$.
The transfer function H(z) of the production filter 12 is often of the type:

$$H(z) = \frac{1}{A(z)} \qquad \text{[Eqn. 1]}$$

where A(z) is a polynomial expressible as:

[Eqn. 2]

where m is an index ranging from 1 up to $M_a$, the order of the polynomial, $a_m$ are the coefficients of the polynomial and z is the variable, as before. $M_a$, the order of the polynomial, is typically from 8 to 10.
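The all-pole relationship of Eqn. 1 can be illustrated with a minimal sketch. Since the body of Eqn. 2 is not reproduced in this text, the sketch assumes the sign convention $A(z) = 1 + \sum_{m=1}^{M_a} a_m z^{-m}$; the opposite sign convention is equally common and would simply flip the sign of each coefficient. The function name `synthesize` and the first-order example coefficient are hypothetical and chosen only for illustration.

```python
def synthesize(excitation, a):
    """Filter an excitation sequence through H(z) = 1/A(z).

    Assumed convention (not taken from the source text):
    A(z) = 1 + sum_{m=1..M} a[m-1] * z**-m,
    so that s[n] = e[n] - sum_m a[m-1] * s[n-m].
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for m, a_m in enumerate(a, start=1):
            if n - m >= 0:
                acc -= a_m * out[n - m]  # feedback from past outputs
        out.append(acc)
    return out


# Impulse response of a hypothetical first-order filter, A(z) = 1 - 0.5 z^-1
print(synthesize([1.0, 0.0, 0.0, 0.0], [-0.5]))  # [1.0, 0.5, 0.25, 0.125]
```

In a practical coder the coefficient list would hold $M_a$ values (typically 8 to 10, as stated above) rather than one.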
U.S. Pat. No. 4,969,192, assigned to Voicecraft, Inc. of Goleta, Calif., USA, describes using the same polynomial A(z) of Eqn. 2 used in the production filter 12 to provide the denominator and the numerator of a transfer function P(z) for the short-delay postfilter 13. Accordingly, the denominator term of such a transfer function emphasizes the formants in the frequency spectrum of the synthetic audio signal S(z) provided by the production filter, whilst attenuating the valleys in the frequency spectrum, as desired. Being of the same form as the denominator term, the numerator term of such a short-delay transfer function aims to cancel out the overall shape of the frequency spectrum resulting from the denominator term.
In U.S. Pat. No. 4,969,192, the denominator and numerator terms of the short-delay transfer function P(z) are modified from the polynomial A(z) of the corresponding production filter transfer function by respective chirp factors, which are empirically determined parameters, $\alpha$ and $\beta$, thus:

$$P(z) = \frac{A_P(z/\beta)}{A_P(z/\alpha)} \qquad \text{[Eqn. 3]}$$
where $\alpha$ and $\beta$ are defined by $0 < \beta < \alpha < 1$. These chirp factors, $\alpha$ and $\beta$, may accordingly be used to move the poles and zeros of the transfer function of Eqn. 3 towards the origin. Setting $\alpha = 1$ or $\beta = 1$ makes the denominator or numerator term, respectively, the same as A(z), whilst setting $\alpha = 0$ reduces the denominator term to unity, giving it a flat, all-pass response. The short-delay transfer function of Eqn. 3 provides some trade-off between spectral peaks so sharp as to produce readily perceptible and hence undesirable chirping and peaks so low as not to achieve any noise reduction at all. U.S. Pat. No. 4,969,192 therefore suggests using values of $\alpha = 0.8$ and $\beta = 0.5$ to achieve a compromise between these two extremes, whereby spectral tilt introduced by the denominator term is partially cancelled by the numerator term. However, filtered audio signals resulting from the transfer function of Eqn. 3 remain muffled, requiring a high-frequency emphasis filter to compensate for the high-frequency losses introduced by a short-delay postfilter having such a transfer function. Moreover, since the numerator polynomial of Eqn. 3 does not track the denominator polynomial precisely, the overall spectral tilt of the short-delay postfilter wanders over time, producing a perceived variation in the postfiltered signal brightness.
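The effect of the chirp factors in Eqn. 3 can be sketched as follows. Replacing z by z/γ scales the m-th coefficient of the polynomial by γ to the power m, which is what pulls the roots towards the origin. The sketch assumes the convention $A(z) = 1 + \sum_m a_m z^{-m}$ (an assumption, since the body of Eqn. 2 is not reproduced here), and the names `chirp` and `short_delay_postfilter` are hypothetical. The default values 0.8 and 0.5 are those quoted above from U.S. Pat. No. 4,969,192.

```python
def chirp(a, gamma):
    """Coefficients of A(z/gamma): the m-th coefficient is scaled by gamma**m."""
    return [a_m * gamma ** m for m, a_m in enumerate(a, start=1)]


def short_delay_postfilter(s, a, alpha=0.8, beta=0.5):
    """Filter s through P(z) = A(z/beta) / A(z/alpha) in direct form.

    Assumed convention: A(z) = 1 + sum_{m=1..M} a[m-1] * z**-m.
    """
    num = chirp(a, beta)   # numerator (zeros), A(z/beta)
    den = chirp(a, alpha)  # denominator (poles), A(z/alpha)
    out = []
    for n in range(len(s)):
        acc = s[n]
        for m in range(1, len(a) + 1):
            if n - m >= 0:
                acc += num[m - 1] * s[n - m]    # feed-forward part
                acc -= den[m - 1] * out[n - m]  # feedback part
        out.append(acc)
    return out


# Hypothetical first-order example: A(z) = 1 - 0.9 z^-1 has its root at z = 0.9.
a = [-0.9]
print(chirp(a, 0.8))  # the root of A(z/0.8) lies at 0.9 * 0.8
print(short_delay_postfilter([1.0, 0.0, 0.0, 0.0], a))
```

With equal chirp factors the numerator and denominator cancel and the filter passes its input unchanged, which is one way to check an implementation of this kind.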
U.S. Pat. No. 5,241,650, assigned to Motorola Inc. of Schaumburg, Ill., attempts to improve upon the short-delay postfilter transfer function of U.S. Pat. No. 4,969,192 described in Eqn. 3, above. The short-delay postfilter transfer function described in U.S. Pat. No. 5,241,650 uses the same denominator term as in the transfer function of the corresponding production filter, but in contrast to U.S. Pat. No. 4,969,192, the numerator term is derived from the denominator term by (a) transforming the denominator term to an alternate domain set of parameters, (b) operating on the alternate domain set of parameters to provide a set of coefficients, and then (c) using this set of coefficients to provide the numerator term. In one embodiment of U.S. Pat. No. 5,241,650, the denominator term is transformed into the autocorrelation domain. In this alternate domain, a spectral smoothing technique making use of a bandwidth expansion function is used to operate on the autocorrelation sequence of the filter coefficients, before the set of coefficients for the numerator term is then provided from the operated-on autocorrelation sequence via the Levinson recursion.
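Steps (b) and (c) of the autocorrelation-domain embodiment can be sketched roughly as follows. This is not the procedure of U.S. Pat. No. 5,241,650 itself: the Gaussian lag window stands in for the unspecified bandwidth expansion function, the autocorrelation sequence is taken as given rather than derived from the denominator term, and the names `lag_window`, `levinson` and `smoothed_numerator` are hypothetical. The convention $A(z) = 1 + \sum_m a_m z^{-m}$ is again assumed.

```python
import math


def lag_window(r, width):
    """Taper the autocorrelation r[0..M] with a Gaussian lag window.

    Multiplying lags by a window smooths the corresponding spectrum.
    The Gaussian shape and the 'width' parameter are illustrative assumptions.
    """
    return [r_k * math.exp(-0.5 * (width * k) ** 2) for k, r_k in enumerate(r)]


def levinson(r, order):
    """Levinson recursion: coefficients a[0..order-1] of
    A(z) = 1 + sum_m a[m-1] * z**-m from autocorrelation r[0..order]."""
    a = [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i]
        for j in range(1, i):
            acc += a[j - 1] * r[i - j]
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j - 1] = a[j - 1] + k * a[i - j - 1]
        new_a[i - 1] = k
        a = new_a
        err *= 1.0 - k * k                  # updated prediction error
    return a


def smoothed_numerator(r, order, width):
    """Numerator coefficients from a lag-windowed autocorrelation sequence."""
    return levinson(lag_window(r, width), order)


# Hypothetical autocorrelation r[k] = 0.5**k of a first-order process
r = [1.0, 0.5, 0.25]
print(levinson(r, 2))                  # unsmoothed coefficients
print(smoothed_numerator(r, 2, 0.5))   # smoothed, flatter-spectrum coefficients
```

A wider window (larger `width`) shrinks the higher lags more strongly and so yields a flatter numerator spectrum. The actual smoothing function would have to be taken from the cited patent.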
U.S. Pat. No. 5,241,650 describes how the numerator term may alternatively be derived directly from the transfer function of the corresponding production filter via the same procedure, rather than from the denominator term of the short-delay postfilter, but since the denominator term only differs from the polynomial used in the production filter by a chirp factor, the effect is the same. The result in both cases is that the numerator polynomial is a spectrally smoothed version of the denominator polynomial, $A_P(z/\alpha)$.
The short-delay postfilter described in U.S. Pat. No. 5,241,650 is used in the Personal Digital Cellular (PDC) telecommunications system, as described in the PDC telecommunications system RCR standard, "RCR STD-27" of the Research and Development Centre for Radio Systems (RCR) of June 1995. It is also used in mobile telecommunications systems conforming to the IS-54 standard, as described in "Cellular System: Dual-Mode Mobile Station-Base Station Compatibility Standard IS-54" of the Electronic Industries Association (EIA) of December 1989.
A short-delay postfilter according to U.S. Pat. No. 5,241,650 improves upon the time-varying spectral tilt of a short-delay postfilter according to U.S. Pat. No. 4,969,192 by providing a numerator polynomial for the short-delay postfilter transfer function which is a spectrally smoothed version of the denominator polynomial therein. However, a problem still remains. Since the numerator term in U.S. Pat. No. 5,241,650 is derived either from the denominator of the same transfer function or from the transfer function of the corresponding production filter, the spectral slope of the postfiltered audio signal may still change too abruptly to eliminate perceptible modulations in the brightness of the postfiltered signal.