1. Field of the Invention
The present invention relates generally to speech coding. More particularly, the present invention relates to speech post-processing.
2. Background Art
Speech compression may be used to reduce the number of bits that represent the speech signal thereby reducing the bandwidth needed for transmission. However, speech compression may result in degradation of the quality of decompressed speech. In general, a higher bit rate will result in higher quality, while a lower bit rate will result in lower quality. However, modern speech compression techniques, such as coding techniques, can produce decompressed speech of relatively high quality at relatively low bit rates. In general, modern coding techniques attempt to represent the perceptually important features of the speech signal, without preserving the actual speech waveform. Speech compression systems, commonly called codecs, include an encoder and a decoder and may be used to reduce the bit rate of digital speech signals. Numerous algorithms have been developed for speech codecs that reduce the number of bits required to digitally encode the original speech while attempting to maintain high quality reconstructed speech.
FIG. 1 illustrates conventional speech decoding system 100, which includes excitation decoder 110, synthesis filter 120 and post-processor 130. As shown, decoding system 100 receives encoded speech bitstream 102 over a communication medium (not shown) from an encoder, where decoding system 100 may be part of a mobile communication device, a base station or other wireless or wireline communication device that is capable of receiving encoded speech bitstream 102. Decoding system 100 operates to decode encoded speech bitstream 102 and generate speech signal 132 in the form of a digital signal. Speech signal 132 may then be converted to an analog signal by a digital-to-analog converter (not shown). The analog output of the digital-to-analog converter may be received by a receiver (not shown) that may be a human ear, a magnetic tape recorder, or any other device capable of receiving an analog signal. Alternatively, a digital recording device, a speech recognition device, or any other device capable of receiving a digital signal may receive speech signal 132.
Excitation decoder 110 decodes encoded speech bitstream 102 according to the coding algorithm and bit rate of encoded speech bitstream 102, and generates decoded excitation 112. Synthesis filter 120 may be a short-term inverse prediction filter that generates synthesized speech 122 based on decoded excitation 112. Post-processor 130 may include filtering, signal enhancement, noise modification, amplification, tilt correction and other similar techniques capable of improving the perceptual quality of synthesized speech 122. Post-processor 130 may decrease the audible noise without noticeably degrading synthesized speech 122. Decreasing the audible noise may be accomplished by emphasizing the formant structure of synthesized speech 122 or by suppressing the noise in the frequency regions that are perceptually not relevant for synthesized speech 122.
Conventionally, post-processing of synthesized speech 122 is performed in the time domain using available LPC (Linear Prediction Coding) parameters. However, when such LPC parameters are not available, it is too costly, in terms of complexity and code size, to generate LPC parameters for the purpose of post-processing of synthesized speech 122. This is especially true for wideband post-processing of synthesized speech 122. Accordingly, there is a strong need in the art for a decoder post-processor that can perform efficiently and effectively without utilizing time domain post-processing based on LPC parameters.