The present invention relates to enhancing the quality of speech in a noisy telecommunications channel or network, and particularly to an apparatus which enhances the speech by measuring the noise from the speech portions of the transmission itself and then removing the detected noise.
In all forms of voice communication systems, noise from a variety of causes can interfere with the user's communications. Corrupting noise can occur with speech at the input of a system, in the transmission path(s), and at the receiving end. The presence of noise is annoying or distracting to users, can adversely affect speech quality, and can reduce the performance of speech coding and speech recognition apparatus.
Noise in the transmission path is particularly difficult to overcome, one reason being that the noise signal cannot be measured directly at its source. It therefore cannot be suppressed by generating an "error" signal from a direct measurement of the noise and then canceling out that signal by phase inversion.
Various approaches to enhancing a noisy speech signal when the noise component is not directly observable have been attempted. A review of these techniques is found in "Enhancement and Bandwidth Compression of Noisy Speech," by J. S. Lim and A. V. Oppenheim, Proceedings of the IEEE, Vol. 67, No. 12, December 1979, Section V, pages 1586-1604. These include spectral subtraction of the estimated noise amplitude spectrum from the whole spectrum computed for the available noisy signal, and an iterative model-based filter proposed by Lim and Oppenheim which attempts to find the best all-pole model of the speech component given the total noisy signal and an estimate of the noise power spectrum. The model-based approach was used in "Constrained Iterative Speech Enhancement with Application to Speech Recognition," by J. H. L. Hansen and M. A. Clements, IEEE Transactions on Signal Processing, Vol. 39, No. 4, April 1991, pages 795-805, to develop a non-real-time speech smoother, in which additional constraints were imposed on the Lim/Oppenheim method during the iterations so that the model retains the characteristics of speech.
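By way of illustration only, the spectral subtraction technique mentioned above can be sketched for a single frame as follows. This is a minimal sketch under stated assumptions and is not the method of any cited reference: the function names (dft, idft, spectral_subtract), the 32-sample frame, the test tones, and the zero spectral floor are all hypothetical choices made for the example, and the noise magnitude estimate is taken from the known noise signal, which would not be available in practice.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def spectral_subtract(noisy_frame, noise_mag, floor=0.0):
    """Subtract an estimated noise magnitude spectrum from one noisy frame.

    The noisy phase is kept unchanged; any magnitude that would fall below
    `floor` after subtraction is clamped to `floor` (an assumed choice).
    """
    cleaned = []
    for bin_value, noise in zip(dft(noisy_frame), noise_mag):
        magnitude = max(abs(bin_value) - noise, floor)
        cleaned.append(cmath.rect(magnitude, cmath.phase(bin_value)))
    return idft(cleaned)


# Contrived example: a "speech" tone in bin 4 plus a "noise" tone in bin 10.
N = 32
speech = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
noise = [0.3 * math.cos(2 * math.pi * 10 * t / N) for t in range(N)]
noisy = [s + v for s, v in zip(speech, noise)]

# Assumption: an ideal noise magnitude estimate, computed from the noise itself.
noise_mag = [abs(b) for b in dft(noise)]

enhanced = spectral_subtract(noisy, noise_mag)
residual = max(abs(e - s) for e, s in zip(enhanced, speech))
print(f"largest residual error after subtraction: {residual:.2e}")
```

In this contrived case the noise occupies bins that carry no speech energy and the estimate is exact, so the subtraction recovers the speech tone almost perfectly. With a real signal, the noise spectrum overlaps the speech spectrum and must itself be estimated, which is the difficulty addressed in the discussion that follows.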
Many noise detection techniques rely on detecting noise in the gaps between speech, where the noise is the predominant signal. Thus, these techniques are easily employed in transmission systems in which both speech and gaps generated at the sender's end traverse the system. However, in the context of transmission systems that employ Circuit Multiplication Equipment (CME), such as satellite transmission systems, a unique problem arises. CME transmissions involve the sending of speech portions only. The gap portions are stripped away from the original signal by a speech detection algorithm. The gaps are eliminated so as to maximize the use of the available bandwidth in the satellite link. Thus, at the receiving end of the long distance transmission, the original speech gaps, which contained useful noise information and which were commonly used for measuring the noise to be filtered from the speech portions, no longer exist. Instead, the receiving equipment inserts a different noise, referred to as fill noise. This fill noise adds a further level of complexity to the noise measurement problem.
Therefore, in the context of transmission systems where only speech portions are transmitted, it is desirable to measure and filter out noise so as to improve the quality of speech at the receiving terminal.