1. Field of the Invention
This invention relates to signal processing systems and, more particularly, to systems for reducing room reverberation and noise effects in audio systems such as those employed in "hands free telephony."
2. Description of the Prior Art
It is well known that room reverberation can significantly reduce the perceived quality of sounds transmitted by a monaural microphone to a monaural loudspeaker. This quality reduction is particularly disturbing in conference telephony where the nature of the room used is not generally well controlled and where, therefore, room reverberation is a factor.
Room reverberations have been heuristically separated into two categories: early echoes, which are perceived as spectral distortion (an effect known as "coloration"), and longer term reverberations, also known as late reflections or late echoes, which contribute time-domain noise-like perceptions to speech signals. An excellent discussion of room reverberation principles and of the methods used in the art to reduce the effects of such reverberation is presented in "Seeking the Ideal in 'Handsfree' Telephony," Berkley et al, Bell Labs Record, November 1974, page 318, et seq. Therein, the distinction between early echo distortion and late reflection distortion is discussed, together with some of the methods used for removing the different types of distortion. Some of the methods described in this article, and other methods which are pertinent to this disclosure, are organized and discussed below in accordance with the principles employed.
In U.S. Pat. No. 3,786,188, issued Jan. 15, 1974, I described a system for synthesizing speech from a reverberant signal. In that system, the vocal tract transfer function of the speaker is continuously approximated from the reverberant signal, developing thereby a reverberant excitation function. The reverberant excitation function is analyzed to determine certain of the speaker's parameters (such as whether the speaker's function is voiced or unvoiced), and a nonreverberant speech signal is synthesized from the derived parameters. This synthesis approach necessarily makes approximations in the derived parameters, and those approximations, coupled with the small number of parameters, cause some fidelity to be lost.
In "Signal Processing to Reduce Multipath Distortion in Small Rooms," The Journal of the Acoustical Society of America, Vol. 47, No. 6, (Part I), 1970, pages 1475 et seq, J. L. Flanagan et al describe a system for reducing early echo effects by combining the signals from two or more microphones to produce a single output signal. In accordance with the described system, the output signal of each microphone is filtered through a number of bandpass filters occupying contiguous frequency ranges, and the microphone receiving the greatest average power in a given frequency band is selected to contribute that signal band to the output. The term "contiguous bands" as used in the art and in the context of this disclosure refers to nonoverlapping bands. This method is effective only for reducing early echoes.
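The band-selection principle described above may be illustrated by the following minimal sketch. It is not the circuitry of the cited article: it assumes a single block of samples rather than a running average of power, uses discrete Fourier transform bins in place of analog bandpass filters, and the names `dft`, `idft`, and `select_bands` are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (adequate for short blocks)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse transform; returns the real part of each sample."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def select_bands(mic_signals, num_bands):
    """For each contiguous (nonoverlapping) band, keep the spectrum of the
    microphone with the greatest power in that band.
    Returns (output_signal, chosen_microphone_index_per_band)."""
    n = len(mic_signals[0])
    spectra = [dft(x) for x in mic_signals]
    half = n // 2 + 1  # bins 0 .. n/2 cover the non-negative frequencies
    edges = [b * half // num_bands for b in range(num_bands + 1)]
    out = [0j] * n
    choices = []
    for b in range(num_bands):
        lo, hi = edges[b], edges[b + 1]
        powers = [sum(abs(S[k]) ** 2 for k in range(lo, hi)) for S in spectra]
        best = max(range(len(spectra)), key=lambda m: powers[m])
        choices.append(best)
        for k in range(lo, hi):
            out[k] = spectra[best][k]
            if 0 < k < n - k:
                # Mirror into the negative-frequency bin so the output stays real.
                out[n - k] = spectra[best][k].conjugate()
    return idft(out), choices
```

Because each band is taken whole from one microphone, a spectral null caused by an early echo at one microphone can be filled from another microphone in which that band is stronger.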
In U.S. Pat. No. 3,794,766, issued Feb. 26, 1974, Cox et al describe a system employing a multiplicity of microphones. Signal improvement is realized by equalizing the signal delay in the paths of the various microphones, and the necessary delay for equalization is determined by time-domain correlation techniques. This system operates in the time domain and does not account for different delays in different frequency bands.
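Delay equalization by time-domain correlation may be sketched as follows. This is an illustration of the general technique only, not the apparatus of the cited patent: the first microphone is assumed to serve as the reference, delays are assumed to be whole numbers of samples, and the names `estimate_delay` and `align_and_average` are hypothetical.

```python
def estimate_delay(ref, sig, max_lag):
    """Return the lag, in samples, that maximizes the cross-correlation of
    sig against ref. A positive lag means sig arrives later than ref."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(ref[t] * sig[t + lag]
                    for t in range(len(ref)) if 0 <= t + lag < len(sig))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def align_and_average(mic_signals, max_lag):
    """Shift every microphone signal onto the time base of the first one,
    then average the aligned samples.
    Returns (output_signal, lag_per_microphone)."""
    ref = mic_signals[0]
    # The reference microphone is aligned with itself by definition.
    lags = [0] + [estimate_delay(ref, s, max_lag) for s in mic_signals[1:]]
    out = []
    for t in range(len(ref)):
        vals = [s[t + lag] for s, lag in zip(mic_signals, lags)
                if 0 <= t + lag < len(s)]
        out.append(sum(vals) / len(vals))
    return out, lags
```

A single lag is estimated per microphone for the whole signal, which reflects the limitation noted above: a delay that differs from one frequency band to another cannot be represented.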
In U.S. Pat. No. 3,662,108, issued on May 9, 1972, to J. L. Flanagan, a system employing cepstrum analyzers responsive to a plurality of microphones is described. By summing the output signals of the analyzers, the portions of the cepstrum signals representing the undistorted acoustic signal cohere, while the portions of the cepstrum signals representing the multipath distorted transmitted signals do not. Selective clipping of the summed cepstrum signals eliminates the distortion components, and inverse transformation of the summed and clipped cepstrum signals yields a replica of the original nonreverberant acoustic signal. In this system, again, only early echoes are corrected.
Lastly, in U.S. Pat. No. 3,440,350, issued Apr. 22, 1969, J. L. Flanagan describes a system for reducing the reverberation impairment of signals by employing a plurality of microphones, with each microphone being connected to a phase vocoder. The phase vocoder of each microphone develops a pair of narrow band signals in each of a plurality of contiguous narrow analyzing bands, with one signal representing the magnitude of the short-time Fourier transform, and the other signal representing the phase angle derivative of the short-time Fourier transform. The plurality of phase vocoder signals are averaged to develop composite amplitude and phase signals, and the composite control signals of the plurality of phase vocoders are utilized to synthesize a replica of the nonreverberant acoustic signal. Again, in this system only early echoes are corrected.
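The averaging of magnitude and phase-derivative signals may be sketched for a single analyzing band as follows. This is a discrete-time simplification under stated assumptions and not the circuit of the cited patent: each microphone is assumed to supply a sequence of complex short-time Fourier transform values for the band, the phase derivative is approximated by the phase advance between successive values, and the names `magnitude_and_phase_derivative`, `average_band`, and `synthesize_band` are hypothetical.

```python
import cmath
import math

def magnitude_and_phase_derivative(frames):
    """Split one band's complex short-time transform values into a magnitude
    track and a frame-to-frame phase advance (a discrete phase derivative)."""
    mags = [abs(c) for c in frames]
    # The phase of cur * conj(prev) is the phase advance, wrapped to (-pi, pi].
    dphi = [cmath.phase(cur * prev.conjugate())
            for prev, cur in zip(frames, frames[1:])]
    return mags, dphi

def average_band(mic_frames):
    """Average the magnitude and phase-derivative tracks over all microphones."""
    analyses = [magnitude_and_phase_derivative(f) for f in mic_frames]
    m = len(analyses)
    n = len(mic_frames[0])
    avg_mag = [sum(a[0][i] for a in analyses) / m for i in range(n)]
    avg_dphi = [sum(a[1][i] for a in analyses) / m for i in range(n - 1)]
    return avg_mag, avg_dphi

def synthesize_band(avg_mag, avg_dphi):
    """Rebuild the band signal by accumulating the averaged phase advance."""
    phase = 0.0
    out = [avg_mag[0] * math.cos(phase)]
    for mag, d in zip(avg_mag[1:], avg_dphi):
        phase += d
        out.append(mag * math.cos(phase))
    return out
```

Working with the phase derivative rather than the phase itself makes the average insensitive to a constant phase offset between microphones, such as that produced by different path lengths.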
In all of the techniques described above, the treatment of early echoes and late echoes is separate, with the bulk of the systems attempting to remove mostly the early echoes. What is needed, then, is a simple approach for removing both early and late echoes.