The present invention relates generally to an analysis/synthesis-based microphone array speech enhancer with variable signal distortion.
This invention addresses the problem of enhancing speech that has been corrupted by several interference signals and/or additive background noise. Speech enhancement here means the suppression of additive background noise and/or interference, which arise in many applications including hands-free mobile telephony, aircraft cockpit communications, and computer speech-to-text devices.
The speech enhancement problem considered has five distinguishing features. First, the speech enhancement algorithm should be robust to a wide range of interference and noise scenarios. This goal is motivated by the success of the human auditory system in suppressing interference and noise in many adverse environments. Second, a priori knowledge of the interference and noise environment is not assumed. In particular, no statistical model for the noise is assumed, unlike in many speech enhancement techniques. Third, very noisy scenarios are of particular interest, because they offer the greatest potential for improvement in speech quality from the use of speech enhancement algorithms. Fourth, some degradation of the desired signal is permitted in exchange for additional interference and noise suppression, since the human auditory system can withstand some degradation of the desired signal. The amount of signal degradation that is tolerated depends on the signal-to-noise ratio at the array inputs; more signal degradation is tolerated in very noisy scenarios. Fifth, it is assumed that outputs from K microphones are available for processing, where K is small. Only small numbers of microphones are considered for two reasons. The first reason is that, for many applications, either there is not space for a large array or the cost of a large number of microphones and the necessary processing hardware cannot be justified. The second reason is that the human auditory system uses only two ears, yet it performs well in a wide range of adverse environments. K=2 is considered for most of this work. While it is not a goal to design an array processing structure that is an accurate physiological or psychoacoustical model of auditory processing, the success of the human auditory system nevertheless motivates the consideration of binaural processing for speech enhancement.
The following publications are of interest.
[1b] J. B. Allen, D. A. Berkley, and J. Blauert, "Multimicrophone signal-processing technique to remove room reverberation from speech signals," Journal of the Acoustical Society of America, vol. 62, pp. 912-915, October 1977.
[2b] P. J. Bloom and G. D. Cain, "Evaluation of two-input speech dereverberation techniques," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, (Paris, France), pp. 164-167, May 1982.
[3b] S. F. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 27, pp. 113-120, April 1979. Reprinted in Speech Enhancement, J. S. Lim, ed., Englewood Cliffs, N.J.: Prentice-Hall, 1983.
[4b] R. A. Mucci, "A comparison of efficient beamforming algorithms," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, pp. 548-558, June 1984.
[5b] S. S. Narayan, A. M. Peterson, and M. J. Narasimha, "Transform domain LMS algorithm," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 31, pp. 609-615, June 1983.
[6b] Y. Kaneda and J. Ohga, "Adaptive microphone-array system for noise reduction," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 34, pp. 1391-1400, December 1986.
[7b] B. Van Veen, "Minimum variance beamforming with soft response constraints," IEEE Transactions on Signal Processing, vol. 39, pp. 1964-1972, September 1991.
[8b] O. L. Frost, III, "An algorithm for linearly constrained adaptive array processing," Proceedings of the IEEE, vol. 60, pp. 926-935, August 1972.