In the art of speech processing, it is necessary in some circumstances to separate a mixture of differently convolved and mixed signals, which typically emanate from multiple sources, without a priori knowledge of those signals. Such a separation of a composite signal into its constituent component signals is known as blind source separation (BSS), and various BSS techniques are known in the art. These techniques are useful for separating source signals that are simultaneously produced by independent sources (e.g., multiple speakers, sonar arrays, and the like) and that are combined in a convolutive medium. BSS techniques may be applied in such applications as speech detection using multiple microphones, crosstalk removal in multichannel communications, multipath channel identification and equalization, direction of arrival (DOA) estimation in sensor arrays, improvement of beamforming microphones for audio and passive sonar, and discovery of independent source signals in various biological signals, such as EEG, MEG and the like.
Most of the known BSS algorithms try to invert a multi-path acoustic environment by finding a multi-path finite impulse response (FIR) filter that approximately inverts the forward channel. However, a perfect inversion may require fairly long FIR filters, particularly in strongly echoic and reverberant rooms, where most, if not all, current algorithms fail. Additionally, forward channels that change due to moving sources, moving sensors, or changing environments require an algorithm that converges sufficiently quickly to maintain an accurate current inverse of the channel.
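The dependence of inverse-filter length on echo strength can be illustrated with a minimal sketch. It assumes a deliberately simplified single-channel forward channel consisting of a direct path plus one echo, h[n] = delta[n] + a*delta[n - d]; this toy channel, the function names, and the parameter values are illustrative assumptions and are not taken from any of the cited algorithms. The exact inverse of such a channel is infinitely long, and truncating it to K terms leaves a residual of magnitude a^K, so a stronger echo calls for a longer FIR filter.

```python
import numpy as np

def truncated_inverse(echo_gain, echo_delay, n_terms):
    """FIR approximation to the inverse of h[n] = delta[n] + echo_gain * delta[n - echo_delay].

    The exact inverse is the infinite series sum_k (-echo_gain)^k delta[n - k*echo_delay];
    only the first n_terms terms are kept here.
    """
    g = np.zeros((n_terms - 1) * echo_delay + 1)
    g[::echo_delay] = (-echo_gain) ** np.arange(n_terms)
    return g

def residual_error(echo_gain, echo_delay, n_terms):
    """Largest deviation of (channel * truncated inverse) from an ideal unit impulse."""
    h = np.zeros(echo_delay + 1)
    h[0], h[-1] = 1.0, echo_gain
    combined = np.convolve(h, truncated_inverse(echo_gain, echo_delay, n_terms))
    combined[0] -= 1.0  # subtract the ideal unit impulse
    return np.max(np.abs(combined))

def inverse_length_for_tolerance(echo_gain, echo_delay, tol):
    """Number of FIR taps needed so that the residual echo_gain**K is at most tol.

    Assumes 0 < echo_gain < 1 and 0 < tol < 1.
    """
    n_terms = int(np.ceil(np.log(tol) / np.log(echo_gain)))
    return (n_terms - 1) * echo_delay + 1

if __name__ == "__main__":
    # Hypothetical values: a 400-sample echo delay and a 1e-3 residual target.
    for gain in (0.5, 0.95):
        taps = inverse_length_for_tolerance(gain, echo_delay=400, tol=1e-3)
        print(f"echo gain {gain}: about {taps} taps")
```

In this toy setting the required length grows rapidly as the echo gain approaches one, which is consistent with the difficulty described above for strongly reverberant rooms. A real multi-path, multi-channel environment is considerably more complicated than this sketch.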
Two general types of BSS algorithms are known in the art for blind source separation of a convolutive mixture of broad-band signals: (1) algorithms that diagonalize a single estimate of the second order statistics, and (2) algorithms that identify statistically independent signals by considering higher order statistics. Algorithms of the first type generate decorrelated signals by diagonalizing second order statistics and have a simple structure that can be implemented efficiently. [See, e.g., E. Weinstein, M. Feder, and A. V. Oppenheim, “Multi-Channel Signal Separation by Decorrelation”, IEEE Trans. Speech Audio Processing, vol. 1, no. 4, pp. 405-413, April 1993; S. Van Gerven and D. Van Compernolle, “Signal Separation by Symmetric Adaptive Decorrelation: Stability, Convergence, and Uniqueness”, IEEE Trans. Signal Processing, vol. 43, no. 7, pp. 1602-1612, July 1995; K.-C. Yen and Y. Zhao, “Improvements on co-channel separation using ADF: Low complexity, fast convergence, and generalization”, in Proc. ICASSP 98, Seattle, Wash., 1998, pp. 1025-1028; M. Kawamoto, “A method of blind separation for convolved non-stationary signals”, Neurocomputing, vol. 22, no. 1-3, pp. 157-171, 1998; S. Van Gerven and D. Van Compernolle, “Signal separation in a symmetric adaptive noise canceler by output decorrelation”, in Proc. ICASSP 92, 1992, vol. IV, pp. 221-224.] However, they are not guaranteed to converge to the right solution, as single decorrelation is not a sufficient condition to obtain independent model sources. Instead, for stationary signals, higher order statistics have to be considered, either by direct measurement and optimization of higher order statistics [See, e.g., D. Yellin and E. Weinstein, “Multichannel Signal Separation: Methods and Analysis”, IEEE Trans. Signal Processing, vol. 44, no. 1, pp. 106-118, 1996; H.-L. N. Thi and C. Jutten, “Blind source separation for convolutive mixtures”, Signal Processing, vol. 45, no. 2, pp. 209-229, 1995; S. Shamsunder and G. Giannakis, “Multichannel Blind Signal Separation and Reconstruction”, IEEE Trans. Speech Audio Processing, vol. 5, no. 6, pp. 515-528, Nov. 1997], or indirectly by making assumptions on the shape of the cumulative distribution function (cdf) of the signals [See, e.g., R. Lambert and A. Bell, “Blind Separation of Multiple Speakers in a Multipath Environment”, in Proc. ICASSP 97, 1997, pp. 423-426; S. Amari, S. C. Douglas, A. Cichocki, and H. H. Yang, “Multichannel blind deconvolution using the natural gradient”, in Proc. 1st IEEE Workshop on Signal Processing App. Wireless Comm., 1997, pp. 101-104; T. Lee, A. Bell, and R. Lambert, “Blind separation of delayed and convolved sources”, in Proc. Neural Information Processing Systems 96, 1997]. The former methods are fairly complex and difficult to implement. The latter methods fail in cases where the assumptions on the cdf are not accurate.
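The insufficiency of single decorrelation can be illustrated with a minimal sketch. For simplicity it assumes an instantaneous (non-convolutive) two-channel mixture with a hypothetical mixing matrix `A`; the names and values are illustrative assumptions and do not reproduce any of the cited algorithms. The sketch diagonalizes a single covariance estimate and then applies an arbitrary rotation: both outputs are equally decorrelated, so second order statistics alone cannot indicate which, if either, corresponds to the separated sources.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 20000

# Two independent, non-Gaussian (uniform) sources and a hypothetical mixing matrix.
s = rng.uniform(-1.0, 1.0, size=(2, n_samples))
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
x = A @ s

def covariance(y):
    """Sample covariance matrix of the rows of y."""
    y = y - y.mean(axis=1, keepdims=True)
    return y @ y.T / y.shape[1]

def whiten(x):
    """Diagonalize a single covariance estimate so the outputs are decorrelated with unit variance."""
    x = x - x.mean(axis=1, keepdims=True)
    eigval, eigvec = np.linalg.eigh(covariance(x))
    W = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
    return W @ x

z = whiten(x)

# Any rotation of the decorrelated signals is decorrelated as well.
theta = np.pi / 5
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
z_rot = R @ z

if __name__ == "__main__":
    print("covariance after whitening:\n", np.round(covariance(z), 6))
    print("covariance after extra rotation:\n", np.round(covariance(z_rot), 6))
```

Both covariance matrices equal the identity to within numerical precision even though `z` and `z_rot` are different signals. Resolving this remaining ambiguity requires additional information, such as the higher order statistics discussed above.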
A limited body of on-line BSS algorithms is also known, generally for the case of single decorrelation and for indirect higher order methods, and having the same limitations as their off-line counterparts. [See, e.g., E. Weinstein, M. Feder, and A. V. Oppenheim, “Multi-Channel Signal Separation by Decorrelation”, IEEE Trans. Speech Audio Processing, vol. 1, no. 4, pp. 405-413, April 1993; S. Van Gerven and D. Van Compernolle, “Signal Separation by Symmetric Adaptive Decorrelation: Stability, Convergence, and Uniqueness”, IEEE Trans. Signal Processing, vol. 43, no. 7, pp. 1602-1612, July 1995; K.-C. Yen and Y. Zhao, “Improvements on co-channel separation using ADF: Low complexity, fast convergence, and generalization”, in Proc. ICASSP 98, Seattle, Wash., 1998, pp. 1025-1028; K.-C. Yen and Y. Zhao, “Adaptive Co-Channel Speech Separation and Recognition”, IEEE Trans. Speech Audio Processing, vol. 7, no. 2, March 1999; S. Amari, S. C. Douglas, A. Cichocki, and H. H. Yang, “Novel On-Line Adaptive Learning Algorithms for Blind Deconvolution Using the Natural Gradient Approach”, in Proc. 11th IFAC Symposium on System Identification, Kitakyushu City, Japan, July 1997, vol. 3, pp. 1057-1062; P. Smaragdis, “Blind separation of convolved mixtures in the frequency domain”, Neurocomputing, vol. 22, pp. 21-34, 1998.] Single decorrelation algorithms have also been described that purport to operate on non-stationary signals [See, e.g., M. Kawamoto, “A method of blind separation for convolved non-stationary signals”, Neurocomputing, vol. 22, no. 1-3, pp. 157-171, 1998; T. Ngo and N. Bhadkamkar, “Adaptive blind separation of audio sources by a physically compact device using second-order statistics”, in ICA'99, Loubaton Cardoso, Jutten, Ed., 1999, pp. 257-260; H. Sahlin and H. Broman, “Separation of real-world signals”, Signal Processing, vol. 64, pp. 103-104, 1998].
However, the art is not believed to provide an on-line BSS method which yields fast convergence for non-static filters. Fast convergence is an essential criterion, since the data may be visited only once.
Therefore, there is a need for a blind source separation technique that accurately and quickly performs convolutive signal decorrelation.