This application relates to signal processing and systems and methods for separation of source signals using a blind signal separation process.
In recent years, new technologies have brought to light problems with non-linearity, uncertainty, noise and cross-channel mixing, compounded by the very limited knowledge available about the data production mechanisms. The problem of recovering original source signals from observed signals without knowledge of the mixing process, so-called blind source separation (BSS), has attracted attention in the field. These signal sources may be, for example, acoustic sources, spectral sources, image sources, data sources, or physiological or medical sources. Part of the allure of BSS is that it has many practical uses, including, but not limited to, communication such as speech enhancement for robust speech recognition, multimedia such as crosstalk separation in telecommunication, use in high-quality hearing aid equipment, analysis of biological/physiological signals such as electrocardiograph (EKG), magnetic resonance (MRI/MRS), electroencephalographs (EEG) and magnetoencephalographs (MEG), data/sensor fusion, and the like. A fundamental requirement for conventional BSS applications is that the source signals be statistically independent. BSS also requires multiple sensors, transducers, or microphones to capture the signals. In many cases, an additional sensor is required for each independent source. For example, a BSS speech separation process for separating two independent signal sources will require at least two microphones.
One form of BSS is independent component analysis (ICA). ICA is a conventional method used to separate statistically independent sources from mixtures of sources by utilizing higher-order statistics. The application of ICA to independent signal sources is well known, and has been documented, for example, in T.-W. Lee, Independent Component Analysis: Theory and Applications. Boston: Kluwer Academic Publishers, 1998. In its simplest form, the ICA model assumes linear, instantaneous mixing without sensor noise, and assumes that the number of sources is equal to the number of sensors. However, when trying to separate acoustic source signals mixed in a real environment, those assumptions may not hold, and model extensions are needed. As a result, the application of standard ICA to real-world signal environments is prone to errors, and may require substantial post-processing to adequately separate signals.
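The simplest ICA model described above (two sources, two sensors, linear instantaneous mixing, no sensor noise) can be illustrated with a minimal sketch. It is not the method of any cited reference: it whitens the mixtures and then searches for the rotation that maximizes the absolute excess kurtosis, one common higher-order statistic. The mixing coefficients, the uniform source distribution, and all function names are assumptions chosen for illustration.

```python
import math
import random

def whiten(x1, x2):
    """Center two mixture channels and transform them to identity covariance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    a = sum(v * v for v in c1) / n               # var(x1)
    d = sum(v * v for v in c2) / n               # var(x2)
    b = sum(u * v for u, v in zip(c1, c2)) / n   # cov(x1, x2)
    # Closed-form eigen-rotation of the symmetric 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    lam1 = a * c * c + 2 * b * c * s + d * s * s
    lam2 = a * s * s - 2 * b * c * s + d * c * c
    z1 = [(c * u + s * v) / math.sqrt(lam1) for u, v in zip(c1, c2)]
    z2 = [(-s * u + c * v) / math.sqrt(lam2) for u, v in zip(c1, c2)]
    return z1, z2

def kurtosis(y):
    """Excess kurtosis of a zero-mean, unit-variance sequence."""
    return sum(v ** 4 for v in y) / len(y) - 3.0

def rotate(z1, z2, phi):
    c, s = math.cos(phi), math.sin(phi)
    y1 = [c * u + s * v for u, v in zip(z1, z2)]
    y2 = [-s * u + c * v for u, v in zip(z1, z2)]
    return y1, y2

def separate_two_sources(x1, x2, steps=180):
    """Grid-search the rotation of the whitened data that maximizes |kurtosis|."""
    z1, z2 = whiten(x1, x2)
    best_phi = max(
        (k * (math.pi / 2) / steps for k in range(steps)),
        key=lambda phi: sum(abs(kurtosis(y)) for y in rotate(z1, z2, phi)),
    )
    return rotate(z1, z2, best_phi)

def correlation(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

rng = random.Random(0)
# Two independent, non-Gaussian (uniform) sources; illustrative only.
s1 = [rng.uniform(-1, 1) for _ in range(2000)]
s2 = [rng.uniform(-1, 1) for _ in range(2000)]
# Hypothetical instantaneous mixing: x = A s with A = [[1.0, 0.6], [0.4, 1.0]].
x1 = [1.0 * a + 0.6 * b for a, b in zip(s1, s2)]
x2 = [0.4 * a + 1.0 * b for a, b in zip(s1, s2)]
y1, y2 = separate_two_sources(x1, x2)
```

Each output should match one source only up to ordering, sign, and scale, which is why any comparison against the true sources has to use absolute correlations and try both orderings. That inherent ambiguity is harmless in the instantaneous case but becomes significant once separation is performed independently in many frequency bins.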
ICA may be applied to separate signal sources in a broad range of areas spanning signal processing, neural networks, machine learning, data/sensor fusion and communication. One typical application is separating a person's speech from a noise source. In such a real-world environment, the observed acoustic signals are not instantaneous mixtures of the sources, but convolutive mixtures, which means that the sources are mixed with time delays and convolutions. Accordingly, the conventional ICA assumptions are not satisfied, and the resulting signal separation may be unsatisfactory. In order to deal with such convolved mixtures, the ICA model formulation and the learning algorithm have been extended to convolutive mixtures in both the time and the frequency domains. These extensions have been discussed, for example, in T.-W. Lee, A. J. Bell, and R. Lambert, Blind separation of convolved and delayed sources, Adv. Neural Information Processing Systems, 1997, pp. 758-764. Those models are known as solutions to the multichannel blind deconvolution problem. In the case of the time domain approach, solutions usually require intensive computations with long de-reverberation filters, and the resulting unmixed source signals are whitened due to the i.i.d. assumption. Slow convergence, especially for colored input signals such as speech signals, has also been observed, and the time domain approach therefore may not prove effective or practical in real acoustic environments. The computational load and slow convergence can be overcome by the frequency domain approach, in which multiplication at each frequency bin replaces the convolution operation in the time domain. Thus, the ICA algorithm may be applied to instantaneous mixtures in each frequency bin.
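The statement that multiplication at each frequency bin replaces convolution in the time domain can be checked numerically with a small sketch. The filter taps and source samples below are made-up values, and circular convolution is used so that the identity is exact; with short-time frames of real signals the per-bin model holds only approximately.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (O(n^2), adequate for a tiny example)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def circular_convolve(h, s):
    """Circular convolution of a short filter h with a signal s."""
    n = len(s)
    return [sum(h[m] * s[(t - m) % n] for m in range(len(h))) for t in range(n)]

n = 8
# Two hypothetical source signals of length n.
s = [[1.0, -0.5, 0.25, 2.0, 0.0, -1.0, 0.5, 1.5],
     [0.3, 0.7, -1.2, 0.0, 0.9, -0.4, 1.1, -0.6]]
# Hypothetical mixing filters: h[i][j] maps source j to sensor i.
h = [[[1.0, 0.5, 0.0], [0.0, 0.6, 0.3]],
     [[0.0, 0.4, 0.2], [1.0, 0.0, 0.25]]]

# Time domain: each sensor observes a sum of filtered sources.
x_time = []
for i in range(2):
    parts = [circular_convolve(h[i][j], s[j]) for j in range(2)]
    x_time.append([a + b for a, b in zip(parts[0], parts[1])])

# Frequency domain: at each bin k the mixture is instantaneous,
# X(k) = H(k) S(k), with H(k) a 2x2 complex matrix.
S = [dft(s[j]) for j in range(2)]
H = [[dft(h[i][j] + [0.0] * (n - len(h[i][j]))) for j in range(2)]
     for i in range(2)]
X = [[H[i][0][k] * S[0][k] + H[i][1][k] * S[1][k] for k in range(n)]
     for i in range(2)]
x_freq = [[v.real for v in idft(X[i])] for i in range(2)]

# Largest disagreement between the two routes; should be at rounding level.
max_diff = max(abs(a - b) for i in range(2) for a, b in zip(x_time[i], x_freq[i]))
```

Because each bin has its own 2x2 matrix, an ICA process run per bin estimates a separate unmixing matrix for every value of `k`, each with its own ordering ambiguity.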
Although this may be attractive from a computational standpoint, this process can suffer from a permutation problem and other technical difficulties. Permutation results from a failure of the ICA process to assign each source consistently to the same output across the set of frequency bins. That is, a given output at any bin may hold a frequency component from any one of the signal sources. Accordingly, when the bins are used to generate a resulting time domain signal, the resulting signal may contain certain frequency components from an incorrect source. Hence, a significant problem is the permutation of the ICA solutions over different frequency bins due to the permutation indeterminacy inherent in the ICA algorithm. To address this, the process would need to correct the permutations of the separating matrices at each frequency so that the separated signal in the time domain is reconstructed properly. Several solutions have been proposed to solve this permutation problem, but none has proven satisfactory in practical applications.
Various approaches have been proposed to solve the permutation problem. One known approach is to impose a smoothness constraint on the source, which translates into smoothing the separating filter. This approach has been realized by several techniques, such as averaging separating matrices with adjacent frequencies (see P. Smaragdis, Blind separation of convolved mixtures in the frequency domain, Neurocomputing, vol. 22), limiting the filter length in the time domain (see L. Parra and C. Spence, Convolutive blind separation of non-stationary sources, vol. 8, no. 3, pp. 320-327, 2000), or considering the coherency of separating matrices at adjacent frequencies (see F. Asano, S. Ikeda, M. Ogawa, H. Asoh, and N. Kitawaki, A combined approach of array processing and independent component analysis for blind separation of acoustic signals, in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 2001, pp. 2729-2732).
Another known approach is based on direction of arrival (DOA) estimation, which is widely used in array signal processing. By analyzing the directivity patterns formed by a separating matrix, source directions can be estimated and therefore permutations can be aligned. Such a process is more fully described in S. Kurita, H. Saruwatari, S. Kajita, K. Takeda, and F. Itakura, Evaluation of blind signal separation method using directivity pattern under reverberant conditions, in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 2000, pp. 3140-3143. When the sources are colored signals, it is possible to employ the inter-frequency correlations of signal envelopes to align permutations, as described, for example, in J. Anemuller and B. Kollmeier, Amplitude modulation decorrelation for convolutive blind source separation, in Proc. Int. Conf. on Independent Component Analysis and Blind Source Separation, 2000, pp. 215-220. These methods may perform well under certain specific conditions but may have degraded performance under different conditions. Moreover, in the case of an ill-posed problem, e.g., where the mixing filters of the sources are similar, the sources are located close to each other, or the DOAs of the sources are similar, the various methods developed so far fail to separate the source signals.
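The idea of using inter-frequency correlations of signal envelopes to align permutations can be sketched on synthetic data. The following is a simplified illustration, not the algorithm of the cited reference: the on/off envelopes, per-bin gains, noise level, and the running-reference strategy are all assumptions. Each bin holds two output envelopes whose order has been randomly swapped, and the alignment keeps or swaps each bin according to which choice correlates better with the envelopes accumulated so far.

```python
import math
import random

def correlation(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def align_permutations(env):
    """env[f][i][t] is the envelope of output i at bin f and frame t.

    Returns one flag per bin saying whether its two outputs should be swapped
    to agree with bin 0. The global ordering remains arbitrary.
    """
    ref = [list(env[0][0]), list(env[0][1])]  # running sum of aligned envelopes
    swaps = [False]
    for f in range(1, len(env)):
        keep = correlation(env[f][0], ref[0]) + correlation(env[f][1], ref[1])
        swap = correlation(env[f][1], ref[0]) + correlation(env[f][0], ref[1])
        do_swap = swap > keep
        swaps.append(do_swap)
        a, b = (env[f][1], env[f][0]) if do_swap else (env[f][0], env[f][1])
        ref[0] = [r + v for r, v in zip(ref[0], a)]
        ref[1] = [r + v for r, v in zip(ref[1], b)]
    return swaps

rng = random.Random(1)
frames, bins = 60, 16
# Hypothetical source envelopes with different on/off rhythms.
e = [[1.0 if (t // 5) % 2 == 0 else 0.1 for t in range(frames)],
     [1.0 if (t // 7) % 2 == 0 else 0.1 for t in range(frames)]]

# Simulate per-bin separation whose output order is random in each bin.
true_swapped = [rng.random() < 0.5 for _ in range(bins)]
env = []
for f in range(bins):
    per_source = []
    for i in range(2):
        gain = rng.uniform(0.5, 2.0)  # arbitrary per-bin scaling
        per_source.append([gain * v + rng.uniform(0.0, 0.05) for v in e[i]])
    env.append(per_source[::-1] if true_swapped[f] else per_source)

swaps = align_permutations(env)
# After alignment, the source index sitting in output 0 of each bin.
source_in_output0 = [int(a != b) for a, b in zip(true_swapped, swaps)]
```

On data like this, where the two envelopes are nearly uncorrelated, `source_in_output0` should hold the same value for every bin. The sketch also hints at why such methods degrade: if the envelopes of the sources are similar, the keep and swap scores become nearly equal and the decision is unreliable.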
Thus, there is a need for robust and versatile techniques to separate observed signals into their various desired components.