1. Field of the Invention
The present invention generally relates to computer-implemented systems for processing data that includes mixed signals from multiple sources, and particularly to systems for adapting parameters to the data, classifying the data, and separating sources from the data.
2. Description of Related Art
Recently, blind source separation by ICA (Independent Component Analysis) has received attention because of its potential signal processing applications, such as speech enhancement, image processing, telecommunications, and medical signal processing, among others. ICA is a technique for finding a linear non-orthogonal coordinate system in multivariate data. The directions of the axes of the coordinate system are determined by the data's second- and higher-order statistics. The separation is "blind" because the source signals are observed only as unknown linear mixtures of signals from multiple sensors, and the characteristic parameters of the source signals are unknown except that the sources are assumed to be independent. In other words, both the source signals and the way the signals are mixed are unknown. The goal of ICA is to learn the parameters and recover the independent sources (i.e., separate the independent sources) given only the unknown linear mixtures of the independent source signals as observed by the sensors. In contrast to correlation-based transformations such as principal component analysis (PCA), the ICA technique adapts a matrix to linearly transform the data and reduce the statistical dependencies of the source signals, attempting to make the source signals as independent as possible. ICA has proven a useful tool for finding structure in data, and has been successfully applied to processing real world data, including separating mixed speech signals and removing artifacts from EEG recordings.
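By way of illustration only, the linear mixing model underlying ICA can be sketched as follows. The sketch assumes two sources and two sensors, and the mixing matrix `A`, the helper names, and the sample count are hypothetical. It shows only that the sources are recovered by a linear transform W = A^-1 whose axes need not be orthogonal. An actual ICA algorithm must estimate W from the mixtures alone, which this sketch does not attempt.

```python
import random

def matvec(m, v):
    """Multiply a 2x2 matrix (list of rows) by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def inverse_2x2(m):
    """Invert a 2x2 matrix; raises if the matrix is singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0.0:
        raise ValueError("mixing matrix must be invertible")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

rng = random.Random(0)

# Two independent sources: one uniform, one Laplacian (assumed for illustration).
sources = [[rng.uniform(-1.0, 1.0),
            rng.expovariate(1.0) * rng.choice((-1.0, 1.0))]
           for _ in range(1000)]

# Hypothetical mixing matrix; its columns are not orthogonal.
A = [[1.0, 0.6],
     [0.4, 1.0]]

# Each sensor observes a linear mixture x = A s.
mixtures = [matvec(A, s) for s in sources]

# With A known, the unmixing matrix is simply its inverse.
# ICA would have to learn W from `mixtures` without access to A.
W = inverse_2x2(A)
recovered = [matvec(W, x) for x in mixtures]

max_error = max(abs(r[i] - s[i])
                for r, s in zip(recovered, sources)
                for i in range(2))
```

Because a blind method never sees `A`, it can in general recover the sources only up to an arbitrary ordering and scaling.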
U.S. Pat. No. 5,706,402, entitled "Blind Signal Processing System Employing Information Maximization to Recover Unknown Signals Through Unsupervised Minimization of Output Redundancy", issued to Bell on Jan. 6, 1998, discloses an unsupervised learning algorithm based on entropy maximization in a single-layer feedforward neural network. In the ICA algorithm disclosed by Bell, an unsupervised learning procedure is used to solve the blind signal processing problem by maximizing joint output entropy through gradient ascent to minimize mutual information in the outputs. In that learning process, a plurality of scaling weights and bias weights are repeatedly adjusted to generate scaling and bias terms that are used to separate the sources. The algorithm disclosed by Bell separates sources that have supergaussian distributions, which can be described as sharply peaked probability density functions (pdfs) with heavy tails. Bell does not disclose how to separate sources that have negative kurtosis (e.g., uniform distribution).
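The distinction between supergaussian sources and sources with negative kurtosis can be illustrated with a short sketch. The sketch is not taken from Bell; the function name, sample sizes, and random seed are assumptions made for illustration. It estimates the excess kurtosis (fourth standardized moment minus 3) of a sample, which is negative for a uniform distribution and positive for a sharply peaked, heavy-tailed distribution such as the Laplacian.

```python
import random

def excess_kurtosis(samples):
    """Sample excess kurtosis: m4 / m2**2 - 3 (zero for a Gaussian)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 * m2) - 3.0

rng = random.Random(0)

# Uniform source: flat pdf, light tails (theoretical excess kurtosis -1.2).
uniform = [rng.uniform(-1.0, 1.0) for _ in range(20000)]

# Laplacian source: peaked pdf, heavy tails (theoretical excess kurtosis +3).
laplacian = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0))
             for _ in range(20000)]

k_uniform = excess_kurtosis(uniform)
k_laplacian = excess_kurtosis(laplacian)

# The sign of the estimate tells the two kinds of source apart.
is_sub_gaussian = k_uniform < 0.0
is_super_gaussian = k_laplacian > 0.0
```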
In many real world situations the ICA algorithm cannot be effectively used because the sources are required to be independent (e.g., stationary), which means that the mixture parameters must be identical throughout the entire data set. If the sources become non-stationary at some point then the mixture parameters change, and the ICA algorithm will not operate properly. For example, in the classic cocktail party example where there are several voice sources, ICA will not operate if one of the sources has moved at some time during data collection because the source's movement changes the mixing parameters. In summary, the ICA requirement that the sources be stationary greatly limits the usefulness of the ICA algorithm to find structure in data.
A mixture model is implemented in which the observed data is categorized into two or more mutually exclusive classes, each class being modeled with a mixture of independent components. The multiple class model allows the sources to become non-stationary. A computer-implemented method and apparatus are disclosed that adapt multiple class parameters in an adaptation algorithm for a plurality of classes whose parameters (i.e., characteristics) are initially unknown. In the adaptation algorithm, an iterative process is used to define multiple classes for a data set, each class k having a set of mixing parameters including a mixing matrix A_k and a bias vector b_k. After the adaptation algorithm has completed operations, the class parameters and the class probabilities for each data vector are known, and data is then assigned to one of the learned mutually exclusive classes. The sources can now be separated using the source vectors calculated during the adaptation algorithm. Advantageously, the sources are not required to be stationary throughout the data set, and therefore the system can classify data in a dynamic environment where the mixing parameters change without notice and in an unknown manner. The system can be used in a wide variety of applications such as speech processing, image processing, medical data processing, satellite data processing, antenna array reception, and information retrieval systems. Furthermore, the adaptation algorithm described herein is implemented in one embodiment using an extended infomax ICA algorithm, which provides a way to separate sources that have a non-Gaussian (e.g., platykurtic or leptokurtic) structure.
A computer-implemented method is described that adapts class parameters for a plurality of classes and classifies a plurality of data vectors having N elements that represent a linear mixture of source signals into said classes. The method includes receiving a plurality of data vectors from data index t=1 to t=T, and initializing parameters for each class, including the number of classes, the probability that a random data vector will be in class k, the mixing matrix for each class, and the bias vector for each class. In a main adaptation loop, for each data vector from data index t=1 to t=T, steps are performed to adapt the class parameters, including the mixing matrices and bias vectors for each class. The main adaptation loop is repeated for a plurality of iterations while a learning rate is observed at each subsequent iteration, and after convergence of said learning rate is observed, each data vector is assigned to one of said classes. The source vectors, which are calculated for each data vector and each class, can then be used to separate source signals in each of said classes. In one embodiment, the mixing matrices are adapted using an extended infomax ICA algorithm, so that both sub-Gaussian and super-Gaussian sources can be separated.
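The overall shape of the main adaptation loop can be sketched as follows, strictly as a simplified illustration and not as the disclosed implementation. Several assumptions are made and are not taken from the text above: the mixing matrix of every class is held fixed at the identity (the extended infomax update of the mixing matrices is omitted), the class probabilities p(C_k) are held fixed at their initial uniform values, the sources are given a Laplacian density, each bias vector is re-estimated as the posterior-weighted mean of the data, and the convergence check is a threshold on the change in total log-likelihood. All function names and data are hypothetical.

```python
import math
import random

def posteriors(priors, biases, x):
    """Class probabilities p(C_k | x) and log p(x), assuming A_k = I
    and unit Laplacian sources, so that s_k = x - b_k."""
    logs = [math.log(p) - sum(abs(xi - bi) for xi, bi in zip(x, b))
            - len(x) * math.log(2.0)
            for p, b in zip(priors, biases)]
    top = max(logs)                       # subtract max for numerical stability
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights], top + math.log(total)

def adapt(data, biases, max_iters=50, tol=1e-6):
    """Repeat the main loop over t = 1..T until the log-likelihood settles."""
    k, n = len(biases), len(data[0])
    priors = [1.0 / k] * k                # initial class probabilities (kept fixed)
    previous = -math.inf
    for _ in range(max_iters):
        weight_sum = [0.0] * k
        weighted_x = [[0.0] * n for _ in range(k)]
        total_ll = 0.0
        for x in data:                    # main adaptation loop over data vectors
            post, ll = posteriors(priors, biases, x)
            total_ll += ll
            for c in range(k):
                weight_sum[c] += post[c]
                for i in range(n):
                    weighted_x[c][i] += post[c] * x[i]
        # Bias update (assumed rule): posterior-weighted mean of the data.
        biases = [[wx / max(weight_sum[c], 1e-12) for wx in weighted_x[c]]
                  for c in range(k)]
        if abs(total_ll - previous) < tol:
            break
        previous = total_ll
    # After convergence, assign each data vector to its most probable class.
    labels = []
    for x in data:
        post, _ = posteriors(priors, biases, x)
        labels.append(max(range(k), key=post.__getitem__))
    return biases, labels

rng = random.Random(1)

def laplace():
    return rng.expovariate(1.0) * rng.choice((-1.0, 1.0))

# Hypothetical data: two classes whose bias vectors differ.
data = ([[laplace(), laplace()] for _ in range(200)]
        + [[6.0 + laplace(), 6.0 + laplace()] for _ in range(200)])

biases, labels = adapt(data, biases=[[1.0, 1.0], [5.0, 5.0]])
```

In a fuller embodiment, the inner loop would also update each mixing matrix, weighting each update by the class probability of the current data vector.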
A method is also described in which a plurality of data vectors are classified using previously adapted class parameters. The class probability for each class is calculated and each data vector is assigned to one of the previously adapted classes. This classification algorithm can be used, for example, to compress images or to search an image for a particular structure or particular types of structure.
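Classification with previously adapted parameters can be sketched as follows, under stated assumptions. The sketch assumes N=2, a unit Laplacian density for each source, and hypothetical parameter values. None of these are specified in the text above. For each class k it computes the source vector s_k = A_k^-1 (x - b_k), evaluates log p(x | C_k) = log p(s_k) - log |det A_k|, applies Bayes' rule to obtain the class probability, and assigns the data vector to the most probable class.

```python
import math

def source_vector(A, b, x):
    """Return (s, det A) with s = A^-1 (x - b) for a 2x2 mixing matrix A."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    d0, d1 = x[0] - b[0], x[1] - b[1]
    s0 = (A[1][1] * d0 - A[0][1] * d1) / det
    s1 = (-A[1][0] * d0 + A[0][0] * d1) / det
    return [s0, s1], det

def log_likelihood(A, b, x):
    """log p(x | class), assuming independent unit Laplacian sources."""
    s, det = source_vector(A, b, x)
    log_p_s = -sum(abs(v) for v in s) - len(s) * math.log(2.0)
    return log_p_s - math.log(abs(det))

def class_probabilities(classes, x):
    """Bayes' rule over classes; each class is a dict with 'prior', 'A', 'b'."""
    logs = [math.log(c["prior"]) + log_likelihood(c["A"], c["b"], x)
            for c in classes]
    top = max(logs)                       # subtract max for numerical stability
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]

def classify(classes, x):
    """Assign x to the class with the highest class probability."""
    probs = class_probabilities(classes, x)
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical previously adapted parameters for two classes.
classes = [
    {"prior": 0.5, "A": [[1.0, 0.5], [0.0, 1.0]], "b": [0.0, 0.0]},
    {"prior": 0.5, "A": [[1.0, 0.0], [-0.5, 1.0]], "b": [5.0, 5.0]},
]

labels = [classify(classes, x) for x in ([0.2, -0.1], [4.8, 5.3])]
```

No parameters are changed during classification, so each data vector can be processed independently of the others.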
The method can be used in a variety of signal processing applications to find structure in data, such as image processing, speech recognition, and medical data processing. Other uses include image compression, speech compression, and classification of images, speech, and sound.