1. Technical Field
This application relates generally to speech and pattern recognition and, more specifically, to multi-category (or class) classification of an observed multi-dimensional predictor feature, for use in pattern recognition systems.
2. Description of Related Art
In one conventional method for pattern classification and classifier design, each class is modeled as a Gaussian, or a mixture of Gaussians, and the associated parameters are estimated from training data. As is understood, each class may represent different data depending on the application. For instance, in speech recognition, the classes may represent different phonemes or triphones. Further, in handwriting recognition, each class may represent a different handwriting stroke. Due to computational issues, the Gaussian models are assumed to have a diagonal covariance matrix. When classification is desired, a new observation is applied to the models within each category, and the category whose model generates the largest likelihood is selected.
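This conventional scheme can be sketched as follows. The sketch below is purely illustrative (NumPy is used for convenience, and the function names are hypothetical); it assumes the per-class means and diagonal variances have already been estimated from training data:

```python
import numpy as np

def log_likelihood_diag(x, mean, var):
    """Log-likelihood of observation x under a Gaussian with diagonal
    covariance (var holds the diagonal entries)."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def classify(x, means, variances):
    """Score x under each class's Gaussian model and select the class
    whose model generates the largest likelihood."""
    scores = [log_likelihood_diag(x, m, v)
              for m, v in zip(means, variances)]
    return int(np.argmax(scores))
```

For example, with two classes centered at 0 and 5 respectively, an observation near 5 would be assigned to the second class.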
In another conventional design, the performance of a classifier that is designed using Gaussian models is enhanced by applying a linear transformation to the input data and, possibly, by simultaneously reducing the feature dimension. More specifically, conventional methods such as Principal Component Analysis and Linear Discriminant Analysis may be employed to obtain the linear transformation of the input data. Recent improvements to the linear transform techniques include Heteroscedastic Discriminant Analysis and Maximum Likelihood Linear Transforms (see, e.g., Kumar et al., "Heteroscedastic Discriminant Analysis and Reduced Rank HMMs For Improved Speech Recognition," Speech Communication, 26:283-297, 1998).
More specifically, FIG. 1a depicts one method for applying a linear transform to an observed event x. With this method, a precomputed n×n linear transformation, θ^T, is multiplied by an observed event x (an n×1 feature vector) to yield an n×1-dimensional vector, y. The vector y is modeled as a Gaussian vector with a mean μj and variance Σj for each different class; that is, the same y is modeled under every class, but each class is assigned its own mean and variance. The variances for each class are assumed to be diagonal covariance matrices.
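The single-transform method of FIG. 1a might be sketched as follows (an illustrative NumPy sketch, not the actual implementation; the transform θ and the per-class parameters are assumed precomputed):

```python
import numpy as np

def classify_transformed(x, theta, means, variances):
    """Apply the shared n-by-n transform theta to x, then score the
    transformed vector y under each class's diagonal Gaussian
    (mean mu_j, variance sigma_j) and pick the best class."""
    y = theta.T @ x  # y = theta^T x, still an n-dimensional vector
    scores = [
        -0.5 * np.sum(np.log(2 * np.pi * v) + (y - m) ** 2 / v)
        for m, v in zip(means, variances)
    ]
    return int(np.argmax(scores))
```

Note that the same transformed vector y is scored under every class; only the mean and variance differ per class.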
In another conventional method, depicted in FIG. 1b, instead of a single linear transformation θ^T (as in FIG. 1a), a plurality of linear transformation matrices θ1^T, θ2^T are implemented, as long as the determinant of each transformation is constrained to be "1" (unity). One transformation is then applied to one set of classes, and another to another set of classes. With this method, each class may have its own linear transformation θ, or two or more classes may share the same linear transformation θ.
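The multiple-transform variant of FIG. 1b can be sketched in the same illustrative style (again a hypothetical NumPy sketch; each transform is assumed to be unit-determinant so that likelihoods remain comparable across transforms):

```python
import numpy as np

def classify_multi_transform(x, thetas, transform_of_class, means, variances):
    """Score x for each class j through that class's assigned transform
    thetas[transform_of_class[j]]; two or more classes may share the
    same transform. Returns the highest-scoring class."""
    scores = []
    for j, (m, v) in enumerate(zip(means, variances)):
        y = thetas[transform_of_class[j]].T @ x
        scores.append(-0.5 * np.sum(np.log(2 * np.pi * v)
                                    + (y - m) ** 2 / v))
    return int(np.argmax(scores))
```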
3. Summary of the Invention
The present invention is directed to a system and method for applying a linear transformation to classify an input event. In one aspect, a method for classification comprises the steps of:
capturing an input event;
extracting an n-dimensional feature vector from the input event;
applying a linear transformation to the feature vector to generate a pool of projections;
utilizing different subsets from the pool of projections to classify the feature vector; and
outputting a class identity associated with the feature vector.
In another aspect, the step of utilizing different subsets from the pool of projections to classify the feature vector comprises the steps of:
for each predefined class, selecting a subset from the pool of projections associated with the class;
computing a score for the class based on the associated subset; and
assigning, to the feature vector, the class having the highest computed score.
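The steps above can be sketched as follows. This is a minimal illustrative sketch in NumPy (the names `pool` and `class_indices` are hypothetical), in which the pool of projections is computed once and each class scores only its own predefined subset of components:

```python
import numpy as np

def classify_by_projections(x, pool, class_indices, means, variances):
    """pool: an m-by-n matrix whose rows are linear projections, so that
    pool @ x yields the full pool of projections. Each class j keeps only
    the components named by its predefined index set class_indices[j],
    scores them under its diagonal Gaussian, and the class with the
    highest score is assigned to the feature vector."""
    projections = pool @ x  # compute the pool of projections once
    best_class, best_score = None, -np.inf
    for j, idx in enumerate(class_indices):
        subset = projections[idx]  # select this class's components
        score = -0.5 * np.sum(np.log(2 * np.pi * variances[j])
                              + (subset - means[j]) ** 2 / variances[j])
        if score > best_score:
            best_class, best_score = j, score
    return best_class
```

Because the pool is shared, the per-event cost of computing projections is paid once, while each class effectively applies its own "linear transform" by selecting a different combination of rows.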
In yet another aspect, each of the associated subsets comprises a unique predefined set of n indices, computed during training, which are used to select the associated components from the computed pool of projections.
In another aspect, a preferred classification method is implemented in a Gaussian and/or maximum-likelihood framework.
The novel concept of applying projections differs from the conventional method of applying different transformations because the sharing is at the level of the projections. Therefore, in principle, each class (or a large number of classes) may use a different "linear transform," although the difference between such transformations may arise merely from selecting a different combination of linear projections from a relatively small pool of projections. This concept of applying projections can advantageously be applied in the presence of any underlying classifier.
These and other aspects, features and advantages of the present invention will be described and become apparent from the following detailed description of preferred embodiments, which is to be read in connection with the accompanying drawings.