The present invention relates to linear discriminant analysis and, more particularly, relates to weighted pair-wise scatter to improve linear discriminant analysis.
Feature vectors are often used in pattern recognition, and are mathematical ways of describing features about a pattern. For instance, speech data is commonly processed into speech feature vectors, which are then analyzed. Handwriting analysis and computer visual processing are other pattern recognition processes where feature vectors can be used.
For example, assume that a computer system is being programmed to visually distinguish between fruits. If the set of fruits is {banana, grapefruit, apple, watermelon}, the computer system could be programmed to examine the color, shape, and surface texture of the fruits. As a feature vector, this would be described as vectors having the elements of (color, shape, surface). For instance, a banana could have the feature vector (yellow, curved, smooth), while a grapefruit could have the feature vector of (red-orange, round, rough). Similarly, the feature vector for an apple could be (red, round, shiny), and the feature vector for a watermelon could be (dark green, oblong, smooth). Generally, the elements of each feature vector would be quantized so that a computer system can compare the feature vectors. Thus, the feature vector (yellow, curved, smooth) could be (−3, 0, 0), while the feature vector (red-orange, round, rough) could be (0, 1.5, 1).
One use for a feature vector is to determine into which class a sample feature vector falls. If the computer system is shown an apple that is slightly oblong, the system should be able to determine that the apple falls into the class that denotes apples. For instance, referring to FIG. 1, a three-dimensional class space 100 is shown. This three-dimensional class space 100 can be used by a system to determine if an unknown feature vector belongs to one of the classes. Three-dimensional space 100 comprises class 110 (corresponding to a banana), class 120 (corresponding to a grapefruit), class 130 (corresponding to an apple), and class 140 (corresponding to a watermelon). In this simplistic representation, the X axis corresponds to color, the Y axis to shape, and the Z axis to surface texture. A computer system could use this space to determine whether an unknown feature vector, such as unknown feature vector 150, belongs to one of the classes. As can be seen in FIG. 1, unknown feature vector 150 is closest to class 130, and thus is likely an apple.
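The nearest-class decision described above can be sketched in a few lines of Python. This is only an illustration, assuming Euclidean distance to each class mean is the measure of closeness. The banana and grapefruit values are the quantized vectors given earlier; the apple and watermelon values, the unknown vector, and the name `nearest_class` are hypothetical.

```python
import math

# Class means as (color, shape, surface).
# Banana and grapefruit use the quantized values from the text;
# apple and watermelon are made-up values for illustration only.
CLASS_MEANS = {
    "banana": (-3.0, 0.0, 0.0),
    "grapefruit": (0.0, 1.5, 1.0),
    "apple": (1.0, 1.5, 0.0),        # hypothetical
    "watermelon": (4.0, -2.0, 0.0),  # hypothetical
}


def nearest_class(vector, class_means=CLASS_MEANS):
    """Return the name of the class whose mean is closest to `vector`."""
    return min(class_means, key=lambda name: math.dist(vector, class_means[name]))


# A hypothetical unknown feature vector lying near the apple mean.
unknown = (1.2, 1.3, 0.1)
print(nearest_class(unknown))
```

With these assumed values the apple and grapefruit means lie close together, which mirrors the confusable pair discussed in the text.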
Because each apple is slightly different from other apples, and each grapefruit is slightly different from other grapefruits, systems like this are “trained” by showing the system a number of bananas, grapefruit, apples, and watermelons. These data are used to develop classes, and the classes are generally not single points as shown in FIG. 1. Instead, the classes can be thought of as volumes and are usually shown through reference to means. Thus, classes 110 through 140 in FIG. 1 are class means 110 through 140, where each mean corresponds to a class. Determining classes can be quite complex, but it can be assumed for the purposes of FIG. 1 that classes can be determined.
Classes 120 and 130 are considered to be “confusable” because they lie close together, which makes it harder to determine into which of the two classes an unknown feature vector should be placed. In the example of FIG. 1, it is relatively easy to determine that unknown feature vector 150 belongs to class 130. Moving the unknown feature vector 150 just slightly toward the X and Y axes could make it very hard to determine to which of classes 120 and 130 unknown feature vector 150 belongs.
While three-dimensional class space 100 is useful for simple feature vectors, additional processing is usually performed for feature vectors in many applications. This occurs because feature vectors can be quite large. For example, speech feature vectors commonly contain many elements.
One way of dealing with such large vectors is to reduce the dimensions of the feature vectors and process the reduced-dimension feature vectors. A common technique that does this is Linear Discriminant Analysis (LDA), which reduces the dimensions of the feature vectors while maintaining maximal discrimination. This has the benefits of providing reduced-dimension feature vectors while still allowing proper discrimination between feature vectors. This can have the effect of filtering out the “noise” features while still retaining the discriminative features. In the example of FIG. 1, color and shape are features that are highly discriminative of fruits, while texture is less discriminative. The process of LDA attempts to retain a high amount of discriminant information while reducing dimensions.
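For readers who want to see the dimension reduction concretely, the following is a minimal sketch of the conventional (unweighted) textbook form of LDA using NumPy. It builds the within-class and between-class scatter matrices and keeps the leading eigenvectors of the product of the inverse within-class scatter and the between-class scatter. The function name `lda_projection`, the synthetic training data, and the class centers are assumptions made for illustration and are not taken from the text.

```python
import numpy as np


def lda_projection(X, y, n_components):
    """Return a (d x n_components) projection matrix from textbook LDA.

    X is an (n_samples x d) array of feature vectors and y holds class labels.
    """
    d = X.shape[1]
    overall_mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for label in np.unique(y):
        Xc = X[y == label]
        class_mean = Xc.mean(axis=0)
        centered = Xc - class_mean
        Sw += centered.T @ centered
        diff = class_mean - overall_mean
        Sb += len(Xc) * np.outer(diff, diff)
    # Directions that maximize between-class relative to within-class scatter
    # are the leading eigenvectors of inv(Sw) @ Sb.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real


# Synthetic three-class, three-dimensional data (made-up class centers).
rng = np.random.default_rng(0)
centers = [(-3.0, 0.0, 0.0), (0.0, 1.5, 1.0), (1.0, 1.5, 0.0)]
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(20, 3)) for c in centers])
y = np.repeat([0, 1, 2], 20)

W = lda_projection(X, y, n_components=2)
X_reduced = X @ W  # each row is now a two-dimensional feature vector
```

This sketch assumes the within-class scatter matrix is invertible, which holds here because there are many more samples than dimensions.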
An exemplary reduced-dimension class space 200 is shown in FIG. 2. In FIG. 2, the class means 110 through 140 and unknown feature vector 150 have been reduced from three dimensions to two dimensions. A problem with current LDA is illustrated in FIG. 2, where classes 120 and 130 have been placed almost on top of each other, making it hard to determine to which class unknown feature vectors belong. In FIG. 1, it was easy to determine that unknown feature vector 150 belongs to class 130. In FIG. 2, however, it is unclear to which class the unknown feature vector 150 belongs. Current LDA therefore can make confusable classes even more confusable in reduced-dimensional class space.
Thus, what is needed is a better way of performing LDA that overcomes the problem of increasing confusability of classes during a transformation of feature vectors to reduced-dimensional class space.
The present invention provides weighted pair-wise scatter to improve Linear Discriminant Analysis (LDA). This decreases confusability in reduced-dimensional class space, which increases discrimination and, thereby, increases the probability that a sample feature vector will be correctly associated with an appropriate class.
In general, the present invention determines and applies weights for class pairs. The weights are selected to better separate, in reduced-dimensional class space, the classes that are more confusable in normal-dimensional class space. During the dimension-reducing process, higher weights are preferably assigned to more confusable class pairs while lower weights are assigned to less confusable class pairs. As compared to unweighted LDA, the present invention will result in decreased confusability of class pairs in reduced-dimensional class space.
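One way to picture weights applied to class pairs is a between-class scatter matrix built as a weighted sum over pairs of class means. The sketch below is an assumption-laden illustration, not the weighting defined by the invention: the function names are hypothetical, the class means other than banana and grapefruit are made up, and the inverse-distance weight is just one plausible choice that gives closer (more confusable) pairs larger weights.

```python
import itertools

import numpy as np


def weighted_pairwise_scatter(means, weight_fn):
    """Sum weight * (m_i - m_j)(m_i - m_j)^T over all pairs of class means."""
    d = means.shape[1]
    scatter = np.zeros((d, d))
    for i, j in itertools.combinations(range(len(means)), 2):
        diff = means[i] - means[j]
        scatter += weight_fn(diff) * np.outer(diff, diff)
    return scatter


def inverse_distance_weight(diff, power=2.0):
    """Hypothetical weight: larger when two class means are closer together."""
    return 1.0 / np.linalg.norm(diff) ** power


# Class means as rows: banana, grapefruit, apple, watermelon.
# The last two rows are made-up values for illustration only.
means = np.array([
    [-3.0, 0.0, 0.0],
    [0.0, 1.5, 1.0],
    [1.0, 1.5, 0.0],
    [4.0, -2.0, 0.0],
])

Sb_weighted = weighted_pairwise_scatter(means, inverse_distance_weight)
Sb_unweighted = weighted_pairwise_scatter(means, lambda diff: 1.0)
```

Under these assumptions, the weighted matrix would take the place of the usual between-class scatter in the LDA eigenvalue problem, so that directions separating nearby class means count for more when the reduced dimensions are chosen.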
A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.