The present invention relates in general to a system or method (collectively “selection system”) for selecting robust attributes for use in a classifier from a pool of attributes that could potentially be processed by the classifier. In particular, the present invention relates to a selection system that selects attributes on the basis of statistical distributions.
Classifiers are devices that generate classifications out of sensor data collected by one or more sensors. Classification determinations are based on attribute values that are associated with attribute types within the sensor data. For example, in a digital picture of a table, the height of the table is an attribute type. Accordingly, the numerical value associated with that attribute type is the attribute value. In the context of a “height” attribute type, the attribute value could be the number of pixels from top to bottom, or a measurement in units such as inches, feet, yards, or meters. Attribute values and attribute types are the means by which classifiers generate classifications, and each type of sensor is capable of capturing a potentially voluminous number of attribute types.
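The distinction between an attribute type and an attribute value can be illustrated with a minimal sketch. It assumes the digital picture has already been reduced to a binary mask (rows of 0 and 1, where 1 marks a pixel belonging to the table). The function name `height_in_pixels` and the mask representation are hypothetical and chosen only for illustration.

```python
def height_in_pixels(mask):
    """Return the attribute value for the "height" attribute type.

    `mask` is a list of rows, each row a list of 0/1 pixel flags
    (an assumed representation, not one prescribed by the text).
    The value is the number of pixel rows from the top-most to the
    bottom-most row that contains any part of the object.
    """
    occupied_rows = [i for i, row in enumerate(mask) if any(row)]
    if not occupied_rows:
        return 0  # no object pixels in the image
    return occupied_rows[-1] - occupied_rows[0] + 1


# A 5-row image in which the object occupies rows 1 through 3.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

# Attribute type -> attribute value, as a classifier might receive them.
attributes = {"height": height_in_pixels(mask)}
```

Here "height" is the attribute type, and the integer stored under it is the attribute value expressed in pixels.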
Classifications can be generated in a wide variety of different formats, and for a wide variety of different purposes. For example, a classifier in an airbag deployment mechanism could be used to identify the location of the upper torso of the occupant so that the airbag deployment mechanism can track the location of the occupant, an ability useful in making airbag deployment decisions. Another example is a classifier used in conjunction with an automated forklift sensor, with the sensor system distinguishing between different potential obstacles, such as other forklifts, pedestrians, cargo, and other forms of objects.
In many of the diverse embodiments and contexts in which classifiers are used, classifiers suffer from what can be referred to as the “curse of dimensionality.” As additional attributes are incorporated into the determination process of a classifier, the accuracy of the classifier typically degrades rather than improves. This is in sharp contrast to the way human beings typically function, because humans tend to make better decisions when more information is available. It would be desirable for a selection system to identify a subset of robust attribute types from a pool of potential attribute types. This can preferably be done through the use of actual test data.
It would be desirable for non-robust features to be filtered out so that the accuracy of the classifier is enhanced rather than degraded. By utilizing fewer attribute types, performance can be increased while reducing cost at the same time. Prior art processes for selecting attributes rely either on attribute-to-attribute correlation measures or on measures such as entropy. It would be desirable if statistical distributions in the processing of the features were used to eliminate redundant attribute types and to select the desired attribute types. Instead of merely calculating the covariance of data point pairs, it would be desirable to evaluate whether different attribute values are from the same underlying distribution.
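One way to evaluate whether two sets of attribute values come from the same underlying distribution can be sketched as follows. The text does not name a particular statistical test, so the two-sample Kolmogorov-Smirnov statistic is used here purely as an assumed stand-in. The sketch also assumes that the comparison is made between the values of one attribute type observed under two different classes in test data. The function names, the `min_gap` threshold, and the sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical cumulative
    distribution functions of the two samples: 0.0 means the empirical
    distributions coincide, 1.0 means they do not overlap at all.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    max_gap = 0.0
    # The gap can only change at observed values, so check each one.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


def select_attribute_types(values_by_type, min_gap=0.5):
    """Keep attribute types whose per-class value distributions differ.

    `values_by_type` maps an attribute type name to a pair
    (values seen for class 1, values seen for class 2).
    `min_gap` is an illustrative threshold, not a recommended setting.
    """
    return [
        name
        for name, (class_1, class_2) in values_by_type.items()
        if ks_statistic(class_1, class_2) >= min_gap
    ]


# Hypothetical test data: "height" separates the classes, "noise" does not.
test_data = {
    "height": ([10, 11, 12, 13], [20, 21, 22, 23]),
    "noise": ([1, 2, 3, 4], [1, 2, 3, 4]),
}
selected = select_attribute_types(test_data)
```

Unlike a covariance computed over data point pairs, this comparison looks at the shape of each distribution as a whole, so an attribute type is discarded when its values look alike across classes regardless of how individual points pair up.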
Such a method of feature selection would be particularly advantageous with respect to classifiers in airbag deployment mechanisms. Frequent movement within the seat area coupled with the high variability of human clothing and appearance requires a better attribute selection process.