1. Technical Field
The present invention relates to classification of data and, in particular, to classification of incomplete data.
2. Description of the Related Art
A variety of applications employ classification techniques that must contend with training and/or test instances that have missing or conflicting features. For example, a spam filter may be trained on data originating from servers that store different features. Similarly, a churn predictor may deal with incomplete log features for new customers, and a face detector may deal with images in which high-resolution cues are corrupted. Features analyzed for classification purposes can be missing for a variety of other reasons as well. For example, applications may suffer from sensor failure or communication errors. Moreover, some features can be “structurally” missing, in that the measurements are considered absent because they do not make sense. For example, an application may have log features that are not meaningful for new customers. Additionally, applications may rely on different sources for training data. Here, each source might not collect the exact same set of features, or might have introduced novel features during the data collection process. Accounting for such missing features is an important aspect of classification techniques and of the applications that apply them.