Classification algorithms are used in a wide variety of applications for performing tasks that involve identifying a class to which a previously unseen item belongs. For example, classification algorithms can be used to process images, such that the pixels of the image are classified according to what they represent. An example of this is face detection, which can be used to control the focusing or operation of a digital camera. In this example, the pixels are classified as to whether or not they are part of a face in the image. Other examples include medical image analysis, email spam detection, document classification, and speech recognition.
Many classification algorithms are based on machine learning concepts, and are trained to perform a dedicated classification task. Linear classifiers are one of the most popular classification techniques. For binary classification tasks (i.e. those having only a positive or negative classification output, e.g. whether or not a face is present), the decision function F: X→{−1,1} takes the form:
$$F(x; w, f) = \operatorname{sign}\left(\sum_{t=1}^{m} w_t \, f_t(x)\right) \tag{1}$$
where w_t ∈ ℝ are learned weights, ƒ_t: X→ℝ are arbitrary feature functions evaluated on the instance x, and m is the number of features to evaluate. To use such a classifier, the feature functions ƒ_t and weights w_t are learned using a set of training instances (i.e. examples for which the classification is known in advance). Once the feature functions ƒ_t and weights w_t are learned, the algorithm can be used on previously unseen instances by calculating the cumulative sum of the weighted feature functions over all features. The sign (positive or negative) of the cumulative sum indicates whether the classification is positive or negative.
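By way of illustration only, the evaluation of eqn. (1) can be sketched in Python as follows. This is a minimal sketch and not a trained classifier: the names `classify`, `weights` and `features` and the two toy feature functions are hypothetical, and treating a cumulative sum of exactly zero as positive is an assumption, since eqn. (1) does not specify that case.

```python
from typing import Callable, List, Sequence

# A feature function maps an instance x to a real number (f_t: X -> R).
Feature = Callable[[Sequence[float]], float]


def classify(x: Sequence[float],
             weights: List[float],
             features: List[Feature]) -> int:
    """Return +1 or -1 as the sign of the weighted feature sum of eqn. (1)."""
    if len(weights) != len(features):
        raise ValueError("one weight is needed per feature function")
    total = 0.0
    # Cumulative sum over all m features: sum_t w_t * f_t(x).
    for w_t, f_t in zip(weights, features):
        total += w_t * f_t(x)
    # Assumption: a sum of exactly zero is reported as the positive class.
    return 1 if total >= 0 else -1


# Hypothetical toy features and weights; in practice both are learned.
features: List[Feature] = [lambda x: x[0], lambda x: x[1] - 1.0]
weights = [2.0, -1.0]

positive = classify([1.0, 0.0], weights, features)   # sum = 2.0 + 1.0
negative = classify([-1.0, 0.0], weights, features)  # sum = -2.0 + 1.0
```

The loop visits every feature once, so the cost of a single classification in this sketch grows in proportion to the number of features m.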
The above form of classifier in eqn. (1) covers many popular classifiers, such as linear support vector machines, logistic regression classifiers, boosting, and kernelized support vector machines. Although the methods differ in what feature functions and weight vectors are used, for all methods the test-time evaluation consists of evaluating an equation of the above form.
In order to achieve high generalization performance, a large number m of features is often utilized. For example, in the case of boosting, thousands of weak learners, each defining a new feature, are selected, whereas for non-parametric methods such as kernelized SVMs the number of features ƒ_t(·)=k(·,x_t) grows with the number of training instances. In general, evaluating the above form of classifier for a single instance has a complexity that is linear in m.
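The kernelized case can be sketched as follows, with each feature defined by one training instance x_t, so that m equals the number of training instances. This is an illustrative sketch under stated assumptions: the Gaussian (RBF) kernel, the `gamma` value, the function names and the toy data are all hypothetical choices, as the text above does not name a particular kernel.

```python
import math
from typing import List, Sequence, Tuple


def rbf_kernel(a: Sequence[float], b: Sequence[float], gamma: float = 0.5) -> float:
    """Gaussian kernel k(a, b) = exp(-gamma * ||a - b||^2) (assumed kernel)."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


def kernel_decision(x: Sequence[float],
                    training_points: List[Sequence[float]],
                    weights: List[float]) -> Tuple[int, int]:
    """Evaluate eqn. (1) with f_t(x) = k(x, x_t).

    Returns the predicted label and the number of kernel evaluations,
    which equals the number of training points used as features.
    """
    total = 0.0
    evaluations = 0
    for w_t, x_t in zip(weights, training_points):
        total += w_t * rbf_kernel(x, x_t)
        evaluations += 1
    # Assumption: a sum of exactly zero is reported as the positive class.
    return (1 if total >= 0 else -1), evaluations


# Hypothetical training instances and weights (one weight per instance).
training_points = [[0.0, 0.0], [2.0, 2.0]]
weights = [1.0, -1.0]

label_near_first, cost = kernel_decision([0.1, 0.0], training_points, weights)
label_near_second, _ = kernel_decision([2.0, 1.9], training_points, weights)
```

Adding a training instance to `training_points` adds one kernel evaluation per classified instance, which is the linear dependence on m noted above.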
However, achieving low test-time (also called runtime or evaluation time) complexity is a key design goal for practical applications. This has led to methods aimed at producing a classifier having low evaluation complexity. An example of such a method is called “feature selection”. Feature selection aims to choose a subset of the features that performs well, and to use this reduced subset of features at test-time. However, whilst this improves the test-time complexity, it does so at the expense of the accuracy of the classifier.
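The trade-off described above can be illustrated with a small sketch. Keeping the k features with the largest absolute weight is only one simple heuristic, assumed here for illustration, and is not presented as the feature selection method of any particular system; the names `select_features` and `decide` and the toy weights are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Feature = Callable[[Sequence[float]], float]


def select_features(weights: List[float],
                    features: List[Feature],
                    k: int) -> Tuple[List[float], List[Feature]]:
    """Keep the k features with the largest absolute weight (assumed heuristic)."""
    ranked = sorted(range(len(weights)), key=lambda t: abs(weights[t]), reverse=True)
    keep = sorted(ranked[:k])  # restore the original feature order
    return [weights[t] for t in keep], [features[t] for t in keep]


def decide(x: Sequence[float], weights: List[float], features: List[Feature]) -> int:
    """Sign of the weighted feature sum; zero is treated as positive (assumption)."""
    total = sum(w_t * f_t(x) for w_t, f_t in zip(weights, features))
    return 1 if total >= 0 else -1


# Hypothetical classifier with m = 3 features.
features: List[Feature] = [lambda x: x[0], lambda x: x[1], lambda x: x[2]]
weights = [3.0, -2.0, -2.0]
x = [1.0, 1.0, 1.0]

full_label = decide(x, weights, features)                 # sum = 3 - 2 - 2
small_weights, small_features = select_features(weights, features, k=1)
reduced_label = decide(x, small_weights, small_features)  # sum = 3
```

In this toy case the reduced classifier evaluates one feature instead of three but returns a different label from the full classifier for the same instance, which is the loss of accuracy referred to above.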
The embodiments described below are not limited to implementations which solve any or all of the disadvantages of known classification algorithms.