1. Field of the Invention
The present invention generally relates to searching data using ensembles of models, and more particularly to the use of sub-ensembles that include a smaller number of models than the ensemble and that include only the most accurate models to increase throughput without sacrificing accuracy.
2. Description of the Related Art
In the past few years, multiple models, or ensembles, have been extensively studied in data mining to scale up or speed up the learning of a single model from a very large dataset. Various forms of ensembles have been proposed. However, multiple models have one intrinsic problem: inefficiency in classification. Conventionally, to make a prediction on an example, every model in the ensemble must be consulted, which significantly reduces prediction throughput. The invention described below addresses this problem.