Ensembles of decision trees, often called decision forests, are used in machine learning of classification and regression tasks such as for body joint position detection in natural user interface systems, for automated organ detection and recognition in medical imaging systems and for many other applications.
Existing approaches often use decision forests which are trained using labeled (or sometimes unlabeled) training data during a training phase. During training the structure of each tree in the decision forest is learnt (that is, how many nodes there are in the tree and how these are connected to one another) and also tests (also referred to as predicates) are selected for split nodes of each tree. In addition, data accumulates at the leaves of the trees in each forest during the training phase. A trained decision forest may be used during a test phase, where data previously unknown to the forest is passed through the forest to obtain an output. The output may be a predicted joint position in the case of a body joint position detection task or a probability of a particular body organ in the case of an organ detection and recognition system. Quality of the output may be related to how well the trained decision forest is able to generalize. That is how good the predictions are that it produces for data which is different from the data used during training. Training decision forests affects ability of the resulting trained forest to generalize. Training may be time consuming, complex, resource intensive and hard to achieve in application domains where little training data is available.
The embodiments described below are not limited to implementations which solve any or all of the disadvantages of known decision tree training systems for machine learning.