The present invention, in some embodiments thereof, relates to machine learning and, more specifically, but not exclusively, to systems and methods for selection of features for classification and/or prediction using machine learning.
The process of machine learning includes selection of methods that learn from existing data to classify and/or predict new data. A set of training data representing a spectrum of examples, which are optionally labeled, is provided. Features are extracted from each member of the set of training data. The features, along with the labeling, are used to train a machine learning method, for example, a statistical classifier, to classify and/or predict new unseen data, on the assumption that the unseen data is drawn from a distribution similar to that of the training set.
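By way of a non-limiting illustration, the process described above (providing labeled training data, extracting features from each member, training a classifier, and classifying new unseen data) may be sketched as follows. The sketch assumes a nearest-centroid classifier as a simple stand-in for a statistical classifier. The function names, the two features (mean and range), the labels, and the sample records are hypothetical and are chosen only for illustration.

```python
from statistics import mean


def extract_features(record):
    """Hypothetical feature extractor: map a raw record (a list of
    numbers) to a feature vector of (mean, range)."""
    return (mean(record), max(record) - min(record))


def train_nearest_centroid(training_set):
    """Train on (record, label) pairs.

    Returns a model mapping each label to the centroid (per-feature
    average) of the feature vectors that carry that label.
    """
    vectors_by_label = {}
    for record, label in training_set:
        vectors_by_label.setdefault(label, []).append(extract_features(record))
    return {
        label: tuple(mean(column) for column in zip(*vectors))
        for label, vectors in vectors_by_label.items()
    }


def classify(model, record):
    """Assign the label whose centroid is closest to the record's features."""
    features = extract_features(record)

    def squared_distance(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


# Illustrative labeled training data (assumed, not taken from any real source).
training_set = [
    ([1.0, 1.2, 0.9], "stable"),
    ([1.1, 1.0, 1.0], "stable"),
    ([0.2, 2.5, 1.0], "volatile"),
    ([3.0, 0.1, 1.5], "volatile"),
]

model = train_nearest_centroid(training_set)

# New unseen record, assumed to come from a distribution similar to
# that of the training set.
prediction = classify(model, [1.0, 1.1, 1.05])
print(prediction)
```

In this sketch the quality of the prediction depends entirely on the features chosen in extract_features, since the classifier sees only the extracted feature vectors and never the raw records.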
To obtain accurate predictions, data scientists invest considerable time and effort in the manual design and construction of features for each classification and/or prediction problem, for example, financial forecasting.