Machine learning is a form of self-calibration of predictive models that are built from training data and commonly used to find hidden value in big data. Facilitating effective decision making requires the transformation of relevant data to high-quality descriptive and predictive models. The transformation presents several challenges however. For example, a neural network type predictive model generates predicted outputs by transforming a set of inputs through a series of hidden layers that are defined by activation functions linked with weights. Determining the activation functions and the weights to determine the best model configuration is a complex optimization problem.
The activation functions and the weights, among other parameters, are referred to herein as “hyperparameters” that are defined by a user to control determination of a predictive model using various model types such as the neural network model type, a gradient boosting tree model type, a decision tree model type, a forest model type, and a support vector machine model type. Different hyperparameters are used based on the type of predictive model. Though the predictive model solutions are governed by the hyperparameters, there are typically no clear default values for the hyperparameters that generate a satisfactory predictive model for a wide range of applications. For example, a depth of a decision tree model type, a number of trees in a forest model type, a number of hidden layers and neurons in each layer in a neural network model type, and a degree of regularization to prevent overfitting are a few examples of quantities that are provided as inputs to train a predictive model. Not only do the input values used for the hyperparameters dictate the performance of the training process, but more importantly they govern the quality of the resulting predictive models.
The approach to finding the ideal values for hyperparameters (tuning a predictive model type to a particular dataset) has traditionally been a manual effort. For guidance in setting these values, researchers often rely on their past experience using these machine learning algorithms to train models. However, even with expertise in machine learning algorithms and their hyperparameters, the best values of these hyperparameters changes with different data. As a result, it is difficult to define the hyperparameter values based on previous experience. However, there is an inherent expense in training numerous candidate models to evaluate various values and combinations of values for the hyperparameters in terms of computing resources, computing time, and user time.