Many organizations and individuals use electronic data to improve their operations or aid their decision-making. For example, many business enterprises use data management technologies to enhance the efficiency of various business processes, such as executing transactions, tracking inputs and outputs, or marketing products. As another example, many businesses use operational data to evaluate performance of business processes, to measure the effectiveness of efforts to improve processes, or to decide how to adjust processes.
In some cases, electronic data can be used to anticipate problems or opportunities. Some organizations combine operations data describing what happened in the past with evaluation data describing subsequent values of performance metrics to build predictive models. Based on the outcomes predicted by the predictive models, organizations can make decisions, adjust processes, or take other actions. For example, an insurance company might seek to build a predictive model that more accurately forecasts future claims, or a predictive model that predicts when policyholders are considering switching to competing insurers. An automobile manufacturer might seek to build a predictive model that more accurately forecasts demand for new car models. A fire department might seek to build a predictive model that forecasts days with high fire danger, or predicts which structures are endangered by a fire.
Machine-learning techniques (e.g., supervised statistical-learning techniques) may be used to generate a predictive model from a dataset that includes previously recorded observations of at least two variables. The variable(s) to be predicted may be referred to as “target(s)”, “response(s)”, or “dependent variable(s)”. The remaining variable(s), which can be used to make the predictions, may be referred to as “feature(s)”, “predictor(s)”, or “independent variable(s)”. The observations are generally partitioned into at least one “training” dataset and at least one “test” dataset. A data analyst then selects a statistical-learning procedure and executes that procedure on the training dataset to generate a predictive model. The analyst then tests the generated model on the test dataset to determine how well the model predicts the value(s) of the target(s), relative to actual observations of the target(s).