Much machine learning (ML) research has focused on building more accurate or more efficient models. Such research usually assesses the utility of a model based on its performance. The quality of a model depends on several factors, including the correctness or trustworthiness of the training corpus, the discriminative value of the encoded features, and whether additional features or attributes can be encoded to further improve predictive performance. A model that has been trained on mislabeled data, or on data encoded with non-descriptive features, cannot be expected to be accurate. Automatically determining the characteristics of the data can be exceptionally difficult. However, with appropriate views of the data, human interaction based on domain knowledge and higher-order reasoning can be used to interpret results and to build higher-quality models.
Studies have shown that, as a rule, ML practitioners have trouble understanding the relationships among data, attributes, and models. As a result, there is usually little guidance on how to improve predictive performance. In fact, studies have shown that when their models do not work, ML practitioners often spend an inordinate amount of time optimizing their classification algorithm rather than checking the quality of their data and features.
Ensemble methods have become quite popular, and there is a large body of work on combining results from multiple models. Simple rules such as the majority vote, sum, product, maximum, and minimum of classifier outputs have been used successfully and often improve on the results of a single model. Other useful ensemble techniques include critic-driven models. For example, some techniques automatically generate simple models and combine them to build more accurate ensemble models. However, the majority of the work in this space is aimed at learning ensembles to increase accuracy.
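The simple combination rules named above can be illustrated with a minimal sketch. It assumes each classifier emits one score per class; the function names `combine_scores` and `majority_vote`, the score layout, and the numbers are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import prod


def combine_scores(score_lists, rule="sum"):
    """Combine per-class scores from several classifiers with a fixed rule.

    score_lists holds one list of per-class scores per classifier
    (an assumed layout). Returns the index of the winning class.
    """
    rules = {"sum": sum, "product": prod, "max": max, "min": min}
    combiner = rules[rule]
    n_classes = len(score_lists[0])
    # For each class, apply the rule across all classifiers' scores.
    combined = [
        combiner([scores[c] for scores in score_lists])
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=combined.__getitem__)


def majority_vote(score_lists):
    """Each classifier votes for its top-scoring class; the most common vote wins."""
    votes = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in score_lists
    ]
    return Counter(votes).most_common(1)[0][0]


# Made-up scores from three classifiers over two classes.
example_scores = [[0.6, 0.4], [0.3, 0.7], [0.55, 0.45]]
vote_winner = majority_vote(example_scores)
sum_winner = combine_scores(example_scores, rule="sum")
```

For these made-up scores, the majority vote selects class 0 (two of the three classifiers prefer it), while the sum rule selects class 1, which shows that the choice of rule can change the ensemble's decision.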
Interactive machine learning systems are built on the view that humans and machines have complementary strengths. By creating a synergistic relationship between machines and humans, interactive machine learning systems can be used to train higher-quality models. A number of systems have followed this paradigm, but again this line of research has focused primarily on generating higher-quality models. However, there may be times when a higher-quality model is not enough, and tools that help a user explore the data and gain a deeper understanding of it are useful.
For example, programmers incorporating machine learning algorithms (or models) into their code often like to explore how well the models classify different data and which features might impact the models that are built. Too often, however, programmers vary the parameters for one specific model and do not thoroughly explore the space of algorithms, data, and features. A typical machine learning formulation first extracts features from labeled data and then splits the labeled data into a training set and a testing set. The training set is used to create a model, and the testing set is used to evaluate the model. While there are techniques that combine the results of several classifiers into a single joint classifier, there is little or no work that attempts to combine hundreds of classifiers and visualize the results so that users can interpret them.
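The typical formulation described above (extract features, split, train, evaluate) can be sketched as follows. This is a minimal illustration under stated assumptions, not a method from the text: the feature extractor, the nearest-centroid classifier, and all function names are hypothetical stand-ins.

```python
import random


def extract_features(raw):
    """Hypothetical feature extractor: (string length, fraction of digits)."""
    n = len(raw)
    digits = sum(ch.isdigit() for ch in raw)
    return (float(n), digits / n if n else 0.0)


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the labeled examples and split off a testing set."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_centroids(train):
    """Toy model (an assumption): the mean feature vector of each label."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def accuracy(model, test):
    """Fraction of testing examples the model labels correctly."""
    correct = sum(predict(model, features) == label for features, label in test)
    return correct / len(test)


# Made-up labeled data: raw strings tagged as "word" or "id".
labeled = [
    ("apple", "word"), ("banana", "word"), ("cherry", "word"), ("grape", "word"),
    ("10234", "id"), ("99871", "id"), ("4405512", "id"), ("731", "id"),
]
examples = [(extract_features(raw), label) for raw, label in labeled]
train_set, test_set = train_test_split(examples)
model = train_centroids(train_set)
score = accuracy(model, test_set)
```

The testing set is held out of training so that `score` reflects performance on examples the model has not seen. Repeating the same loop over many algorithms, feature sets, and splits is the kind of broader exploration the paragraph says programmers often skip.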