Machine learning (ML) techniques utilize computational and statistical methods to automatically extract information from data. These techniques allow computers to “learn.” Recently, human-computer interaction (HCI) researchers have taken increasing interest in building classifiers using machine learning for their applied value. For example, using applied machine learning techniques, HCI researchers can disambiguate and interpret noisy streams of data and develop novel input modalities. In addition, these researchers can analyze complex patterns in data to perform predictions or diagnoses, or infer user intent to optimally adapt interfaces to assist users.
As machine learning becomes more widely used for its applied value, both within the HCI community and beyond, a critical challenge is to provide adequate tools to allow non-experts to effectively utilize ML techniques. Researchers who are not experts in machine learning frequently apply these techniques to multiple-class (or multi-class) classification problems. For such problems, the goal is to develop an algorithm (such as a “classifier” or a “model”) that will assign input data to one of a discrete number of classes. While binary classification considers problems where there are two possible classes, multi-class classification permits any number of classes. One example of a multi-class classification problem is classifying handwritten characters as one of the 26 letters of the alphabet. In general, the multi-class classification problem is considered a much more challenging problem in machine learning than the binary classification problem.
At present, a common applied machine learning workflow is to iteratively develop classifiers by refining low-level choices such as feature selection, algorithm selection, and parameter tuning. This is usually done by manual trial and error, an approach that amounts to hill-climbing through the model space. Models are compared using accuracy or other coarse summaries. However, these simple summaries provide little direction on how to improve the classifier. This problem is exacerbated in multi-class classification problems, where single-value summaries such as accuracy can be quite misleading.
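The way a single-value summary can mislead in a multi-class setting can be illustrated with a small sketch. The data below is hypothetical and chosen only for illustration: a three-class problem with imbalanced classes and a degenerate classifier that always predicts the most frequent class. The function names are assumptions of this sketch, not part of any described system.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

def accuracy(matrix):
    """Fraction of all instances that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def per_class_recall(matrix):
    """For each true class, the fraction of its instances predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]

# Hypothetical data: 90 instances of "a", 5 of "b", 5 of "c".
classes = ["a", "b", "c"]
true_labels = ["a"] * 90 + ["b"] * 5 + ["c"] * 5
# A degenerate classifier that always answers "a".
predicted_labels = ["a"] * 100

matrix = confusion_matrix(true_labels, predicted_labels, classes)
print(accuracy(matrix))          # high overall accuracy
print(per_class_recall(matrix))  # yet classes "b" and "c" are never recognized
```

In this sketch the overall accuracy looks strong even though two of the three classes are never predicted, which is the kind of failure a per-class view exposes and a single number hides.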
To address this issue, previous research has focused on developing tools for better explaining a single classifier's behavior (or misbehavior). However, a workflow centered on refining one classifier at a time is often sub-optimal. In particular, it obscures the beneficial dependencies and complementarities that exist between classifiers and discards the context of the space of possible models. Moreover, it often leads to poor local maxima, and much of the effort spent generating the series of models is wasted because there is no means to compare and evaluate the trajectory of models being learned.
Previous work has noted the importance of human involvement to provide training data and has proposed an interactive machine learning model that allows users to train, classify, and correct classifications in a continuously iterative loop. In a recent study, the current use of machine learning by non-expert researchers was evaluated and three difficulties were identified: (1) applying an iterative exploration process; (2) understanding the machine learning models; and (3) evaluating performance. The study recommended creating a library of models known to work well for a variety of common problems. Such a library could be the source of an initial set of models.
Researchers have also looked into the general problem of combining decisions from multiple classifiers. Simple rules such as the majority vote, sum, product, maximum, and minimum of the classifier outputs are popular and often produce better results than any individual classifier. One problem with these fixed rules, however, is that it is difficult to predict which rule will perform best. At the other end of the spectrum are critic-driven approaches, such as layered hidden Markov models (HMMs), where the goal is to “learn” a good combination scheme using a hierarchy of classifiers. One disadvantage of these critic-driven approaches is that they require a large amount of labeled training data, which is often prohibitive for HCI work.
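The fixed combination rules named above can be sketched in a few lines. This is a minimal illustration, assuming each classifier emits one score per class (for example, a posterior probability); the function names and the scores are hypothetical, and ties are broken in favor of the lowest class index.

```python
import math
from collections import Counter

def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)

def combine(outputs, rule):
    """Combine per-classifier class scores with a fixed rule.

    outputs: one list of class scores per classifier.
    rule: "majority", "sum", "product", "max", or "min".
    Returns the index of the winning class.
    """
    if rule == "majority":
        # Each classifier votes for its own top-scoring class.
        votes = Counter(argmax(scores) for scores in outputs)
        return votes.most_common(1)[0][0]
    reducers = {"sum": sum, "product": math.prod, "max": max, "min": min}
    # Transpose so that each column holds every classifier's score for one class.
    per_class = zip(*outputs)
    return argmax([reducers[rule](column) for column in per_class])

# Hypothetical scores from three classifiers over three classes.
outputs = [
    [0.60, 0.30, 0.10],
    [0.50, 0.40, 0.10],
    [0.00, 0.55, 0.45],
]
decisions = {rule: combine(outputs, rule)
             for rule in ("majority", "sum", "product", "max", "min")}
print(decisions)
```

On these scores the majority and maximum rules select class 0, while the sum, product, and minimum rules select class 1. The same classifier outputs therefore yield different decisions depending on the rule, which is why choosing a rule in advance is hard.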
In order to aid machine learning development, researchers have explored various ways to visualize specific machine learning algorithms, including naïve-Bayes, decision trees, support vector machines (SVMs), and HMMs. One study has shown that such tools can produce better classifiers than automatic techniques. However, since these visualizations and interaction techniques are tied to specific algorithms, they do not support comparisons across algorithm types.
More general visualization techniques that apply across algorithm types include receiver operating characteristic (ROC) curves and cost curves, which support evaluation of model performance as a function of misclassification costs. These visualization techniques are commonly used in the machine learning community. For practical purposes, however, they are restricted to binary classification tasks, and they do not directly support iterative improvement of a classifier.
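As a rough illustration of why an ROC curve is a naturally binary construct, the sketch below computes the curve for a two-class problem by sweeping a decision threshold over classifier scores. It assumes the scores are distinct and that labels are 1 (positive) or 0 (negative); the function names and data are hypothetical.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    The threshold is lowered one instance at a time, from the highest
    score to the lowest. Assumes distinct scores and 0/1 labels.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for _, label in ranked:
        if label:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical scores for four instances, two positive and two negative.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
points = roc_points(scores, labels)
print(points)
print(area_under_curve(points))
```

Each point depends on a single positive class and a single negative class. With more than two classes there is no single pair of rates to plot, so the curve has to be drawn per class or per pair of classes.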
One popular visualization technique in machine learning is to plot the data instances in some projection of feature space and visualize model prediction boundaries. This approach generalizes across algorithms at the expense of not showing the internal operation of the particular algorithm. One approach uses visualizations of summary statistics derived from a large set of decision trees generated by boosting. Another approach visually compares a small set of decision trees produced by boosting.
It has been shown that matrix visualization can be used to exhibit high-level structures by reordering the rows and columns of the matrix. Many researchers have attempted to exploit the benefits of matrix reordering. For example, one approach provides an overview using a permutation matrix to enable users to easily identify relationships among sets with a large number of elements. Another approach uses reordered matrix-based representation in addition to node-link diagrams to support social network analysis. Yet another approach uses a shaded similarity matrix with reordering to visualize clusters in classification. However, these approaches are designed for visualizing only the similarity of data instances.
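The effect of matrix reordering can be illustrated on a small similarity matrix. The sketch below applies the same permutation to rows and columns, using a simple greedy chaining heuristic chosen here only for illustration; it is not the ordering method of any of the approaches mentioned above, and the matrix values are hypothetical.

```python
def reorder(matrix, order):
    """Apply the same permutation to the rows and columns of a square matrix."""
    return [[matrix[i][j] for j in order] for i in order]

def greedy_order(similarity):
    """Chain items so each next item is the one most similar to the last placed.

    A deliberately simple heuristic; ties go to the lowest index.
    """
    order = [0]
    remaining = set(range(1, len(similarity)))
    while remaining:
        last = order[-1]
        nearest = max(sorted(remaining), key=lambda j: similarity[last][j])
        order.append(nearest)
        remaining.remove(nearest)
    return order

# Hypothetical symmetric similarity matrix over four data instances.
# Instances 0 and 2 are similar, as are 1 and 3, but the original
# ordering interleaves the two groups.
similarity = [
    [1.0, 0.1, 0.9, 0.2],
    [0.1, 1.0, 0.2, 0.8],
    [0.9, 0.2, 1.0, 0.1],
    [0.2, 0.8, 0.1, 1.0],
]
order = greedy_order(similarity)
for row in reorder(similarity, order):
    print(row)
```

After reordering, the high similarity values gather into two blocks along the diagonal, making the two groups visible at a glance even though no value in the matrix has changed.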