Machine-learning technology is an important tool for dealing with large amounts of data. Such technology enables the construction of systems that can learn from a particular data set and, based on that learning, perform accurately on new, unseen data. Machine-learned models include classification models (such as binary classification models and multi-class classification models), entity extraction models, and ranking models. A binary classifier, for example, classifies items of data into one of two classes. A multi-class classifier is similar to a binary classifier, but classifies each item of data into one of several classes rather than one of two. To accomplish this, the classifier is provided with a set of training data, where each item of training data is labeled, either automatically or manually by a human operator, as belonging to one of the several classes. The classifier learns from this labeled training data and then, based on its learning, predicts which class each item belongs to by assigning each item a score for each class. That is, for each item evaluated, a probability score may be calculated for each available class. Each score reflects the probability, as assessed by the classifier, that the item belongs to the corresponding class. Thus, the score indicates a confidence level associated with the classifier's prediction.
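By way of illustration only, the per-class scoring described above can be sketched in Python. The sketch assumes a simple nearest-centroid classifier whose distances are converted into probability-like scores with a softmax; this technique, the function names `train_centroids` and `predict_scores`, and the toy data are all hypothetical choices made for the example and are not drawn from any particular system.

```python
import math
from collections import defaultdict

def train_centroids(items, labels):
    """Learn one centroid (mean feature vector) per class from labeled training data."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(items, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict_scores(centroids, x):
    """Return one score per class; the scores are non-negative and sum to 1."""
    # Closer centroids get higher scores: softmax over negative squared distance.
    neg_dist = {
        y: -sum((a - b) ** 2 for a, b in zip(x, c))
        for y, c in centroids.items()
    }
    m = max(neg_dist.values())  # subtract the max for numerical stability
    exps = {y: math.exp(d - m) for y, d in neg_dist.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}

# Hypothetical labeled training data with three classes.
training_items = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [0.0, 5.0], [0.1, 5.2]]
training_labels = ["a", "a", "b", "b", "c", "c"]

centroids = train_centroids(training_items, training_labels)
scores = predict_scores(centroids, [0.1, 0.2])   # one score per available class
predicted = max(scores, key=scores.get)          # class with the highest score
```

Here `scores` holds one value per available class for the new item, and the highest value can be read as the classifier's confidence in its predicted class.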
An entity extraction model locates items of data and classifies them into predefined categories; for example, such a model may locate and classify the names of people in a textual document. A ranking model assigns a score to each item in a set of items of data for the purpose of sorting those items; one example is a model used to rank search results in a web search engine. In order to improve and refine any of these or other machine-learned models, it is important that a user be able to assess how well the machine-learned model is performing.
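As a rough illustration of the ranking behavior described above, the following Python sketch scores each item and sorts the items by score. The scoring function here is a hand-written term-overlap count standing in for a learned model; it, the names `rank_items` and `term_overlap_score`, and the sample strings are assumptions made purely for the example.

```python
def term_overlap_score(query):
    """Build a stand-in scoring function: count of query terms found in an item."""
    terms = set(query.lower().split())

    def score(item):
        return len(terms & set(item.lower().split()))

    return score

def rank_items(items, score_fn):
    """Sort items from highest score to lowest."""
    return sorted(items, key=score_fn, reverse=True)

# Hypothetical search results to be ranked for a hypothetical query.
results = [
    "training a binary classifier",
    "cooking pasta at home",
    "binary classifier scores and training data",
]
ranked = rank_items(results, term_overlap_score("binary classifier scores"))
```

In a machine-learned ranking model, the score for each item would come from the trained model rather than from a fixed rule such as the one assumed here.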