Advances in machine learning have enabled computing devices to solve complex problems in many fields. For example, image analysis (e.g., face recognition) and natural language processing, among other fields, have benefitted from the use of machine learning techniques.
In some machine learning techniques, supervised training data (e.g., data with a known characteristic, also referred to as a label) is provided as input to train a machine learning process to generate a data model. The data model may take one of many forms, such as a decision tree, a neural network, or a support vector machine. For example, to train or optimize a neural network using back propagation, data is provided as input to the neural network, and an output of the neural network is compared to a label associated with the data. A difference between the output of the neural network and the label is used to calculate a value of an error function, and that value is used to modify the neural network (e.g., by changing link weights between nodes of the neural network) with a goal of decreasing the difference between the output that the neural network generates for particular data and the label associated with the particular data. Accordingly, after the neural network is trained, the neural network may be expected to generate reliable results based on the supervised training data.
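As a non-limiting illustration of the back propagation training described above, the following Python sketch trains a small neural network on labeled data. The network size (two inputs, three hidden nodes, one output), the sigmoid activation, the squared-error function, the learning rate, and the logical-OR training set are all assumptions chosen for brevity, and names such as `init_weights`, `forward`, and `train` are hypothetical rather than taken from any particular library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(n_inputs, n_hidden, seed=0):
    """Create small random link weights (hypothetical layout, for illustration)."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [0.0] * n_hidden,
        "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


def forward(weights, inputs):
    """Return (hidden activations, network output) for one input vector."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights["w_hidden"], weights["b_hidden"])
    ]
    output = sigmoid(
        sum(w * h for w, h in zip(weights["w_out"], hidden)) + weights["b_out"]
    )
    return hidden, output


def mean_squared_error(weights, dataset):
    return sum((forward(weights, x)[1] - label) ** 2
               for x, label in dataset) / len(dataset)


def train(weights, dataset, epochs=2000, learning_rate=0.5):
    """Adjust link weights to shrink the difference between output and label."""
    for _ in range(epochs):
        for inputs, label in dataset:
            hidden, output = forward(weights, inputs)
            # Difference between the network output and the label.
            error = output - label
            # Gradient of 0.5 * error**2 at the output node (sigmoid derivative).
            delta_out = error * output * (1.0 - output)
            # Propagate the error back to each hidden node before any update.
            delta_hidden = [delta_out * w * h * (1.0 - h)
                            for w, h in zip(weights["w_out"], hidden)]
            # Gradient-descent update of output-layer weights and bias.
            for j, h in enumerate(hidden):
                weights["w_out"][j] -= learning_rate * delta_out * h
            weights["b_out"] -= learning_rate * delta_out
            # Gradient-descent update of hidden-layer weights and biases.
            for j, d in enumerate(delta_hidden):
                for i, x in enumerate(inputs):
                    weights["w_hidden"][j][i] -= learning_rate * d * x
                weights["b_hidden"][j] -= learning_rate * d
    return weights


# Supervised training data: (input, label) pairs for logical OR (an assumption).
dataset = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights = init_weights(n_inputs=2, n_hidden=3)
error_before = mean_squared_error(weights, dataset)
train(weights, dataset)
error_after = mean_squared_error(weights, dataset)
print(f"error before training: {error_before:.4f}")
print(f"error after training:  {error_after:.4f}")
```

In this sketch, the trained behavior resides entirely in the numeric values stored in `weights`; the error on the training set decreases, but the resulting numbers do not by themselves state a rule relating input to output.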
However, neural networks and other types of data models generally do not describe a human-understandable relationship between the input data and the output data. Stated another way, it is generally not clear, from a human perspective, why a neural network would be expected to produce a reliable result. Accordingly, there is sometimes concern about the reliability of machine learning data models, since a human viewing a machine learning data model may be unable to discern a pattern or logical reason why the data model generates a particular output based on a particular input.