An automated speech recognizer may be trained using training data that includes audio recordings of utterances and corresponding transcriptions of the utterances. For example, training data may include an audio recording of the utterance “AUTOMATION” and a corresponding transcription of “AUTOMATION.” The quality of recognition by a trained automated speech recognizer may depend on the quality of the training data used to train it. For example, an automated speech recognizer that is trained using training data that contains incorrect transcriptions or biases may produce similarly incorrect or biased transcriptions.
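
As a rough illustration of the pairing described above, training data might be represented as a list of recording and transcription pairs, and its quality might be spot-checked against a small set of trusted transcriptions. The following is a minimal sketch only: the names `TrainingExample`, `mismatch_rate`, and `verified`, the file names, and the case-insensitive comparison are all assumptions made for illustration and are not taken from any particular recognizer or toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingExample:
    """One (audio, transcription) pair; field names are illustrative."""
    audio_path: str      # hypothetical path to the recorded utterance
    transcription: str   # text paired with that recording


def mismatch_rate(examples, verified):
    """Fraction of checked examples whose transcription differs from a trusted one.

    `verified` maps audio_path -> trusted transcription (an assumption of this
    sketch). Examples without a verified entry are skipped.
    """
    checked = [e for e in examples if e.audio_path in verified]
    if not checked:
        return 0.0
    wrong = sum(
        1 for e in checked
        # Compare ignoring case and surrounding whitespace.
        if e.transcription.strip().upper() != verified[e.audio_path].strip().upper()
    )
    return wrong / len(checked)


examples = [
    TrainingExample("utt_001.wav", "AUTOMATION"),
    TrainingExample("utt_002.wav", "AUTOMATON"),  # deliberately incorrect label
]
verified = {"utt_001.wav": "AUTOMATION", "utt_002.wav": "AUTOMATION"}
rate = mismatch_rate(examples, verified)
```

In this toy data one of the two checked transcriptions is wrong, so half of the checked examples would teach a recognizer the wrong text for that recording. A check like this only detects transcription errors; it does not detect biases in which utterances or speakers the training data covers.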