Automatic speech recognition technology typically utilizes a corpus to convert speech data into text data. A corpus is a database of speech audio files and text transcriptions of those audio files in a format that can be used to form acoustic models. One way to improve an acoustic model is to provide a larger corpus. Conventionally, however, very large amounts of correctly transcribed audio data are not available, or may be very expensive to produce. Large amounts of imperfectly transcribed audio are available, for example, in the form of closed captioning for television programs, but these sources usually contain transcription errors. Use of these imperfect transcription corpora can lead to suboptimal acoustic models. Consequently, techniques to permit the reliable use of imperfect transcription sources are desirable. It is with respect to these and other considerations that the present improvements have been needed.