Automatic speech recognition (ASR) technology maps audio utterances to textual representations of those utterances. In some systems, ASR involves comparing characteristics of an audio utterance to an acoustic model of the human voice. However, different speakers may exhibit different speech characteristics (e.g., pitch, accent, and tempo). Consequently, a single acoustic model may not perform equally well for all speakers.
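
As a rough illustration of the comparison described above, the following toy sketch scores a feature vector against a small acoustic model. Everything in it is an assumption made for illustration only: the two-dimensional features, the sound-unit labels "aa" and "iy", the diagonal-Gaussian form of the model, and the names `ACOUSTIC_MODEL`, `log_likelihood`, and `classify`. It is not the acoustic model of any particular system.

```python
import math

# Hypothetical acoustic model: one diagonal Gaussian (mean, variance)
# per sound unit, over a made-up 2-dimensional feature space.
ACOUSTIC_MODEL = {
    "aa": ([1.0, 0.0], [0.5, 0.5]),
    "iy": ([-1.0, 1.0], [0.5, 0.5]),
}


def log_likelihood(frame, mean, var):
    """Log-density of one feature frame under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )


def classify(frame):
    """Return the best-matching sound unit and its log-likelihood."""
    scores = {
        unit: log_likelihood(frame, mean, var)
        for unit, (mean, var) in ACOUSTIC_MODEL.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# A frame close to what the model expects for "aa".
typical_frame = [0.9, 0.1]
# The same sound with an assumed speaker-specific offset in the features.
shifted_frame = [0.9 + 1.5, 0.1 + 1.5]

typical_unit, typical_score = classify(typical_frame)
shifted_unit, shifted_score = classify(shifted_frame)
print(typical_unit, typical_score)
print(shifted_unit, shifted_score)
```

In this sketch both frames are still matched to the same unit, but the shifted frame fits the model less well (its log-likelihood is lower). With larger speaker-specific offsets, the best match could change entirely, which is one way a single model may underperform for some speakers.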