Vocal utterances may include a hotword, i.e., a predetermined reserved word that causes a system to perform a corresponding action or actions. A speech recognition service on an electronic device generally receives vocal utterances that include words spoken by a user and transcribes the spoken words into text. To accomplish this, the speech recognition service may attempt to match the sounds of the spoken input with phonetic representations of textual words.
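
By way of a non-limiting illustration, the matching of spoken sounds against phonetic representations of textual words, followed by a check for a hotword, may be sketched as below. The sketch assumes the audio has already been converted into a sequence of phoneme symbols. The lexicon entries, the hotword "computer", the action name `wake_assistant`, and the function names `transcribe` and `find_hotword_action` are all hypothetical and chosen only for illustration. A practical speech recognition service would typically use probabilistic acoustic and language models rather than the exact matching shown here.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical pronunciation lexicon: textual word -> phonetic representation.
# The phoneme symbols are ARPAbet-like and purely illustrative.
LEXICON: Dict[str, Tuple[str, ...]] = {
    "okay": ("OW", "K", "EY"),
    "computer": ("K", "AH", "M", "P", "Y", "UW", "T", "ER"),
    "play": ("P", "L", "EY"),
    "music": ("M", "Y", "UW", "Z", "IH", "K"),
}

# Hypothetical mapping of reserved hotwords to the action each one triggers.
HOTWORD_ACTIONS: Dict[str, str] = {"computer": "wake_assistant"}


def transcribe(phonemes: List[str]) -> List[str]:
    """Greedily match the phoneme stream against the lexicon, longest word first."""
    words: List[str] = []
    i = 0
    while i < len(phonemes):
        best: Optional[str] = None
        for word, pron in LEXICON.items():
            n = len(pron)
            matches = tuple(phonemes[i:i + n]) == pron
            if matches and (best is None or n > len(LEXICON[best])):
                best = word
        if best is None:
            i += 1  # no word starts here; skip this sound
        else:
            words.append(best)
            i += len(LEXICON[best])
    return words


def find_hotword_action(words: List[str]) -> Optional[str]:
    """Return the action for the first hotword in the transcript, if any."""
    for word in words:
        if word in HOTWORD_ACTIONS:
            return HOTWORD_ACTIONS[word]
    return None
```

For example, passing the concatenated phonemes for "okay computer play music" to `transcribe` yields those four words as text, and `find_hotword_action` then returns the action associated with the hotword "computer". A transcript with no hotword yields no action.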