Automatic speech recognition (ASR) allows a computing device to understand human speech, which enables voice-to-text transcription and voice commands, among other functions. In real-world situations, especially when far-field microphones are used, overlapping speech from multiple speakers (i.e., people talking simultaneously) or speech mixed with noise or music can decrease speech recognition accuracy. This is because, when a far-field microphone is used, the energy of the competing speakers, as measured at the microphone, can be very close to the energy of the speech from the target speaker. This is very different from the close-talk microphone scenario, where the target speaker speaks much closer to the microphone than the other speakers do. In the close-talk microphone scenario, the target-to-masking ratio (TMR) is typically quite high, and high fidelity can be preserved in the captured speech. ASR can use language models to determine plausible word sequences for a given language, taking the output of audio processing as input.
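To make the TMR contrast concrete, the following is a minimal sketch that computes the ratio in decibels for two synthetic signals. It assumes the common energy-ratio definition, TMR = 10·log10(E_target / E_masker), which the text above does not spell out. The function names `energy` and `tmr_db`, the sine-wave test signals, and the 0.1 attenuation factor standing in for a distant competing speaker are all hypothetical choices made for illustration.

```python
import math


def energy(samples):
    """Return the signal energy as the sum of squared sample values."""
    return sum(s * s for s in samples)


def tmr_db(target, masker):
    """Return the target-to-masking ratio in dB.

    Assumes the energy-ratio definition 10 * log10(E_target / E_masker).
    """
    e_target = energy(target)
    e_masker = energy(masker)
    if e_target <= 0 or e_masker <= 0:
        raise ValueError("both signals must have non-zero energy")
    return 10.0 * math.log10(e_target / e_masker)


# Synthetic one-second signals at an assumed 16 kHz sampling rate.
RATE = 16000
target = [math.sin(2 * math.pi * 220 * n / RATE) for n in range(RATE)]
competitor = [math.sin(2 * math.pi * 330 * n / RATE) for n in range(RATE)]

# Far-field case: the competing speaker arrives with roughly the same energy.
far_field_tmr = tmr_db(target, competitor)

# Close-talk case: the competing speaker is attenuated (hypothetical factor 0.1).
close_talk_tmr = tmr_db(target, [0.1 * s for s in competitor])

print(f"far-field TMR:  {far_field_tmr:.2f} dB")
print(f"close-talk TMR: {close_talk_tmr:.2f} dB")
```

In this sketch the far-field ratio sits near 0 dB because both signals carry about the same energy, while attenuating the competitor raises the ratio substantially, which mirrors the far-field versus close-talk distinction described above.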