Speech-to-text (STT) systems and methods that produce text output based on audio input are known in the art. To convert speech to text, STT systems use dictionaries, each of which has a finite vocabulary size. One of the problems faced by STT systems is balancing the out-of-vocabulary (OOV) error rate, the word error rate, and the performance of the STT system. Generally, using a large vocabulary may reduce the OOV rate but may also degrade system performance and increase the word error rate (e.g., because there are more confusable words to choose from), while using a small vocabulary typically improves system performance but increases the OOV error rate.
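By way of illustration only, the OOV side of this trade-off can be sketched in a few lines of Python. The sketch assumes that the OOV rate is measured as the fraction of word tokens in a reference text that are absent from the dictionary, and that a dictionary of a given size is formed from the most frequent words of some training text. The function names `build_vocabulary` and `oov_rate` and the toy sentences are hypothetical and are not taken from any particular STT system; the word error rate and system performance are not modeled.

```python
from collections import Counter


def build_vocabulary(training_tokens, size):
    """Return the `size` most frequent training tokens as a set (hypothetical helper)."""
    counts = Counter(training_tokens)
    return {word for word, _ in counts.most_common(size)}


def oov_rate(vocabulary, test_tokens):
    """Fraction of test tokens that are not in the vocabulary (assumed OOV definition)."""
    if not test_tokens:
        return 0.0
    missing = sum(1 for token in test_tokens if token not in vocabulary)
    return missing / len(test_tokens)


# Toy data, for illustration only.
training = "the cat sat on the mat the dog sat on the log".split()
test = "the cat sat on the log with the dog".split()

# A larger vocabulary covers more of the test text, so the OOV rate falls.
for size in (1, 3, 7):
    vocab = build_vocabulary(training, size)
    print(f"vocabulary size {size}: OOV rate {oov_rate(vocab, test):.2f}")
```

In this toy example the OOV rate decreases as the vocabulary grows, but it does not reach zero, since the word "with" never occurs in the training text.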