1. Technical Field
The present disclosure relates to generating speech, acoustic, and/or language models, and more specifically to streamlining model development for fast turnaround and minimal human involvement.
2. Introduction
In the automatic speech recognition (ASR) industry, the process of generating and tuning speech models, acoustic models, and/or language models is very labor intensive and time consuming. The process requires the efforts of many individuals to perform the various steps and iterations. Not only must humans perform large portions of the work, but the process also requires human decision making at several steps along the way. From start to finish, generating and tuning a speech model, for example, can take many days or weeks. The lengthy turnaround time and significant human involvement impose significant cost on the development of new speech recognition systems and prevent the rapid deployment of new systems.