1. Technical Field
The present disclosure relates to automatic speech recognition and, in particular, to automatic speech recognition across different applications or environments.
2. Introduction
Over the past five decades, researchers and developers have created tools and algorithms to enable rapid development of acoustic and language models that support domain-specific speech recognition applications. These applications rely on speech recognition models. Often, a generic speech model is used to recognize speech from multiple users. Similarly, current systems capable of performing speech recognition across different applications or environments rely on generic speech models. Because speech recognizers depend significantly on the distribution of words and phrases, which differs from one application or environment to another, such systems typically fall short: they provide generality at the expense of recognition performance.
Moreover, these systems are very costly to develop. For example, a team of 3-6 people may take 3-6 months to develop a single speech application. In addition, known models for performing speech recognition across different applications or environments necessarily require a high volume of data. Disadvantageously, these systems are created by combining all potentially available data into a single system. The increased volume of data requires intensive processing and causes out-of-memory problems. As a result, these systems are costly and hard to scale.