Embedded noise-robust automatic speech recognition (ASR) systems need to conserve memory because of the small size and limited resources of devices such as cell phones, car navigation systems, digital TVs, and home appliances. However, ASR systems are notorious for consuming large amounts of computational resources, including random access memory (RAM). This tendency is especially problematic in embedded devices, which must also allocate such resources to other functions that often run concurrently with ASR functions. Yet reducing the amount of memory consumed by a noise-robust ASR system heavily impacts recognition accuracy and/or robustness to noise.
Referring to FIG. 1, model domain methods aim to improve the performance of pattern matching by modifying the acoustic models so that they are adapted to the current noise level, while leaving the input signal 100 unchanged. In particular, a noise estimation module 104 estimates the noise in the input signal 100, and a model compensation module 106 adjusts the acoustic models 108 based on these noise estimates. Then, features extracted from the unmodified input signal 100 by a feature extraction module 102 are matched against the adjusted acoustic models 108 by a pattern matching module 110 to achieve recognition 112.
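By way of illustration only, the data flow of FIG. 1 can be sketched in a few lines of Python. The sketch is not the disclosed implementation and rests on several assumptions that do not come from the description above: each frame is reduced to a single log-energy feature, the leading frames of the input are assumed to contain noise only, each acoustic model is represented by one clean log-energy mean, and model compensation uses a simple log-add approximation. All function names and the parameter `n_lead` are hypothetical.

```python
import math


def extract_features(frames):
    """Feature extraction (102): one log-energy value per frame (assumed feature)."""
    return [math.log(sum(s * s for s in frame) + 1e-10) for frame in frames]


def estimate_noise(frames, n_lead=2):
    """Noise estimation (104): mean log energy of the first n_lead frames,
    which this sketch assumes to be speech-free."""
    lead = extract_features(frames[:n_lead])
    return sum(lead) / len(lead)


def compensate_models(clean_means, noise_log_energy):
    """Model compensation (106): shift each clean model mean toward the noisy
    condition with a log-add approximation, log(exp(mu) + exp(noise))."""
    return {
        label: math.log(math.exp(mu) + math.exp(noise_log_energy))
        for label, mu in clean_means.items()
    }


def pattern_match(features, models):
    """Pattern matching (110): pick the model whose mean is closest to the
    average observed feature (a stand-in for a likelihood computation)."""
    avg = sum(features) / len(features)
    return min(models, key=lambda label: (models[label] - avg) ** 2)


def recognize(frames, clean_means, n_lead=2):
    """Model domain recognition: the models are adapted, the input is not."""
    noise = estimate_noise(frames, n_lead)
    adapted = compensate_models(clean_means, noise)
    features = extract_features(frames[n_lead:])  # unmodified input signal
    return pattern_match(features, adapted)
```

For example, with hypothetical clean models `{"quiet_word": math.log(1.0), "loud_word": math.log(8.0)}`, two leading noise frames `[1.0, 1.0, 0.0]` and three following frames `[1.0, 1.0, 1.0]`, calling `recognize` selects `"quiet_word"`, whereas matching the same features against the uncompensated clean models selects `"loud_word"`. The input features are identical in both cases; only the models differ, which is the defining property of a model domain method.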
What is needed is a way to reduce the memory requirements of embedded noise-robust ASR systems with reduced impact on recognition accuracy and/or robustness to noise. The present invention fulfills this need by making several changes to a noise robustness system employing a model domain method.