This invention relates to speech recognition and more particularly to minimization of a search network.
Continuous speech recognition is a resource-intensive task. Commercial dictation software requires more than 10M bytes of disk space to install and 32M bytes of RAM to run. Static memory is required to store the program (algorithm), grammar, dictionary, and acoustic models. These data do not change and can therefore be stored on disk or in ROM. Dynamic memory is required to run the search. The search involves parsing the input speech and building a dynamically changing search tree; it therefore requires RAM, which supports both reading and writing. The target application of this patent disclosure is small-vocabulary speech recognition implemented on, for example, fixed-point DSP chips. In a DSP implementation, RAM size is a critical cost factor because RAM occupies a large percentage of the total die size. ROM, on the other hand, is much cheaper in silicon real estate.
In accordance with one embodiment of the present invention, during search network expansion in speech recognition, only the current slot is maintained; as long as the model and the state that the slot occupies are known, the previous slots can be discarded.
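The idea can be illustrated with a minimal sketch, which is not the disclosed implementation. It assumes left-to-right models in which each state may either repeat or advance to the next state. The names `Slot`, `MODEL_NUM_STATES`, `expand`, and `step`, the model labels, and the `next_models` table are all hypothetical. Each slot records only its model, its state, and an accumulated score. Once its successors have been generated, the slot itself is no longer referenced. The sketch does not address how the recognized word sequence is recovered.

```python
from dataclasses import dataclass

# Hypothetical static data (could reside in ROM): number of states per model.
MODEL_NUM_STATES = {"s": 3, "ih": 3, "k": 3}


@dataclass(frozen=True)
class Slot:
    model: str    # which model the path is currently in
    state: int    # which state of that model
    score: float  # accumulated path score (log domain)


def expand(slot, next_models):
    """Generate successor slots using only the slot's model and state."""
    successors = [Slot(slot.model, slot.state, slot.score)]  # self-loop
    if slot.state + 1 < MODEL_NUM_STATES[slot.model]:
        # Advance to the next state of the same model.
        successors.append(Slot(slot.model, slot.state + 1, slot.score))
    else:
        # Last state: enter state 0 of each model that may follow.
        for m in next_models.get(slot.model, []):
            successors.append(Slot(m, 0, slot.score))
    return successors


def step(current, next_models, frame_score):
    """Advance one frame; the returned list replaces `current` entirely."""
    best = {}
    for slot in current:
        for s in expand(slot, next_models):
            scored = Slot(s.model, s.state, s.score + frame_score(s.model, s.state))
            key = (scored.model, scored.state)
            # Keep only the best-scoring path into each (model, state).
            if key not in best or scored.score > best[key].score:
                best[key] = scored
    return list(best.values())


if __name__ == "__main__":
    next_models = {"s": ["ih"], "ih": ["k"]}  # hypothetical network
    slots = [Slot("s", 0, 0.0)]
    for _ in range(3):
        # The previous frame's slots are dropped when `slots` is rebound.
        slots = step(slots, next_models, lambda model, state: 0.0)
    print(sorted((s.model, s.state) for s in slots))
```

In this sketch the dynamic storage at any frame is the single list `slots`, while the model topology stays in static tables, in keeping with the RAM versus ROM trade-off described above.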