1. Field of the Invention
The present invention relates to the field of speech recognition systems and, more particularly, to a solution that integrates voice enrollment with other recognition grammars using a layered grammar stack.
2. Description of the Related Art
Voice enrollment permits users to dynamically add phrases to a voice enrolled grammar at runtime. A user-provided, enrolled phrase is typically added to a special dynamic grammar (e.g., a voice enrolled grammar), which may contain other user-provided phrases. During enrollment, a user is typically required to repeat a new phrase multiple times until a level of consistency is achieved. Voice enrollment is generally handled by a component/entity/process separate from that used for other (i.e., non-enrollment) types of speech recognition. For example, voice enrollment functionality provided by a speech processing system is often accessed via a special application program interface (API) dedicated to that functionality.
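By way of illustration only, the repeat-until-consistent enrollment behavior described above can be sketched as follows. The sketch assumes each repetition has already been decoded into a phoneme sequence, and it treats "a level of consistency" as the same sequence being observed a required number of times. The names enroll_phrase and VoiceEnrolledGrammar, the thresholds, and the phoneme labels are hypothetical and are not taken from any particular speech processing system.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Sequence, Tuple

Pronunciation = Tuple[str, ...]


def enroll_phrase(
    attempts: Iterable[Sequence[str]],
    required_matches: int = 2,   # hypothetical consistency threshold
    max_attempts: int = 5,       # hypothetical limit on repetitions
) -> Optional[Pronunciation]:
    """Return a pronunciation once it has been repeated consistently, else None."""
    counts: Counter = Counter()
    for attempt_number, phonemes in enumerate(attempts, start=1):
        if attempt_number > max_attempts:
            break
        candidate = tuple(phonemes)
        counts[candidate] += 1
        # Consistency is assumed to mean an exact repeat of the decoded sequence.
        if counts[candidate] >= required_matches:
            return candidate
    return None


class VoiceEnrolledGrammar:
    """Hypothetical dynamic grammar holding user-provided phrases."""

    def __init__(self) -> None:
        self.phrases: Dict[str, Pronunciation] = {}

    def add(self, phrase_id: str, pronunciation: Pronunciation) -> None:
        self.phrases[phrase_id] = pronunciation


grammar = VoiceEnrolledGrammar()
result = enroll_phrase([
    ["k", "ao", "l", "m", "aa", "m"],
    ["k", "ao", "l", "m", "aa"],        # inconsistent repetition
    ["k", "ao", "l", "m", "aa", "m"],   # second consistent repetition
])
if result is not None:
    grammar.add("call_mom", result)
```

An actual system would likely compare repetitions acoustically or with a tolerance rather than requiring an exact match; the exact-match rule is used here only to keep the sketch short.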
In other words, a conventionally implemented speech processing system supporting voice enrollment has a normal decoding path and a separate path for voice enrollment. This separation adds overhead for voice enrollment specific functions and creates a potential for differing recognition results. For example, a voice enrollment component/entity/process can return one result set and a general speech recognition component/entity/process can return a different result set for the same input. Differing recognition results are presently handled using phonetic distance tables, which are used to perform phonetic comparisons for similarity and confusability. Because of the added overhead involved in supporting voice enrollment in a conventional speech recognition system, performance can degrade and/or relatively large quantities of computing resources can be consumed in the process of enrolling user phrases.
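For illustration, one plausible way a phonetic distance table could be used to compare two pronunciations is a weighted edit distance in which the table supplies the substitution cost between phoneme pairs. This is an assumption about how such a comparison might be made, not a description of any specific conventional system; the table entries, default costs, and confusability threshold below are hypothetical values.

```python
from typing import Dict, FrozenSet, Sequence

DistanceTable = Dict[FrozenSet[str], float]


def phonetic_distance(
    a: Sequence[str],
    b: Sequence[str],
    table: DistanceTable,
    default_substitution: float = 1.0,  # cost for pairs missing from the table
    insert_delete: float = 1.0,         # cost of adding or dropping a phoneme
) -> float:
    """Weighted edit distance between two phoneme sequences."""
    previous = [j * insert_delete for j in range(len(b) + 1)]
    for i, phoneme_a in enumerate(a, start=1):
        current = [i * insert_delete]
        for j, phoneme_b in enumerate(b, start=1):
            if phoneme_a == phoneme_b:
                substitution = 0.0
            else:
                pair = frozenset((phoneme_a, phoneme_b))
                substitution = table.get(pair, default_substitution)
            current.append(min(
                previous[j] + insert_delete,      # delete phoneme_a
                current[j - 1] + insert_delete,   # insert phoneme_b
                previous[j - 1] + substitution,   # substitute or match
            ))
        previous = current
    return previous[-1]


def is_confusable(a, b, table, threshold: float = 0.5) -> bool:
    """Treat two pronunciations as confusable when their distance is small."""
    return phonetic_distance(a, b, table) <= threshold


# Hypothetical table: acoustically similar phonemes get a low substitution cost.
table: DistanceTable = {frozenset(("m", "n")): 0.2}

mat_vs_nat = phonetic_distance(["m", "ae", "t"], ["n", "ae", "t"], table)
mat_vs_cat = phonetic_distance(["m", "ae", "t"], ["k", "ae", "t"], table)
```

Under these assumed values, the pair differing only in "m" versus "n" falls below the threshold and would be flagged as confusable, while the pair differing in "m" versus "k" would not.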