The present invention relates to the construction of grammars used in speech recognition. In particular, the invention relates to the construction of grammars in a binary format.
In speech recognition systems, a computer system attempts to identify a sequence of words from a speech signal. One way to improve the accuracy of the recognition is to limit the recognition to a set of selected phrases. This is typically done by limiting valid recognition hypotheses to phrases that are found in a context-free grammar (CFG).
One common method for describing phrases in a context-free grammar is to use a Recursive Transition Network (RTN) description. In an RTN, each word in a phrase is represented by a transition between two states. Multiple transitions can extend from a single state, allowing multiple phrases to be represented by a single RTN structure. For example, the phrase “go back” and the phrase “go forward” can be represented by a single RTN structure with a first transition extending between a first state and a second state to represent the word “go” and two parallel transitions extending between the second state and a third state to represent the words “back” and “forward”, respectively.
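The “go back” / “go forward” structure described above can be sketched as follows. This is a minimal illustration only: the dictionary layout, the state numbers, and the `accepts` helper are assumptions made for this example and are not taken from any particular grammar format.

```python
# Hypothetical in-memory RTN: each state maps to a list of
# (word, next_state) transitions. State numbers are illustrative.
TRANSITIONS = {
    1: [("go", 2)],
    2: [("back", 3), ("forward", 3)],  # two parallel transitions
}
START_STATE = 1
FINAL_STATE = 3


def accepts(words, state=START_STATE):
    """Return True if `words` follows a path from `state` to FINAL_STATE."""
    if not words:
        return state == FINAL_STATE
    # Try every transition leaving this state whose word matches.
    return any(
        accepts(words[1:], next_state)
        for word, next_state in TRANSITIONS.get(state, [])
        if word == words[0]
    )
```

With this sketch, `accepts(["go", "back"])` and `accepts(["go", "forward"])` both succeed, while `accepts(["go"])` fails because the path stops short of the final state.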
In the past, the binary version of the context-free grammar included a description of the RTN structures that explicitly recited each state and each transition. Since each description of a state or transition requires some amount of memory, each description adds to the size of the binary grammar.
In addition, binary grammars of the past generated records for each transition that included both the transition's position in the structure and the actual word or semantic tag associated with the transition. Because the words and tags are of variable lengths, prior art grammars had to make the records either a fixed size that was large enough to accommodate all possible words, or a variable size. If the records were made a fixed size, almost all of the transition records would include unused space, making the binary grammar wastefully large. If variable-length records were used, parsing the grammar to retrieve information became very difficult because the grammar had to be searched to find the right record.
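The trade-off between the two record styles can be made concrete with a small sketch. The field widths, the 16-byte word limit, and the helper names below are assumptions chosen for illustration; they do not describe any actual prior art format.

```python
import struct

# Illustrative transitions: (source state, destination state, word).
ARCS = [(1, 2, "go"), (2, 3, "back"), (2, 3, "forward")]

MAX_WORD = 16  # assumed upper bound on word length, in bytes
FIXED = struct.Struct(f"<HH{MAX_WORD}s")  # word field is null-padded

# Fixed-size records: every record occupies FIXED.size bytes.
fixed_buf = b"".join(FIXED.pack(s, d, w.encode("utf-8")) for s, d, w in ARCS)

# Variable-size records: states, a one-byte length, then the word bytes.
variable_buf = b"".join(
    struct.pack("<HHB", s, d, len(w.encode("utf-8"))) + w.encode("utf-8")
    for s, d, w in ARCS
)


def read_fixed(buf, index):
    """Jump straight to record `index`; no scanning is needed."""
    s, d, raw = FIXED.unpack_from(buf, index * FIXED.size)
    return s, d, raw.rstrip(b"\0").decode("utf-8")


def read_variable(buf, index):
    """Walk past every earlier record to locate record `index`."""
    offset = 0
    for _ in range(index):
        (length,) = struct.unpack_from("<B", buf, offset + 4)
        offset += 5 + length
    s, d, length = struct.unpack_from("<HHB", buf, offset)
    return s, d, buf[offset + 5 : offset + 5 + length].decode("utf-8")
```

In this sketch the fixed-size buffer is larger than the variable-size one because of padding, but only the fixed-size buffer supports direct indexing; the variable-size reader must scan from the start.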
There is also a need for a binary grammar that includes several different types of records that reference one another in such a way that if the binary grammar were loaded into memory, the references could be used directly to retrieve desired information without having to first resolve one or more pointers.
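One way to picture records that reference one another without pointer resolution is to store offsets rather than addresses. The sketch below is a hypothetical layout invented for this illustration, with assumed names (`blob`, `REC`, `transition_at`) and assumed field widths: fixed-size transition records hold a byte offset into a separate blob of null-terminated words, so the offset remains valid wherever the data is loaded.

```python
import struct

# Hypothetical word blob: null-terminated words stored back to back.
blob = bytearray()
word_offset = {}
for word in ["go", "back", "forward"]:
    word_offset[word] = len(blob)
    blob += word.encode("utf-8") + b"\0"

# Hypothetical fixed-size record: source state, destination state,
# and the byte offset of the word inside `blob`.
REC = struct.Struct("<HHI")
records = b"".join(
    REC.pack(src, dst, word_offset[word])
    for src, dst, word in [(1, 2, "go"), (2, 3, "back"), (2, 3, "forward")]
)


def transition_at(index):
    """Read record `index` and follow its offset directly into the blob."""
    src, dst, offset = REC.unpack_from(records, index * REC.size)
    end = blob.index(b"\0", offset)  # the word ends at the next null byte
    return src, dst, blob[offset:end].decode("utf-8")
```

Because the records have a fixed size and refer to the words by offset, the loaded bytes can be used as they are: no address fix-up pass is needed, and the variable-length words no longer force a choice between padding and scanning.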