Statistical language models (SLMs) capture regularities of natural language for the purpose of improving the performance of various Natural Language Understanding (NLU) systems. An SLM attempts to provide a probability distribution over various linguistic units such as words, sentences, and the like. Some varieties of SLMs utilize embedded grammars to help process speech-recognized text. An embedded grammar, in general, is a grammar that is referenced and used by the SLM in processing speech-recognized text. Embedded grammars typically are crafted for a limited purpose, e.g., identifying numbers, cities, restaurants, etc.
In some cases, embedded grammars within an NLU system overlap with one another. That is, two or more of the embedded grammars specify one or more of the same values. For example, an embedded grammar for states can be said to be “overlapping” with an embedded grammar for cities if both include the value, or member, “New York”. Another example is where two or more grammars include numerical values. An embedded grammar for temperatures, for instance, will likely include one or more of the same numerical values as an embedded grammar for humidity level.
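The notion of overlap described above can be illustrated with a minimal sketch in which each embedded grammar is modeled simply as a set of accepted values. The grammar names, their members, and the function `find_overlaps` are assumptions made for illustration only; real embedded grammars are usually rule-based rather than flat lists.

```python
from itertools import combinations

# Hypothetical embedded grammars, each modeled as the set of values it accepts.
EMBEDDED_GRAMMARS = {
    "STATE": {"new york", "florida", "texas"},
    "CITY": {"new york", "boston", "miami"},
    "TEMPERATURE": {str(n) for n in range(40, 91)},
    "HUMIDITY": {str(n) for n in range(0, 101)},
}


def find_overlaps(grammars):
    """Return {(name_a, name_b): shared_values} for every overlapping pair."""
    overlaps = {}
    for name_a, name_b in combinations(sorted(grammars), 2):
        shared = grammars[name_a] & grammars[name_b]
        if shared:  # keep only pairs that share at least one value
            overlaps[(name_a, name_b)] = shared
    return overlaps


if __name__ == "__main__":
    for pair, shared in find_overlaps(EMBEDDED_GRAMMARS).items():
        print(pair, "share", len(shared), "value(s)")
```

Under these assumed grammars, CITY and STATE overlap on “new york”, and HUMIDITY and TEMPERATURE overlap on every number that both accept, including “50”.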
Output from the SLM, which typically includes information such as the speech-recognized text and references to particular embedded grammars, can be provided to an action classifier. The action classifier creates a list of one or more possible tasks, or actions, based upon the received input. The actions or tasks represent an interpretation of what it is that the user wishes to do. In determining the action, the action classifier is biased, at least in part, according to the list of embedded grammars received from the SLM with the speech-recognized text.
Consider the sentence “make temperature 50 degrees” and the sentence “set humidity at 50 percent”. The SLM would provide the following as output for the respective sentences: “make temperature TEMPERATURE” and “set humidity at HUMIDITY percent”. In this case, the SLM provides the correct disambiguation between TEMPERATURE and HUMIDITY for the word “50”. The search tree of the SLM typically has a depth of 2 or 3 words. This means that the particular words that cause the SLM to disambiguate to the correct embedded grammar must be 1 or 2 words away from the target word, which is “50” in this case. Here, the tokens “temperature” and “humidity” fall within that range and allow the SLM to determine the proper context for each embedded grammar.
If the surrounding word context is not sufficient, the SLM may incorrectly disambiguate the overlapping embedded grammars. Consider the sentence “set the temperature value to 50”, which, when processed, can result in an ambiguous situation. The surrounding 3-word context “value to 50” does not help the SLM determine whether “50” should be resolved to the TEMPERATURE or the HUMIDITY embedded grammar. A similar situation arises in the sentence “set humidity level to 50”. The surrounding 3-word context “level to 50” does not help to disambiguate whether “50” is a TEMPERATURE or a HUMIDITY.
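The effect of a limited context window can be sketched as follows. This is not an actual SLM; it is a simplified stand-in that looks only for cue words within a fixed distance of the target word. The cue table `CUES`, the function `label_value`, and the default window of 2 words are assumptions chosen to mirror the examples above.

```python
# Hypothetical cue words, each pointing toward one embedded grammar.
CUES = {
    "temperature": "TEMPERATURE",
    "degrees": "TEMPERATURE",
    "humidity": "HUMIDITY",
    "percent": "HUMIDITY",
}


def label_value(words, index, window=2):
    """Label words[index] using cue words at most `window` positions away.

    Returns a grammar name, or None when the window contains no cue
    (or conflicting cues), i.e. when the context is insufficient.
    """
    lo = max(0, index - window)
    hi = min(len(words), index + window + 1)
    found = {CUES[w] for w in words[lo:hi] if w in CUES}
    return found.pop() if len(found) == 1 else None


if __name__ == "__main__":
    for sentence in (
        "make temperature 50 degrees",
        "set humidity at 50 percent",
        "set the temperature value to 50",
        "set humidity level to 50",
    ):
        words = sentence.split()
        print(sentence, "->", label_value(words, words.index("50")))
```

In this sketch the first two sentences resolve to TEMPERATURE and HUMIDITY respectively, while the last two return `None` because the only cue word lies more than 2 words away from “50”. A real SLM would still have to pick one grammar in that situation, which is where the incorrect disambiguation can arise.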
If the SLM makes an error in disambiguating the name of an embedded grammar, there is a reasonable probability that the action classifier will also produce an incorrect task or list of tasks. For example, the sentence “set the temperature value to 50” may be incorrectly interpreted by the SLM so that “50” is labeled HUMIDITY. The input to the action classifier would be, in that case, “set the temperature value to HUMIDITY”, where the word “HUMIDITY” would bias the action classifier significantly toward an incorrect action dealing with humidity, such as “SET_HUMIDITY” rather than the correct action “SET_TEMPERATURE”. Thus, in some cases, the incorrect embedded grammar can bias the action classifier toward an incorrect result.
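The biasing effect described above can be illustrated with a toy scorer in which each token votes for an action and embedded grammar names carry more weight than ordinary words. The weight table `WEIGHTS`, its values, and the function `classify` are hypothetical; an actual action classifier would typically be a trained statistical model, not a hand-written table.

```python
# Hypothetical feature weights: each token votes for an action.
# Grammar names (upper case) are assumed to weigh more than plain words.
WEIGHTS = {
    "temperature": {"SET_TEMPERATURE": 1.0},
    "humidity": {"SET_HUMIDITY": 1.0},
    "TEMPERATURE": {"SET_TEMPERATURE": 3.0},
    "HUMIDITY": {"SET_HUMIDITY": 3.0},
}


def classify(tokens):
    """Return (best_action, scores) for a tokenized SLM output."""
    scores = {"SET_TEMPERATURE": 0.0, "SET_HUMIDITY": 0.0}
    for token in tokens:
        for action, weight in WEIGHTS.get(token, {}).items():
            scores[action] += weight
    best_action = max(scores, key=scores.get)
    return best_action, scores


if __name__ == "__main__":
    correct = "set the temperature value to TEMPERATURE".split()
    mislabeled = "set the temperature value to HUMIDITY".split()
    print(classify(correct))
    print(classify(mislabeled))
```

With these assumed weights, the mislabeled input is classified as SET_HUMIDITY even though the word “temperature” is present, because the single HUMIDITY token outweighs it.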