1. Field of the Invention
The present invention relates to the field of computer science and, more particularly, to natural language understanding and interpretation.
2. Description of the Related Art
The goal of a natural language processing application is to convert language input used by humans into a language that a computer can understand, so that humans can communicate with computers in the same manner in which they communicate with each other. Conventional natural language processing applications generate a number of possible meanings, or a “top-n” list of possible meanings, for a language input, along with a probability for each. Sometimes the language input provider, often a computer user, is asked to confirm that the computer interpreted the language input correctly. The natural language processing application can also provide a selection mechanism by which the input provider can replace the predicted meaning, typically the possible meaning with the greatest probability, with a selected one of the possible meanings from the top-n list. Three common problems that cause natural language processing applications to provide incorrect predicted meanings are mumble, ambiguous input, and compound input.
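The top-n list and selection mechanism described above can be pictured with a minimal sketch. The type name, function names, category labels, and probabilities below are hypothetical and chosen only for illustration; they are not taken from any particular natural language processing application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meaning:
    """One candidate interpretation of a language input (hypothetical type)."""
    category: str
    probability: float


def top_n(candidates, n):
    """Return the n most probable meanings, most probable first."""
    return sorted(candidates, key=lambda m: m.probability, reverse=True)[:n]


def predicted_meaning(ranked, selected_index=None):
    """Return the predicted meaning from a ranked top-n list.

    By default this is the most probable entry. If the input provider
    selects another entry from the list, that selection replaces it.
    """
    if not ranked:
        raise ValueError("no candidate meanings")
    return ranked[0 if selected_index is None else selected_index]


# Illustrative values only (assumed, not measured).
candidates = [
    Meaning("withdrawal", 0.48),
    Meaning("loan", 0.41),
    Meaning("balance_inquiry", 0.11),
]
ranked = top_n(candidates, 2)
default = predicted_meaning(ranked)                      # highest probability
corrected = predicted_meaning(ranked, selected_index=1)  # provider's choice
```

In this sketch the predicted meaning defaults to the entry with the greatest probability, and a selection by the input provider simply overrides that default.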
Input can be considered mumble (junk input, out-of-domain input, or rejected input) if the input is not capable of being interpreted correctly. That is, the input may not be well formed owing to grammatical, typographical, spelling, speech recognition, handwriting recognition, or other similar input recognition errors. Conventional natural language processing applications typically handle input misrecognition errors by validating words and phrases in the input using spelling and grammar checking tools. Mumble can also include input specified in unusual terminology or input outside the domain of the natural language interpreter. That is, the language input can include numerous ramblings and digressions that make it difficult for a natural language processing application to properly interpret the meaning of the input. Conventional natural language processing applications generally handle rambling by limiting language input to a closed set of possible interpretations, thereby limiting the capabilities of the natural language processing application. Left undetected, mumble can be a significant cause of incorrect natural language processing interpretations.
Ambiguous input is input that is capable of being interpreted in two or more possible ways. That is, ambiguous input can easily be mapped to two or more supported actions or possible interpretation categories. For example, an input to a banking system of “I need some money” can be an ambiguous input representing either a request for a loan or a request for a withdrawal from an existing account. The ambiguity inherent in human language, combined with the precision preferred for computers, can make detection and correction of ambiguous input an extremely difficult proposition. Conventional natural language processing applications generally fail to detect ambiguous situations, which is one reason why conventional applications provide a top-n list of meanings and/or continuously prompt users to confirm interpretations.
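One way to picture why the banking example is ambiguous is to compare the probabilities of the two most likely interpretation categories. The sketch below is an assumption made purely for illustration and is not attributed to any conventional application: the function name, the margin value, and all probabilities are hypothetical.

```python
def is_ambiguous(scores, margin=0.10):
    """Return True when the two best categories are within `margin`.

    `scores` maps a category name to a probability. The margin is an
    arbitrary illustrative choice, not a recommended setting.
    """
    ranked = sorted(scores.values(), reverse=True)
    if len(ranked) < 2:
        return False  # a single candidate cannot be ambiguous
    return (ranked[0] - ranked[1]) < margin


# Assumed probabilities for "I need some money".
money_scores = {"loan": 0.46, "withdrawal": 0.44, "help": 0.10}
# Assumed probabilities for a clearer request.
clear_scores = {"loan": 0.05, "withdrawal": 0.90, "help": 0.05}

money_flag = is_ambiguous(money_scores)
clear_flag = is_ambiguous(clear_scores)
```

With these assumed values, the loan and withdrawal categories are nearly tied for the first input, so simply reporting the most probable meaning would hide the fact that either reading is plausible.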
Compound input is input composed of a combination of two or more actions or interpretation categories within a single sentence or phrase. Correctly interpreting a compound input requires the natural language processing application to identify two or more different meanings from a single language input and join them into a single combined interpretation. For example, an input to a banking system of “I need help on withdrawing money from my account” can be a compound input. One valid meaning for the exemplary input is a request for help. A second valid meaning for the exemplary input is a request to withdraw money. Consequently, a combined compound interpretation of obtaining help on withdrawing money can be desired based upon the single language input. Conventional natural language processing applications generally interpret compound input as an input having just one meaning instead of a combination of two meanings.
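The difference between a single-meaning interpretation and a combined interpretation of the exemplary compound input can be sketched as follows. This is an illustrative assumption only: the function names, the per-category confidence values, and the threshold are hypothetical and do not describe any specific application.

```python
def single_interpretation(scores):
    """Keep only the best-scoring category (one meaning per input)."""
    return max(scores, key=scores.get)


def compound_interpretation(scores, threshold=0.30):
    """Return every category scoring at least `threshold`, best first.

    Scores here are independent per-category confidences, so they need
    not sum to 1. The threshold is an arbitrary illustrative value.
    """
    kept = [category for category, s in scores.items() if s >= threshold]
    return sorted(kept, key=scores.get, reverse=True)


# Assumed confidences for "I need help on withdrawing money from my account".
scores = {"help": 0.80, "withdrawal": 0.75, "loan": 0.05}

single = single_interpretation(scores)      # only one meaning survives
combined = compound_interpretation(scores)  # both valid meanings are kept
```

Under these assumed values the single-meaning approach reports only the help request, while the combined approach retains both the help request and the withdrawal request, which is the pairing needed for an interpretation of obtaining help on withdrawing money.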