1. Field of the Invention
The present invention relates to data processing and speech signal processing. More particularly, this invention relates to methods, systems and algorithms that provide a computer interface for converting users' natural language messages, which are entered into the interface via speech or handwriting recognition software or by the use of a keyboard, into a query or command that can be processed by a computer.
2. Description of the Related Art
The need for and advantages of an improved interface with a computer have long been recognized. Thus, many complex methods and algorithms for processing natural language have been developed.
There are two primary approaches being used today in natural language processing. The approach with the longest history is “parsing techniques.” In this approach, a user's speech input is processed in a manner very similar to the sentence diagramming that is often taught in grammar school. That is, the input is scanned for a possible subject, verb, direct object, etc. in an attempt to fit the input against a recognized structure in a rule-based table.
Natural language processors using this approach are characteristically very large and require a great deal of processing power to operate. Their strength is that they allow for the identification of unknown objects in an input. For example, given the user input “Get me the phone number for Bob,” a parsing system can very efficiently identify the action (“get”), the directionality (implied by “me”), the object (“the phone number”), and the preposition (“for”). This leaves the noun “Bob,” which a parsing system can easily use as an argument or parameter in searching a database without ever having seen this noun within the parsing tables. This is possible because parsing systems, through their extensive processing, can account for every understood element of the input and treat whatever remains as the unknown.
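By way of illustration only, the rule-table matching described above can be sketched as follows. This is a minimal sketch and not the implementation of any particular parsing system: the single-rule table `RULE`, its slot names, and the function `parse` are hypothetical and are introduced here solely for the example.

```python
# Hypothetical rule table: an ordered list of (slot, allowed phrases).
# A value of None marks an open slot that accepts whatever words remain.
RULE = [
    ("action", ["get"]),
    ("direction", ["me"]),
    ("object", ["the phone number"]),
    ("preposition", ["for"]),
    ("argument", None),
]


def parse(text):
    """Fit the input against RULE; return the filled slots or None."""
    raw = text.strip(" .?!").split()
    words = [w.lower() for w in raw]
    result = {}
    pos = 0
    for slot, phrases in RULE:
        if phrases is None:
            # Open slot: everything left over is the unknown argument.
            if pos >= len(words):
                return None
            result[slot] = " ".join(raw[pos:])
            pos = len(words)
            continue
        for phrase in phrases:
            tokens = phrase.split()
            if words[pos:pos + len(tokens)] == tokens:
                result[slot] = phrase
                pos += len(tokens)
                break
        else:
            # No allowed phrase matched at this position: the rule fails.
            return None
    return result if pos == len(words) else None


print(parse("Get me the phone number for Bob."))
```

In this sketch the noun “Bob” never appears in the table; it is recovered only because every other word has been accounted for by the rule.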
Unfortunately, the greatest weakness of parsing systems is their inefficient accommodation of users' variability of expression. There are literally thousands of ways that a user might ask for a phone number, and parsing systems are not very efficient at, or even capable of, accepting different users' unique expressions of a concept. As such systems accommodate more and more of the ways that a concept might be expressed, they increase markedly in size and in the processing power required for their operation. That any such system could accommodate the user input “Would you go ahead and get me the phone number for Bob from my contact list?” is highly unlikely. Moreover, because parsing systems are very much rule-based, it is difficult for such systems to process ungrammatical or incomplete messages.
Today, many companies offering speech recognition technology attempt to circumvent the “variability of expression” problem by employing subsets of language. In one approach, spoken input is analyzed for keywords. Keywords are then matched with application functionality. In another approach, application users are expected to learn a set of specific commands that are matched with application functionality.
Systems that analyze keywords typically operate with a much higher degree of understanding than parsing systems. For example, in this keyword approach, assume the user inputs “Would you go ahead and get me the phone number for Bob from my contact list?” The system ignores all words except “get”, “phone number”, “Bob”, and “contact list”.
This approach provides a very efficient system for skirting the “variability of expression” problem, but it does have serious limitations, one academic and one pragmatic. On the academic side, more complex inputs will baffle a keyword system. For example, if the user were to express him/herself as: “Rather than get me the phone number for Bob from the contact list, for now, I only need his address,” the keyword system probably would be unable to process this request.
On the pragmatic side, a possibly greater problem in this approach is that the proper noun “Bob” must be in one's keyword tables in order to be recognized. This implies two critical issues: first, that such keyword systems are only good for information retrieval, and second, that at some point it may not be possible to accommodate all such possible nouns in one's keyword tables.
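By way of illustration only, the keyword approach and the two limitations noted above can be sketched as follows. This is a minimal sketch under stated assumptions: the keyword table `KEYWORDS`, its role labels, and the function `spot_keywords` are hypothetical and do not represent any particular commercial system.

```python
import re

# Hypothetical keyword table mapping known phrases to application roles.
KEYWORDS = {
    "get": "action",
    "phone number": "object",
    "bob": "name",
    "contact list": "source",
}


def spot_keywords(text):
    """Return the roles whose keywords occur in the input; ignore all else."""
    lowered = text.lower()
    found = {}
    for phrase, role in KEYWORDS.items():
        # Whole-word match so that "get" does not fire inside "together".
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            found[role] = phrase
    return found


simple = spot_keywords(
    "Would you go ahead and get me the phone number for Bob "
    "from my contact list?"
)
complex_request = spot_keywords(
    "Rather than get me the phone number for Bob from the contact list, "
    "for now, I only need his address"
)
unknown_name = spot_keywords(
    "Get me the phone number for Alice from my contact list"
)

print(simple)
print(complex_request == simple)
print("name" in unknown_name)
```

In this sketch the more complex request yields exactly the same keywords as the simple one, so the user's actual intent (the address) is lost, and the request naming “Alice” yields no name at all because that noun is absent from the table.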
In light of these shortcomings in natural language processing, a third approach has also come into use. The “educating the user” approach often takes the form of a series of restrictions on the type of verbal inputs allowed, with the result that one is required to speak his/her way through computer menus. These techniques have been available in commercial products for years, but have not achieved widespread acceptance. Many potential users continue to refuse to accept them because they apparently view such restriction methods as being inefficient, overly time consuming and not sufficiently user friendly.
In general, it can be said that when we use language with a computer, we expect the same level of understanding from the computer as when we speak to another human. Correct understanding of natural language input, which is an easy task for a human being, is clearly not easily achieved by any of the current conventional approaches.
Thus, despite the above noted prior art, a need continues to exist for improved methods, systems and algorithms which provide for an interface with a computer. Specifically, an efficient method is needed that can accommodate all of the various ways that a user might express a concept in trying to communicate with a computer.