Natural language understanding (NLU) applications use computing machinery to process source texts written in natural language and produce actionable information from them. Typically, an NLU application processes one or more source texts, written in one or more natural languages, and generates actionable information in conjunction with a stored dataset of domain knowledge. General categories of NLU applications include machine translation, question answering, and automated summarization, among many others. Domain-specific examples include medical diagnosis systems, quantitative trading algorithms, and web search.
One early attempt at building an NLU application in the broad domain of commonsense reasoning was undertaken by the CYC project (Lenat et al., 1989). The goal of the CYC project was to construct a knowledge base of common sense facts that would enable an NLU system to take a typical desk encyclopedia as its source text and parse it into actionable knowledge. The project employed specifically trained technicians who manually entered the common sense facts. Despite the high expense of human effort required to construct the knowledge base, the project has not, to date, achieved its goal, which illustrates the difficulty of constructing complete knowledge bases by manual means.
Thus, many recent techniques and approaches for implementing NLU systems focus on either restricting the domain of the problem space or utilizing automatic means to derive various sorts of asserted or non-asserted relations. However, the actionable information produced by these conventional techniques is significantly less accurate and less complete than the information that human processing is capable of producing.
One approach to implementing practical NLU applications is to restrict the domain of the problem. This may involve restricting the scope of the source text or of the output in order to simplify both the types of information that are produced and the processing techniques required. For example, U.S. Pat. No. 5,721,938, entitled “Method and Device for Parsing and Analyzing Natural Language Sentences and Text”, teaches a method for parsing natural language source texts that categorizes words as either noun or verb units. The method is designed for the domain of grammar checker applications and is not suitable for implementing other, broader NLU applications.
Another approach to implementing practical NLU applications relies on approximate methods that generate output information falling short of full understanding. For example, a conventional system that translates a source text into another natural language by generating a literal translation will commonly produce translations that are erroneous or approximate.
Some NLU systems utilize statistical methods to approximate understanding of the source text when complete understanding is not achievable. For example, U.S. Pat. No. 5,752,052, entitled “Method and System for Bootstrapping Statistical Processing into a Rule-based Natural Language Parser”, discloses a method of modifying a rule-based natural language parser using summary statistics generated from a source text. The summary statistics are compiled from a corpus of text that is similar in syntactic properties to the source text in order to estimate the likelihoods that candidate rules should be applied. Because these likelihoods are only estimates, a rule-based parser that relies on them can produce output that is erroneous or approximate.
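By way of illustration only, the general idea of estimating rule likelihoods from corpus statistics can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of the cited patent: the function names, the rule labels, and the use of plain relative frequencies are all hypothetical choices made for the example.

```python
from collections import Counter


def estimate_rule_likelihoods(observed_rule_uses):
    """Estimate how likely each rule is to apply, as a relative frequency.

    `observed_rule_uses` is a hypothetical list of rule labels, one entry
    per time a rule was applied while parsing a reference corpus.
    """
    counts = Counter(observed_rule_uses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {rule: n / total for rule, n in counts.items()}


def rank_candidates(candidates, likelihoods):
    """Order candidate rules from most to least likely.

    Rules never seen in the corpus get likelihood 0.0 (an assumption of
    this sketch; a real system would probably smooth the estimates).
    """
    return sorted(candidates, key=lambda r: likelihoods.get(r, 0.0), reverse=True)


# Hypothetical rule applications observed in a corpus similar to the source text.
corpus_rule_uses = ["NP->Det N", "NP->Det N", "NP->N", "VP->V NP", "NP->Det N"]
likelihoods = estimate_rule_likelihoods(corpus_rule_uses)

# The parser would try the most frequently observed rule first.
ordered = rank_candidates(["NP->N", "NP->Det N"], likelihoods)
```

Because the ordering depends entirely on how closely the reference corpus resembles the source text, a parser driven by such estimates can select a likely but incorrect rule, which is the source of the erroneous or approximate output noted above.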
Therefore, what is desired is a general-purpose, accurate, and complete method for natural language understanding that delivers actionable information suitable for use in a broad range of NLU applications.