Natural language understanding systems and methods traditionally use strict grammar or statistics.
Grammar-based natural language understanding systems and methods typically use a parser to parse a text into a tree, i.e., a hierarchical (“depth”) structure. Elements of the tree are processed in a hierarchical manner, either bottom up or top down. In order to achieve successful understanding of the text, the sentence structure/grammar generally needs to conform to rules, thereby placing constraints on the freedom of expression of the submitter of the text.
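By way of illustration only, the two processing orders mentioned above can be sketched as follows. The nested-tuple tree representation, the sample sentence, and the function names are assumptions introduced for this sketch and are not taken from any of the cited systems.

```python
# Hypothetical parse tree for "the dog barks": (label, child, child, ...),
# where a leaf child is a plain string (the word itself).
tree = ("S",
        ("NP", ("Det", "the"), ("N", "dog")),
        ("VP", ("V", "barks")))

def bottom_up(node, visit):
    """Post-order traversal: children are processed before their parent."""
    label, *children = node
    for child in children:
        if isinstance(child, tuple):
            bottom_up(child, visit)
    visit(label)

def top_down(node, visit):
    """Pre-order traversal: a parent is processed before its children."""
    label, *children = node
    visit(label)
    for child in children:
        if isinstance(child, tuple):
            top_down(child, visit)

bottom_up_order = []
bottom_up(tree, bottom_up_order.append)

top_down_order = []
top_down(tree, top_down_order.append)
```

In the bottom-up order the root label "S" is visited last, whereas in the top-down order it is visited first.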
Statistically based natural language understanding systems and methods typically use many statistical methods including classification to understand a text. Freedom of expression by the submitter of the text is therefore enhanced.
Systems of the related art include the following:
U.S. Pat. No. 5,680,511 to Baker, et al., in one aspect, provides word recognition systems that operate to recognize an unrecognized or ambiguous word that occurs within a passage of words. The system can offer several words as choice words for inserting into the passage to replace the unrecognized word. The system can select the best choice word by using the choice word to extract, from a reference source, sample passages of text that relate to the choice word. For example, the system can select the dictionary passage that defines the choice word. The system then compares the selected passage to the current passage, and generates a score that indicates the likelihood that the choice word would occur within that passage of text. The system can select the choice word with the best score to substitute into the passage. The passage of words being analyzed can be any word sequence including an utterance, a portion of handwritten text, a portion of typewritten text or other such sequence of words, numbers and characters. Alternative embodiments of the present invention are disclosed which function to retrieve documents from a library as a function of context.
U.S. Pat. No. 5,642,519 to Martin provides a unified grammar for a speech interpreter capable of real-time speech understanding for user applications running on a general purpose microprocessor-based computer. The speech interpreter includes a unified grammar (UG) compiler, a speech recognizer and a natural language (NL) processor. The UG compiler receives a common UG lexicon and unified grammar description, and generates harmonized speech recognition (SR) and NL grammars for the speech recognizer and natural language processor, respectively. The lexicon includes a plurality of UG word entries having predefined characteristics, i.e., features, while the UG description includes a plurality of complex UG rules which define grammatically allowable word sequences. The UG compiler converts the complex UG rules (complex UG rules include augmentations for constraining the UG rules) into permissible SR word sequences and SR simple rules (simple rules do not include any augmentation) for the SR grammar. The SR grammar is a compact representation of the SR word entries corresponding to the UG word entries, permissible SR word sequences and simple SR rules corresponding to the augmentations of the complex UG rules. The NL grammar provides the NL processor with NL patterns enabling the NL processor to extract the meaning of the validated word sequences passed from the speech recognizer.
U.S. Pat. No. 5,991,712 also to Martin teaches that improved word accuracy of speech recognition can be achieved by providing a scheme for automatically limiting the acceptable word sequences. Speech recognition systems consistent with the present invention include a lexicon database with words and associated lexical properties. The systems receive exemplary clauses containing permissible word combinations for speech recognition, and identify additional lexical properties for selected words in the lexicon database corresponding to words in the received exemplary clauses using lexical property tests of a grammar database. Certain lexical property tests are switchable to a disabled state. To identify the additional lexical properties, the exemplary clauses are parsed with the switchable lexical property tests disabled to produce an index of the lexical properties corresponding to the exemplary clauses. The lexicon database is updated with the identified additional lexical properties by assigning the lexical properties to the corresponding words of the lexicon database. The grammar database is compiled with the lexical property tests enabled and the lexicon database with the assigned lexical properties to produce a grammar that embodies constraints of the lexical property tests and the lexical properties.
U.S. Pat. No. 5,918,222 to Fukui, et al. teaches a data storage means for storing data in a predetermined information form. An information retrieval means retrieves the data stored in the data storage means. A reception means receives an information disclosure demand from a demander. A response rule storage means stores general knowledge for generating a response responding to the demander and personal relationship information associated with a unique personal relationship between a user having the data on an information provider side and a user on an information demander side. A response plan formation means, responsive to the demand received by the reception means, plans a response for exhibiting, to the information demander, data obtained by causing the retrieval means to retrieve the data stored in the data storage means on the basis of the knowledge and the personal relationship information stored in the response rule storage means. A response generation means generates the response to the information demander in accordance with the plan formed by the response plan formation means.
U.S. Pat. No. 5,987,404 to Della Pietra, et al. proposes using statistical methods to do natural language understanding. The key notion is that there are “strings” of words in the natural language that correspond to a single semantic concept. One can then define an alignment between an entire semantic meaning (consisting of a set of semantic concepts) and the English. This is modeled using P(E,A|S). One can model P(S) separately. This allows each parameter to be modeled using many different statistical models.
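For orientation, the relationship between the two quantities named above follows from the chain rule of probability, where E denotes the English word string, A the alignment, and S the semantic meaning, and where the prior over meanings is written P(S). The second line is an assumption added for illustration: it shows one standard way such a factored model could be used to select a meaning, and is not stated in the description of the reference above.

```latex
\begin{align}
P(E, A, S) &= P(E, A \mid S)\, P(S) \\
\hat{S} &= \arg\max_{S} \; P(S) \sum_{A} P(E, A \mid S)
\end{align}
```

Because the two factors appear as separate terms, each can be estimated with its own statistical model, which is the flexibility noted above.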
U.S. Pat. No. 5,576,954 to Driscoll teaches a procedure for determining text relevancy that can be used to enhance the retrieval of text documents by search queries. This system helps a user intelligently and rapidly locate information found in large textual databases. A first embodiment determines the common meanings between each word in the query and each word in the document. Then an adjustment is made for words in the query that are not in the documents. Further, weights are calculated for both the semantic components in the query and the semantic components in the documents. These weights are multiplied together, and their products are subsequently added to one another to determine a real value number (similarity coefficient) for each document. Finally, the documents are sorted in sequential order according to their real value number from largest to smallest value. Another embodiment is for routing documents to topics/headings (sometimes referred to as filtering). Here, the importance of each word in both topics and documents is calculated. Then, the real value number (similarity coefficient) for each document is determined. Then each document is routed one at a time according to its respective real value number to one or more topics. Finally, once the documents are located with their topics, the documents can be sorted. This system can be used to search and route all kinds of document collections, such as collections of legal documents, medical documents, news stories, and patents.
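The multiply-and-sum weighting and the subsequent sorting described above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions and not the procedure of the cited patent: the function names, the dictionary representation of semantic components, and the sample weights are all hypothetical.

```python
def similarity_coefficient(query_weights, doc_weights):
    """Multiply the weights of semantic components shared by the query and
    the document, then add the products to obtain one real-valued score."""
    return sum(weight * doc_weights[component]
               for component, weight in query_weights.items()
               if component in doc_weights)

def rank_documents(query_weights, documents):
    """Score every document and sort from largest to smallest coefficient."""
    scored = [(name, similarity_coefficient(query_weights, weights))
              for name, weights in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical weights for semantic components of one query and three documents.
query = {"location": 2.0, "time": 1.0}
documents = {
    "doc_a": {"location": 1.0, "time": 3.0},
    "doc_b": {"location": 0.5},
    "doc_c": {"agent": 4.0},
}

ranking = rank_documents(query, documents)
```

With these sample weights, doc_a scores 2.0 x 1.0 + 1.0 x 3.0 = 5.0, doc_b scores 1.0, and doc_c, which shares no component with the query, scores zero and is ranked last.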
U.S. Pat. No. 5,642,502 also to Driscoll teaches a system and method for retrieving relevant documents from a text data base collection comprised of patents, medical and legal documents, journals, news stories and the like. Each small piece of text within the documents such as a sentence, phrase and semantic unit in the data base is treated as a document. Natural language queries are used to search for relevant documents from the data base. A first search query creates a selected group of documents. Each word in both the search query and in the documents is given a weighted value. Combining the weighted values creates similarity values for each document which are then ranked according to their relevant importance to the search query. A user reading and passing through this ranked list checks off which documents are relevant or not. Then the system automatically causes the original search query to be updated into a second search query which can include the same words, fewer words or different words than the first search query. Words in the second search query can have the same or different weights compared to the first search query. The system automatically searches the text data base and creates a second group of documents, which as a minimum does not include at least one of the documents found in the first group. The second group can also be comprised of additional documents not found in the first group. The ranking of documents in the second group is different than the first ranking such that the more relevant documents are found closer to the top of the list.
U.S. Pat. No. 5,893,092 also to Driscoll teaches a search system and method for retrieving relevant documents from a text data base collection comprised of patents, medical and legal documents, journals, news stories and the like. Each small piece of text within the documents such as a sentence, phrase and semantic unit in the data base is treated as a document. Natural language queries are used to search for relevant documents from the data base. A first search query creates a selected group of documents. Each word in both the search query and in the documents is given a weighted value. Combining the weighted values creates similarity values for each document which are then ranked according to their relevant importance to the search query. A user reading and passing through this ranked list checks off which documents are relevant or not. Then the system automatically causes the original search query to be updated into a second search query which can include the same words, fewer words or different words than the first search query. Words in the second search query can have the same or different weights compared to the first search query. The system automatically searches the text data base and creates a second group of documents, which as a minimum does not include at least one of the documents found in the first group. The second group can also be comprised of additional documents not found in the first group. The ranking of documents in the second group is different than the first ranking such that the more relevant documents are found closer to the top of the list.
U.S. Pat. No. 6,088,692 also to Driscoll teaches a natural language search system and method for retrieving relevant documents from a text data base collection comprised of patents, medical and legal documents, journals, news stories and the like. Each small piece of text within the documents such as a sentence, phrase and semantic unit in the data base is treated as a document. Natural language queries are used to search for relevant documents from the data base. A first search query creates a selected group of documents. Each word in both the search query and in the documents is given a weighted value. Combining the weighted values creates similarity values for each document which are then ranked according to their relevant importance to the search query. A user reading and passing through this ranked list checks off which documents are relevant or not. Then the system automatically causes the original search query to be updated into a second search query which can include the same words, fewer words or different words than the first search query. Words in the second search query can have the same or different weights compared to the first search query. The system automatically searches the text data base and creates a second group of documents, which as a minimum does not include at least one of the documents found in the first group. The second group can also be comprised of additional documents not found in the first group. The ranking of documents in the second group is different than the first ranking such that the more relevant documents are found closer to the top of the list.
U.S. Pat. No. 5,694,592 also to Driscoll teaches a procedure for determining text relevancy that can be used to enhance the retrieval of text documents by search queries. This system helps a user intelligently and rapidly locate information found in large textual databases. A first embodiment determines the common meanings between each word in the query and each word in the document. Then an adjustment is made for words in the query that are not in the documents. Further, weights are calculated for both the semantic components in the query and the semantic components in the documents. These weights are multiplied together, and their products are subsequently added to one another to determine a real value number (similarity coefficient) for each document. Finally, the documents are sorted in sequential order according to their real value number from largest to smallest value. Another embodiment is for routing documents to topics/headings (sometimes referred to as filtering). Here, the importance of each word in both topics and documents is calculated. Then, the real value number (similarity coefficient) for each document is determined. Then each document is routed one at a time according to its respective real value number to one or more topics. Finally, once the documents are located with their topics, the documents can be sorted. This system can be used to search and route all kinds of document collections, such as collections of legal documents, medical documents, news stories, and patents.
U.S. Pat. No. 6,138,085 to Richardson, et al. teaches a facility for determining, for a semantic relation that does not occur in a lexical knowledge base, whether this semantic relation should be inferred despite its absence from the lexical knowledge base. This semantic relation to be inferred is preferably made up of a first word, a second word, and a relation type relating the meanings of the first and second words. In a preferred embodiment, the facility identifies a salient semantic relation having the relation type of the semantic relation to be inferred and relating the first word to an intermediate word other than the second word. The facility then generates a quantitative measure of the similarity in meaning between the intermediate word and the second word. The facility further generates a confidence weight for the semantic relation to be inferred based upon the generated measure of similarity in meaning between the intermediate word and the second word. The facility may also generate a confidence weight for the semantic relation to be inferred based upon the weights of one or more paths connecting the first and second words.
U.S. Pat. No. 5,675,710 to Lewis teaches a method and apparatus for training a text classifier. A supervised learning system and an annotation system are operated cooperatively to produce a classification vector which can be used to classify documents with respect to a defined class. The annotation system automatically annotates documents with a degree of relevance annotation to produce machine annotated data. The degree of relevance annotation represents the degree to which the document belongs to the defined class. This machine annotated data is used as input to the supervised learning system. In addition to the machine annotated data, the supervised learning system can also receive manually annotated data and/or a user request. The machine annotated data, along with the manually annotated data and/or the user request, are used by the supervised learning system to produce a classification vector. In one embodiment, the supervised learning system comprises a relevance feedback mechanism. The relevance feedback mechanism is operated cooperatively with the annotation system for multiple iterations until a classification vector of acceptable accuracy is produced. The classification vector produced by the invention is the result of a combination of supervised and unsupervised learning.
U.S. Pat. No. 6,311,152 to Bai, et al. teaches a system (100, 200) for tokenization and named entity recognition of ideographic language. In the system, a word lattice is generated for a string of ideographic characters using finite state grammars (150) and a system lexicon (240). Segmented text is generated by determining word boundaries in the string of ideographic characters using the word lattice dependent upon a contextual language model (152A) and one or more entity language models (152B). One or more named entities is recognized in the string of ideographic characters using the word lattice dependent upon the contextual language model (152A) and the one or more entity language models (152B). The contextual language model (152A) and the one or more entity language models (152B) are each class-based language models. The lexicon (240) includes single ideographic characters, words, and predetermined features of the characters and words.
What is needed in the art is a method and system for understanding natural language that includes, inter alia, statistical steps and elements which also take advantage of hierarchical structure. What is also needed in the art is a system and method where the extraction of one part of a text which belongs to one semantic category assists in the extraction of another part which belongs to a semantic category of a different hierarchical level. In addition, what is needed in the art is a method and system for understanding natural language where later steps of the process are affected based on the results of earlier steps, thereby introducing a dynamic aspect to the method and system.