Natural language understanding is attracting considerable attention because it promises to improve the interface and interaction between humans and machines such as computers, personal digital assistants (PDAs), and cellular phones. Natural Language Processing (NLP) is vital in natural language interfaces, machine translation, automatic abstracting, and a number of other computer-based applications.
Despite considerable effort, the advances in natural language understanding have not been very satisfying. In particular, research into the automated apprehension of meaning from speech or from text has made little progress in recent years.
In the area of automated apprehension of meaning there are two different philosophies: the “holistic” approach and the “componentized” approach. While the latter approach has achieved a great number of detailed results in individual components such as anaphora resolution and deep syntactic analysis, among many others, it has not yet achieved the goal of combining these individual components into one overall solution within which the components can interact. It is even unclear at the moment whether any suitable form of combination that includes feedback mechanisms exists for these components.
When it comes to understanding the meaning of speech, i.e. the semantic interpretation of speech, a breakthrough has not yet been achieved. As a consequence, pragmatic analysis, i.e. the control of tools and devices by natural speech, has also not been developed very far. A typical example of a modern speech/text recognition system is described in the article “Enabling agents to work together”, by R. V. Guha et al., Communications of the ACM, Vol. 37, No. 7, July 1994, pp. 127-142, and reviewed by T. J. Schult in the German article “Transparente Trivialitäten; Cyc-Wissensbasis in WWW”, c't, 1996, Vol. 10, pp. 118-121. The Cyc-system described by R. V. Guha is a knowledge-based system for true/false categorization of input statements. T. J. Schult points out in his article that the knowledge representation in the database used in the Cyc-system is not standardized and uses only the following relations for deduction: ‘is element of’, ‘is subset of’, and ‘has subsets’. The system described by Guha is what we call a “holistic” system.
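By way of illustration only, true/false categorization that relies solely on the three relations named above can be sketched as follows. The sketch is hypothetical: the class `TinyKB`, its method names, and the sample facts are assumptions made here for explanation and do not reflect the actual Cyc implementation or its interfaces.

```python
from collections import defaultdict


class TinyKB:
    """Hypothetical toy knowledge base (an assumption, not the Cyc API)."""

    def __init__(self):
        # 'is element of': individual -> sets it is directly asserted to belong to
        self.element_of = defaultdict(set)
        # 'is subset of': set -> its directly asserted supersets
        self.subset_of = defaultdict(set)

    def add_element(self, individual, cls):
        self.element_of[individual].add(cls)

    def add_subset(self, sub, sup):
        self.subset_of[sub].add(sup)

    def has_subsets(self, sup):
        """'has subsets': the inverse of the directly asserted 'is subset of'."""
        return {sub for sub, sups in self.subset_of.items() if sup in sups}

    def supersets(self, cls):
        """All sets reachable from cls via 'is subset of', including cls itself."""
        seen = {cls}
        stack = [cls]
        while stack:
            current = stack.pop()
            for sup in self.subset_of.get(current, ()):
                if sup not in seen:
                    seen.add(sup)
                    stack.append(sup)
        return seen

    def is_element_of(self, individual, cls):
        """True/false categorization of the statement 'individual is a cls'."""
        return any(
            cls in self.supersets(direct)
            for direct in self.element_of.get(individual, ())
        )


kb = TinyKB()
kb.add_subset("dog", "mammal")
kb.add_subset("mammal", "animal")
kb.add_element("Rex", "dog")

print(kb.is_element_of("Rex", "animal"))  # deduced through two subset links
print(kb.is_element_of("Rex", "plant"))   # not derivable from the stored facts
```

The statement “Rex is an animal” is never stored directly; it follows from one ‘is element of’ fact and two ‘is subset of’ facts, which is the only kind of deduction such a restricted set of relations supports.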
In the present context, we focus on the “holistic” approach, which is also referred to as the cognitive approach. A fractal semantic knowledge database is employed in order to be able to perform a meaning understanding task. This kind of approach has been used by others before, but the present model, which represents world knowledge in a knowledge database modeled as a fractal semantic network, is unique and differs from other existing models in a number of important respects. In particular, a self-similar hierarchical network of n-th order is employed, as for example disclosed and claimed in PCT Patent application WO 99/63455, International Application No.: PCT/IB99/00231, entitled “Processing of Textual Information and Automated Apprehension of Information”, currently assigned to the assignee of the present application. Furthermore, local pervasive intelligence is employed, as specified in the German Patent application “Fraktales Netz n-ter Ordnung zum Behandeln komplexer Strukturen”, application No.: 199008204.9, filing date 15 February 1999, assigned to the assignee of the present application and to Delphi Creative Technologies. This local pervasive intelligence is called a Janus, a name taken from Roman mythology (plural: Jani). The Janus is described in very general terms in this German patent application. These two patent applications are incorporated in their entirety.
It is to be noted that there is a fundamental difference between data or information and knowledge. One can accumulate arbitrary amounts of data or information without having any knowledge, while the converse is not possible. Data or information is the isolated representation of pure facts, while knowledge arises from strong connections between these facts, from connections between facts and their environment, and from abstraction, which in turn allows for both understanding and learning.
An approach to meaning understanding based on a fractal semantic knowledge base is described and claimed in the co-pending patent application entitled “MEANING UNDERSTANDING BY MEANS OF LOCAL PERVASIVE INTELLIGENCE”. This patent application was filed on the same day and is currently assigned to the same applicant as the instant patent application. This patent application is incorporated in its entirety. According to this co-pending case, local pervasive intelligence (realized by means of Janus objects) is employed in order to process an input network. During the processing of this input network, knowledge is extracted from the knowledge base. This enables such a system to automatically apprehend (understand) what was conveyed in the input network. Before such a meaning understanding task can be carried out, some preparatory work is required.
Linguists and programmers have developed and are developing parsers that are able to parse subsets of a language. So far, modern linguistic theories have not led to parser implementations that have enough lexical information to parse a substantial subset of, for instance, the English language. This is due to the fact that linguistic theories cannot deal with all the exceptions that a natural language contains; the parsing therefore sometimes fails, generating either wrong or corrupt output.
One parser giving good results is the English Slot Grammar (ESG) parser developed by Michael McCord of International Business Machines Corporation.
It would be desirable to provide an efficient scheme for the conversion of an input string or input text into an input network suited for meaning understanding.
It would also be desirable to provide a system for the efficient conversion of an input string or input text into an input network suited for meaning understanding.