The present invention relates to natural language understanding (NLU) systems, and more particularly to systems for understanding natural language.
Reference is made herein to various prior art references:
(1) Bates, M. 1978. "The Theory and Practice of Augmented Transition Network Grammars". In L. Bolc (ed.), Natural Language Communication with Computers. New York: Springer.
(2) Boguraev, B. 1983. "Recognizing Conjunctions within the ATN Framework". In K. Sparck Jones and Y. Wilks (eds.), Automatic Natural Language Parsing. New York: Halsted Press.
(3) Cook, W. 1979. Case Grammar: Development of the Matrix Model. Washington, DC: Georgetown University Press.
(4) Cruse, D. A. 1986. Lexical Semantics. Cambridge, England: Cambridge University Press.
(5) Dyer, M. 1983. In-Depth Understanding. Cambridge, MA: MIT Press.
(6) Jespersen, O. 1964. Essentials of English Grammar. University, AL: University of Alabama Press.
(7) Laffal, J. 1973. A Concept Dictionary of English. Essex, CT: Gallery Press.
(8) Lebowitz, M. 1983. "Memory-Based Parsing". Artificial Intelligence, Vol. 21, pp. 363-404.
(9) Marcus, M. 1980. Theory of Syntactic Recognition for Natural Language. Cambridge, MA: MIT Press.
(10) Quirk, R., Greenbaum, S., Leech, G., and Svartvik, J. 1985. A Comprehensive Grammar of the English Language. New York: Seminar Press.
(11) Sager, N. 1981. Natural Language Information Processing: A Computer Grammar of English and Its Applications. Reading, MA: Addison-Wesley.
(12) Schank, R. 1975. Conceptual Information Processing. New York: North-Holland.
(13) Wilks, Y., Huang, X., and Fass, D. 1985. "Syntax, Preference, and Right Attachment". Proceedings of the Ninth IJCAI.
(14) Winograd, T. 1983. Language as a Cognitive Process, Vol. 1: Syntax. Reading, MA: Addison-Wesley.
(15) Winston, M. E., Chaffin, R., and Herrmann, D. 1987. "A Taxonomy of Part-Whole Relations". Cognitive Science, Vol. 11, pp. 417-444.
(16) Winston, P. and Horn, B. 1984. LISP. 2nd ed. Reading, MA: Addison-Wesley.
(17) Woods, W. 1970. "Transition Network Grammars for Natural Language Analysis". Communications of the ACM, Vol. 13, No. 10, pp. 591-606.
(18) Woods, W., Kaplan, R., and Nash-Webber, B. 1972. The Lunar Sciences Natural Language Information System: Final Report. Cambridge, MA: Bolt Beranek and Newman, Inc.
(19) Xerox Corporation. 1986. Interlisp-D Reference Manual. Pasadena, CA: Xerox Artificial Intelligence Systems Division.
In the last decade, some headway has been made in the area of data bases to provide information online. This allows for the easy application of statistical and other algorithmic aids to the data. Much of the current work to enhance the usefulness of these systems, to make them more "user friendly", is being performed under the broad heading of Artificial Intelligence. A subdomain of this technology is the area of Natural Language Understanding (NLU). The assumption is that communication with machines would be much easier if only one could use natural language in accessing information. This field is called data base retrieval (or data base query) and is the area to which most NLU work is being applied.
However, there is another NLU application that is less publicized but much more important. Even if the information in a data base is readily accessible, how accurate and timely is that information? For example, in message processing applications, many messages arrive at an intelligence center in an unformatted, "free text" form (i.e., natural language). No present NLU system can account for all of English, and in order to accomplish any useful work with such a system, it is built with a specific, limited task in mind. The linguistic structures and vocabulary that a system can handle are specifically targeted to an application domain and expected text input format. A special use of language peculiar to a domain is often referred to as a "sublanguage", a term encompassing dialects and jargons. A significant part of an NLU developer's job is to discover the characteristics of a sublanguage and specify them for the requirements of an NLU development system.
Various NLU methodologies have been proposed. Many of these center on one particular aspect of a problem, such as conceptual analysis, syntax, or knowledge about specific words. The present invention involves a hybrid approach incorporating all of these aspects.
Quirk et al. 1985 contains a useful discussion of word morphology. That reference, Jespersen 1964, and Sager 1981 all provide significant information concerning grammar specification in natural language processing. Particularly pertinent to the technique of using augmented transition networks (ATNs) for grammar specification are Bates 1978 and Winograd 1983. Neither reference, however, discloses a methodology for adapting ATNs to a graphical programming environment.
Prior art references dealing with conceptual analysis include Schank 1975 and Lebowitz 1983 (which discuss conceptual dependency); Cook 1979 (dealing with case grammar); Wilks et al. 1985 (semantic preferences); and Laffal 1973 (psychology). Dyer 1983 discloses domain-specific pattern matchers for NLU systems.
Accordingly, it is a principal object of the invention to provide an improved approach to the development of NLU systems, particularly as applied to text processing. Such an approach should be adaptable to a broad range of linguistic domains, as well as to a variety of applications such as monitoring and sorting electronic mail.