The present invention relates to natural language sentence realization. More specifically, the present invention relates to a system and method for realizing sentences wherein the system and method are complete for a general class of unification grammars.
In natural language processing, grammars describe a syntactic structure which is a breakdown of phrases, and a description of how those phrases combine into larger units, such as sentences. One grammar formalism has the expressive power of definite clause grammar (such as that described in F. Pereira and S. Shieber, Prolog and Natural-Language Analysis, Center for the Study of Language and Information, Stanford University, Stanford, Calif. (1987)). Another such grammar is provided in a syntactically modified form and is described in H. Alshawi, The Core Language Engine, The MIT Press, Cambridge, Mass. (1992), or J. Dowding et al., Gemini: A Natural Language System For Spoken-Language Understanding, Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, pp. 54–61, Columbus, Ohio (1993).
An example of a grammar rule (which may also be called a grammar production) found in this sort of grammar is as follows:
        s:[stype=decl] → np:[prsn=P, num=N] vp:[vtype=tensed, prsn=P, num=N]
This notation reflects that of an augmented phrase structure rule, where nonterminals are complex category expressions having the form of a major category symbol followed by a bracketed list of feature constraints. Atomic values beginning with uppercase letters are variables, while those beginning with lowercase letters are constants.
The feature constraints in the bracketed list are of the form “feature=value”. In the class of grammars referred to as unification grammars, the constraints can be expressed in a more abstract manner than simply providing a specific value for a feature. Instead, the constraints can be expressed as a variable value which can also appear in other places in the grammar rule. Thus, unification constraints are indicated by shared variable values.
More specifically, the sample rule given above is interpreted to mean that a sentence (represented by “s”) is of a declarative sentence type (represented by the bracketed feature constraint “stype=decl”), and the declarative sentence can be a noun phrase (represented by “np”) followed by a tensed verb phrase (indicated by “vp” with the feature constraint “vtype=tensed”). The rule also indicates that the person and number of the noun phrase are equal to the person and number of the verb phrase, respectively (which is indicated by the “prsn=P” and the “num=N” expressions in both the noun phrase and verb phrase).
The types of grammar formalisms discussed herein extend from that given above to a formalism that not only describes the syntactic structure but also maps between the syntactic structure and a semantic representation of the syntactic structure (i.e., it maps to the meaning of the syntactic structure). The present discussion proceeds by using, as one example of such a semantic representation, a logical form. However, it should be noted that any other semantic representation can be used as well, and the present invention is not to be limited to a logical form semantic representation.
To extend the formalism of the grammar rule written above to incorporate semantic specifications, a principal logical form is assigned to each phrase. The nonterminals are thus augmented with a logical form (LF) specification separated by the symbol “/”, as follows:
        s:[stype=decl]/VP_sem → np:[prsn=P, num=N]/NP_sem vp:[vtype=tensed, prsn=P, num=N, sub=NP_sem]/VP_sem
The rule now states that the LF of the sentence (“VP_sem”) is the same as the LF of the verb phrase (“VP_sem”). The rule also states that the LF of the noun phrase (“NP_sem”) is unified with the “sub” (which stands for “subject”) feature in the bracketed list of features of the verb phrase. Thus, the assumption is that the verb phrase has as its LF something that looks the same as the LF of an entire sentence, except that the subject has not yet been specified. The subject will be represented in the LF by a variable, the value of which is the overall semantic representation (i.e., the LF) of the noun phrase.
From this example, it can be seen that phrases will have a principal LF, but can also have LF-valued features, such as the “sub” feature in the verb phrase. These LF-valued features can be used to pass information into the phrase. All such LF-valued features are declared as such by the author of the grammar. Collectively, the principal LF and LF-valued features of a nonterminal will be referred to as the LF components of the nonterminal.
Lexical items (i.e., words) are introduced by rules such as the following:
        vp:[vtype=tensed, prsn=3, num=sg, sub=S]/sleep(S) → ‘sleeps’
        np:[prsn=3, num=sg]/sue → ‘Sue’
The first of these rules says that “sleeps” is a third person, singular, tensed verb phrase, whose LF is of the form “sleep(S)”, where “S” is the value of the “sub” feature. The second rule says that “Sue” is a third person, singular noun phrase, whose LF is “sue”.
The notation used above is one particular notation for describing grammars that specify meanings of linguistic expressions. Note that the methods discussed herein, including the methods of the present invention, can be applied to grammars using a wide variety of notations, and that the notation used herein is merely exemplary.
The term “unification” as used herein refers to matching two expressions by finding a most general substitution instance of the expressions, which can be partially specified. For example, the following two term expressions are partially specified (meaning that they have variables in them):
        f(X,g(X),U)
        f(a,g(Y),V)
The process of unifying these two expressions is the process of finding a most general substitution for the variables that makes the two expressions identical. In this case, in order to make the two expressions identical, the following substitutions must be made:
        X=Y=a; and
        U=V=Z
It should be noted that since both “U” and “V” are variables, they need not be substituted with a particular value, but must be replaced by a common variable reference. Therefore, the unification of the two expressions identified above is written as follows:
        f(a,g(a),Z)
It should be noted that in unifying two or more variables, any variable may be substituted for the original variables as long as that variable does not have an occurrence anywhere else in the larger expressions being unified that is not required to be unified with those variables. For example, if we unify “f(A,B)” with “f(C,D)”, we can use any variables we like to unify “A” with “C” and “B” with “D”, as long as we do not use the same variable. Thus we can express the result of the overall unification as “f(A,B)”, “f(C,D)”, “f(E,F)”, etc., but not “f(A,A)”, “f(B,B)”, etc. The most general substitution instance unifying two terms can be proved to be unique, except for this freedom in choosing names for the variables. A substitution function that unifies two expressions as described herein is referred to as a “most general unifier”.
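By way of illustration only, term unification as described above can be sketched in Python. The representation is an assumption made for this sketch and is not taken from the cited references: a variable is a string beginning with an uppercase letter, a constant is any other string, and a compound term is a tuple whose first element is the functor. The function names `unify`, `walk`, `occurs` and `resolve` are likewise hypothetical.

```python
def is_var(t):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    # Follow variable bindings until an unbound variable or a non-variable is reached.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    # True if variable v occurs inside term t under the current substitution.
    t = walk(t, subst)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, arg, subst) for arg in t[1:])
    return False

def unify(a, b, subst=None):
    # Return a most general unifier (as a dict) extending subst, or None on failure.
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if is_var(b):
        return unify(b, a, subst)
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

def resolve(t, subst):
    # Apply the substitution throughout a term.
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(arg, subst) for arg in t[1:])
    return t

t1 = ("f", "X", ("g", "X"), "U")   # f(X,g(X),U)
t2 = ("f", "a", ("g", "Y"), "V")   # f(a,g(Y),V)
mgu = unify(t1, t2)
unified = resolve(t1, mgu)         # f(a,g(a),V)
```

In this sketch the shared variable for “U” and “V” happens to be named “V” rather than “Z”; as noted above, the choice of name is immaterial so long as it does not collide with another variable.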
It should also be noted that when a subexpression of a larger expression is unified or instantiated, in such a way that variables in the subexpression become instantiated (i.e., receive values), all occurrences of those variables in the larger expression are similarly instantiated. For example, when we speak herein of a logical form component of an edge or a rule being unified or instantiated, it should be understood that any occurrences of variables within the rule or edge, but outside the logical form component, are instantiated to the same value they receive as a result of occurrences of those variables being unified or instantiated inside the logical form component.
The type of unification described above is called “term unification”. We also require the notion of “feature structure unification” to unify linguistic category expressions incorporating “feature=value” constraints. While terms are unified by unifying corresponding parts identified by position, feature structures are unified by unifying corresponding parts identified by feature name. Moreover, features not explicitly mentioned in a feature structure are interpreted to be unconstrained, which is equivalent to having as a value a variable which occurs nowhere else.
For example, unifying the terms “f(A,B)” and “f(x,y)” requires unifying “A” with “x” and “B” with “y”, because they occupy corresponding positions in the overall terms. Unifying the category expressions “c:[f1=A, f2=B, f3=foo]” and “c:[f2=x, f1=y, f4=bar]” requires unifying “A” with “y” and “B” with “x”, because they are values of corresponding features, even though they are not in corresponding positions as we have chosen to write these category expressions. Moreover, the resulting expression would incorporate the constraints “f3=foo” and “f4=bar”, and could be written as “c:[f1=y, f2=x, f3=foo, f4=bar]”, or alternatively written using any other permutation of the given constraints on the features “f1”, “f2”, “f3”, and “f4”.
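As a minimal sketch of feature structure unification, the following Python fragment pairs values by feature name rather than by position. It assumes, purely for brevity, that every feature value is atomic (a variable or a constant), with variables written as strings beginning with an uppercase letter; the names `unify_atoms` and `unify_categories` are hypothetical.

```python
def is_var(v):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(v, str) and v[:1].isupper()

def walk(v, subst):
    # Follow variable bindings to their current value.
    while is_var(v) and v in subst:
        v = subst[v]
    return v

def unify_atoms(a, b, subst):
    # Unify two atomic values; return the extended substitution or None.
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    return None  # two distinct constants cannot be unified

def unify_categories(cat1, cat2):
    # A category is (symbol, {feature: value}); features missing from one
    # side are unconstrained, so they are simply carried over from the other.
    (sym1, feats1), (sym2, feats2) = cat1, cat2
    if sym1 != sym2:
        return None
    subst = {}
    for name in sorted(feats1.keys() & feats2.keys()):
        subst = unify_atoms(feats1[name], feats2[name], subst)
        if subst is None:
            return None
    merged = {**feats1, **feats2}
    return sym1, {name: walk(value, subst) for name, value in merged.items()}

left = ("c", {"f1": "A", "f2": "B", "f3": "foo"})
right = ("c", {"f2": "x", "f1": "y", "f4": "bar"})
result = unify_categories(left, right)
```

Because the features are held in a dictionary, the order in which they are written plays no role, which mirrors the observation above that any permutation of the constraints denotes the same category expression.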
In all the examples we give below, we assume that all feature values are terms to be unified by term unification, but the same methods apply if feature values are allowed to be feature structures, so long as feature structure unification is used in place of term unification.
It should also be noted that, as is well known to those skilled in the art, feature unification can be replaced by term unification if all feature structures are converted to terms, by assigning each feature a fixed position in a term structure. The corresponding feature values are assigned these positions, and the feature names are omitted. For example, the expression “c:[f1=a, f2=b]” can be replaced by “c(a,b,X,Y)”, if the features “f1”, “f2”, “f3” and “f4” are always respectively assigned the first through fourth argument position of a term headed by the functor “c”, and these are the only features associated with the functor “c”.
As a more concrete example of unification in the context of unification grammar, for the two rules set out above that introduce the words “sleeps” and “Sue”, the nonterminal expression for “Sue” can be unified with the noun phrase daughter of the sentence rule, and the nonterminal expression for “sleeps” can be unified with the verb phrase daughter of the sentence rule. This will cause the principal LF of the noun phrase (“sue”) to be unified with the “sub” feature of the verb phrase, which will instantiate the principal LF of the verb phrase to “sleep(sue)”, which will in turn become the LF of the entire sentence.
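The conversion from a feature structure to a positional term can be sketched as follows. This is an illustrative assumption rather than a prescribed implementation: the function name `to_term`, the tuple encoding of terms, and the `_G` prefix used for fresh variables are all hypothetical.

```python
from itertools import count

_fresh = count()  # source of unique suffixes for fresh variable names

def to_term(symbol, feats, order):
    # Convert (symbol, {feature: value}) into a positional term, given the
    # fixed argument order declared for that symbol. Features that are not
    # mentioned are unconstrained, so each receives a fresh variable.
    unknown = set(feats) - set(order)
    if unknown:
        raise ValueError(f"undeclared features for {symbol}: {sorted(unknown)}")
    args = []
    for name in order:
        if name in feats:
            args.append(feats[name])
        else:
            args.append(f"_G{next(_fresh)}")
    return (symbol, *args)

# c:[f1=a, f2=b] with features f1..f4 assigned to argument positions 1..4
term = to_term("c", {"f1": "a", "f2": "b"}, ["f1", "f2", "f3", "f4"])
```

The resulting tuple corresponds to “c(a,b,X,Y)”, with the two generated variables standing in for “X” and “Y”.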
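The “Sue sleeps” example can be traced with a short Python sketch. It assumes the positional encoding discussed above, with each nonterminal written as a term whose last argument is its principal LF: s(Stype, LF), np(Prsn, Num, LF) and vp(Vtype, Prsn, Num, Sub, LF). The encoding, the helper names, and the function `apply_rule` are assumptions made for illustration and do not represent any particular realization algorithm.

```python
def is_var(t):
    # Assumed convention: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    return t == v or (isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:]))

def unify(a, b, s):
    # Term unification; a failed substitution (None) is propagated unchanged.
    if s is None:
        return None
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return unify(b, a, s)
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
        return s
    return None

def resolve(t, s):
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(a, s) for a in t[1:])
    return t

# s:[stype=decl]/VP_sem -> np:[...]/NP_sem vp:[..., sub=NP_sem]/VP_sem
SENTENCE_RULE = (("s", "decl", "VP_sem"),
                 [("np", "P", "N", "NP_sem"),
                  ("vp", "tensed", "P", "N", "NP_sem", "VP_sem")])
SUE = (("np", "3", "sg", "sue"), "Sue")
SLEEPS = (("vp", "tensed", "3", "sg", "S", ("sleep", "S")), "sleeps")

def apply_rule(rule, entries):
    # Unify each daughter of the rule with a lexical category, in order, and
    # return the instantiated mother category together with the word string.
    mother, daughters = rule
    s = {}
    for daughter, (category, _word) in zip(daughters, entries):
        s = unify(daughter, category, s)
    if s is None:
        return None
    return resolve(mother, s), [word for _category, word in entries]

sentence = apply_rule(SENTENCE_RULE, [SUE, SLEEPS])
```

Unifying the noun phrase daughter binds “NP_sem” to “sue”; unifying the verb phrase daughter then binds “S” to “sue” and “VP_sem” to “sleep(S)”, so that the mother category resolves to s(decl, sleep(sue)).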
This type of unification-based grammar can be used for parsing employing any of a number of well-known methods. Such a grammar will associate every well-formed sentence with one or more semantic representations (e.g., LFs) representing possible meanings of the sentence. Such grammars can also be used for sentence realization, that is, given a well-formed LF, realizing one or more text strings whose meaning is represented by the LF.
A large body of work has been done on both types of algorithms (both parsing and realization). Similarly, work has been done in an attempt to develop algorithms such that a single grammar can be used for both parsing and realization. One realization algorithm that is designed to use a grammar that can also be used for parsing is described in S. Shieber, A Uniform Architecture for Parsing and Generation, Proceedings of the 12th International Conference on Computational Linguistics, pp. 614–619, Budapest, Hungary (1988).
In order to discuss the Shieber algorithm in greater detail, a brief description of the concepts of charts and edges should be made. These concepts are well-known in chart parsing. The chart can be any data structure that stores records (edges, also called items, dotted rules, or states) that record partial analyses of a segment of an input string. These partial analyses are combined to reach a final analysis of the entire string.
In the present context, the chart and edges are slightly different, because the goal is to start with an LF and build up phrase records for it to obtain a sentence (or other output sequence) that has the specified LF as its meaning. Therefore, edges in the chart represent grammatical phrase types that can realize a particular portion of an LF. To generate a sentence, the system starts with a goal LF and each edge in the chart will have a piece of the goal LF as its meaning. Therefore, when the analysis is completed, the edges in the chart can be traced to find all individual words in the order required to construct the text string.
The algorithm described in the Shieber reference is based on a predictive algorithm for parsing such as that set out in J. Earley, An Efficient Context-Free Parsing Algorithm, Communications of the ACM, 13(2) 94–102 (1970). However, when used for sentence realization, this type of predictive algorithm frequently fails to pass along any semantic constraints. Also, at all stages of processing, Shieber's algorithm checks the principal LF of a phrase to ensure that it is unifiable with some goal LF subexpression, but the algorithm does not instantiate edges in this process. This has two significant disadvantages. First, it greatly increases the number of possible distinct completed and complete edges, since for every possible full instantiation of an LF component of a completed edge, Shieber also allows all possible generalizations of that instantiation. Second, Shieber's algorithm also greatly increases the number of LF expressions that must be examined to ensure compatibility with the goal LF. Since the LF expressions remain only partially instantiated, they must be rechecked as they are percolated from edge to edge, since they might become further instantiated in ways incompatible with any goal LF subexpression.
In sum, given a unification-based grammar that associates every well-formed sentence with one or more logical forms (LFs), it has been very difficult to develop a general algorithm that efficiently enumerates all the well-formed sentences that have a given LF as the representation of their meaning.