The explosion of information has created an unfulfilled demand for automated processing of natural language documents. Such an ability would enable natural language interfaces to databases, automated generation of extracts and summaries of natural language texts, and automated translation and interpretation of natural language. Development of these technologies is hindered by the time required to process modern grammatical formalisms.
Many modern grammatical formalisms use recursive feature structures to describe the syntactic structure of natural language utterances. (See Appendix A for definitions of technical terms used herein.) For instance, Lexical-Functional Grammar (Kaplan and Bresnan 1982), Functional Unification Grammar (Kay 1979), HPSG (Pollard and Sag 1987) and Definite Clause Grammars (Pereira and Warren 1980) all use recursive feature structures as a major component of grammatical descriptions. Feature structures have the advantage of being easy to understand and easy to implement, given unification-based programming languages such as Prolog. However, they have the disadvantage of making the resulting grammatical formalisms difficult to parse efficiently, both in theory and in practice. In theory grammatical formalisms that use arbitrary recursive feature structures can be undecidable in the worst case (Blackburn and Spaan 1993). Even with suitable restrictions, such as the off-line parsability constraint of LFG, the formalisms can be exponential in the worst case (Barton, Berwick, and Ristad 1987). Although in practice the sorts of phenomena that make a formalism take exponential time are rare, untuned unification-based parsers commonly take minutes to parse a moderately complex sentence.
There have been a number of different approaches to making unification-based parsers run faster. One approach has been to focus on making unification itself faster. In addition to the general work on efficient unification (Knight 1989), there has been work within the computational linguistics community on undoable unification (Karttunen 1986), lazy copying (Godden 1990), and combinations thereof (Tomabechi 1991). Another approach has been to focus on the problem of disjunction, and to propose techniques that handle special cases of disjunction efficiently. For instance, disjuncts that are inconsistent with the non-disjunctive part can be eliminated early (Kasper 1987). Also, certain types of disjunctions can be efficiently processed by being embedded within the feature structure (Karttunen 1984; Bear 1988; Maxwell and Kaplan 1989; Dorre and Eisele 1990). There has also been some work on the context-free component of grammatical formalisms. It has been shown that the granularity of phrase structure rules has an important effect on parsing efficiency (Nagata 1992). Also, the strategies used for handling the interaction between the phrasal and functional components can make a surprising difference (Maxwell and Kaplan 1993).
Lazy copy links are another way of reducing the processing time associated with unification. They do so by reducing the amount of copying required by a unification-based chart parser (Godden 1990). Whenever feature structures from daughter constituents are unified, the feature structures must first be copied to prevent cross-talk, because the daughter feature structures may be used in other analyses of the sentence. However, copying feature structures is very expensive. Thus, in 1990 Godden proposed lazy copying of feature structures. With lazy copying, at first only the top levels of each feature structure are copied. At the fringe of what has been copied, standard lazy copy links point back to the material that has not yet been copied.
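The idea of copying only the top levels and leaving links at the fringe can be illustrated with a minimal sketch. It assumes feature structures are nested Python dictionaries with atomic string values; the names `LazyLink`, `lazy_copy`, and `get` are hypothetical and chosen for illustration only. It is not Godden's algorithm, merely a simplified picture of the copy-on-demand behavior described above.

```python
class LazyLink:
    """Placeholder at the copy fringe that points back to uncopied material."""

    def __init__(self, source):
        self.source = source  # the original, still shared, substructure


def lazy_copy(fs, depth=1):
    """Copy only the top `depth` levels of a nested-dict feature structure.

    Atomic values are shared as they are; deeper substructures are replaced
    by LazyLink objects instead of being copied.
    """
    if not isinstance(fs, dict):
        return fs
    if depth == 0:
        return LazyLink(fs)
    return {attr: lazy_copy(val, depth - 1) for attr, val in fs.items()}


def get(fs, attr):
    """Read an attribute, expanding a lazy link by one more level if needed."""
    val = fs[attr]
    if isinstance(val, LazyLink):
        val = lazy_copy(val.source, depth=1)
        fs[attr] = val  # the copy now owns this level
    return val


original = {"SUBJ": {"NUM": "sg", "PRED": {"NAME": "dog"}}, "TENSE": "past"}
copy = lazy_copy(original)
subj = get(copy, "SUBJ")   # expands one level on demand
subj["NUM"] = "pl"         # does not disturb the original structure
```

In this sketch the substructure under `PRED` is never copied unless some later operation actually reaches it, which is where the saving comes from.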
Contexted unification is another method of reducing the processing time required for unification. It merges alternative feature structures by annotating each alternative with propositional variables that indicate which alternative it came from. Contexted unification is based on ideas from Assumption-Based Truth Maintenance Systems (de Kleer 1986). The following rules formalize the basic idea of contexted unification:
1. $\phi_1 \vee \phi_2$ is satisfiable if and only if $(p \rightarrow \phi_1) \wedge (\neg p \rightarrow \phi_2)$ is satisfiable, where $p$ is a new propositional variable;
2. If $\phi_1 \wedge \phi_2 \rightarrow \phi_3$ is a rule of deduction, then $(P \rightarrow \phi_1) \wedge (Q \rightarrow \phi_2) \rightarrow (P \wedge Q \rightarrow \phi_3)$ is a contexted version of that rule, where $P$ and $Q$ are boolean combinations of propositional variables;
3. If $P \rightarrow \mathrm{FALSE}$, then assert $\neg P$. (In ATMS terminology $P$ is called a nogood.)
If we think of unification as a technique for making term rewriting more efficient by indexing equality relations in a feature tree, then contexted unification can be thought of in the same way, where the propositional variables get indexed with the equality relations, and then unification is performed in light of the above rules. Contexted unification is performed in three stages. First, the disjunctions are converted to conjunctions using Rule 1 above and instantiated as a feature structure. Afterward, feature structures are unified and nogoods are produced. Finally, the nogoods are collected and solved to determine which combinations of propositional variables are valid. This process was described in detail by Maxwell and Kaplan in 1989.
FIG. 1 illustrates a simple example of how contexted unification works. The first feature structure 20 results from instantiating the constraints $([A\;+] \vee [B\;-]) \wedge ([C\;+] \vee [C\;-])$. The second feature structure 22 results from instantiating the constraints $([A\;+] \vee [A\;-]) \wedge ([D\;+] \vee [D\;-])$. Propositional variables $p$, $q$, $r$, and $s$ have been introduced to represent the four disjunctions in the two constraints. Unification of feature structures 20 and 22 yields feature structure 24 and the nogood $(p \wedge \neg r)$. To find solutions to feature structure 24, all possible combinations of the propositional variables are computed that are consistent with the known nogood. In this case there are twelve possible solutions:

$$
\begin{array}{cccc}
p\;q\;r\;s & p\;q\;r\;\neg s & p\;\neg q\;r\;s & p\;\neg q\;r\;\neg s \\
\neg p\;q\;r\;s & \neg p\;q\;r\;\neg s & \neg p\;q\;\neg r\;s & \neg p\;q\;\neg r\;\neg s \\
\neg p\;\neg q\;r\;s & \neg p\;\neg q\;r\;\neg s & \neg p\;\neg q\;\neg r\;s & \neg p\;\neg q\;\neg r\;\neg s
\end{array}
$$
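The three stages of contexted unification can be traced on the FIG. 1 example with a small sketch. It rests on several assumptions that are not part of the original description: feature structures are flat maps from attributes to atomic values, a context is a conjunction of literals stored as a set of (variable, polarity) pairs, and the nogood is read as $p \wedge \neg r$, which is what Rule 1 gives when the first disjunct of each disjunction is placed under the positive variable. The function names `instantiate`, `unify`, and `solve` are hypothetical.

```python
from itertools import product


def instantiate(disjunctions):
    """Stage 1: turn each (var, first, second) disjunction into contexted facts.

    Following Rule 1, the first disjunct holds under `var` and the second
    under `not var`. A context is a frozenset of (variable, polarity) pairs.
    """
    fs = {}
    for var, (attr1, val1), (attr2, val2) in disjunctions:
        fs.setdefault(attr1, []).append((frozenset([(var, True)]), val1))
        fs.setdefault(attr2, []).append((frozenset([(var, False)]), val2))
    return fs


def consistent(context):
    """A context is usable unless it contains both a variable and its negation."""
    return not any((var, not pol) in context for var, pol in context)


def unify(fs1, fs2):
    """Stage 2: merge two contexted structures and record the nogoods."""
    merged = {}
    for fs in (fs1, fs2):
        for attr, entries in fs.items():
            merged.setdefault(attr, []).extend(entries)
    nogoods = set()
    for entries in merged.values():
        for i, (ctx1, val1) in enumerate(entries):
            for ctx2, val2 in entries[i + 1:]:
                # Rule 2 and Rule 3: clashing values under P and Q make P-and-Q a nogood.
                if val1 != val2 and consistent(ctx1 | ctx2):
                    nogoods.add(ctx1 | ctx2)
    return merged, nogoods


def solve(variables, nogoods):
    """Stage 3: keep the truth assignments that violate no nogood."""
    solutions = []
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        violated = any(
            all(assignment[var] == pol for var, pol in nogood)
            for nogood in nogoods
        )
        if not violated:
            solutions.append(assignment)
    return solutions


fs20 = instantiate([("p", ("A", "+"), ("B", "-")), ("q", ("C", "+"), ("C", "-"))])
fs22 = instantiate([("r", ("A", "+"), ("A", "-")), ("s", ("D", "+"), ("D", "-"))])
fs24, nogoods = unify(fs20, fs22)
solutions = solve(["p", "q", "r", "s"], nogoods)
print(len(solutions))  # 12
```

The exhaustive enumeration in `solve` is used only because the example has four variables; it is a stand-in for whatever solving procedure an actual implementation would use.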
The main advantage of contexted unification is that it postpones taking the cross product of alternatives until the nogoods have been found. This is helpful for at least two reasons. First, taking the cross product of propositional variables is computationally cheaper than taking the cross product of feature structures. Second, contexted unification avoids unnecessary cross products, because they are not taken until it is known whether the unification is valid; if the unification is invalid, no cross product is taken.
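For contrast, the cross product that contexted unification postpones can be made concrete with a sketch of the naive alternative, in which each constraint is first multiplied out into disjunction-free feature structures and every pair is then unified. The sketch assumes flat attribute-value maps and reads the FIG. 1 constraints as conjunctions of two-way disjunctions; the names `unify_flat` and `expand` are hypothetical.

```python
from itertools import product


def unify_flat(fs1, fs2):
    """Unify two disjunction-free attribute-value maps; None signals a clash."""
    result = dict(fs1)
    for attr, val in fs2.items():
        if result.get(attr, val) != val:
            return None
        result[attr] = val
    return result


def expand(disjunctions):
    """Multiply out a conjunction of disjunctions into plain feature structures."""
    alternatives = []
    for choice in product(*disjunctions):
        fs = {}
        for attr, val in choice:
            fs = unify_flat(fs, {attr: val})
            if fs is None:
                break  # this combination is internally inconsistent
        if fs is not None:
            alternatives.append(fs)
    return alternatives


first = expand([[("A", "+"), ("B", "-")], [("C", "+"), ("C", "-")]])
second = expand([[("A", "+"), ("A", "-")], [("D", "+"), ("D", "-")]])
attempts = [unify_flat(a, b) for a, b in product(first, second)]
successes = [fs for fs in attempts if fs is not None]
print(len(attempts), len(successes))  # 16 12
```

Here sixteen feature structure unifications are attempted to arrive at the same twelve solutions, whereas the contexted approach unifies the two structures once and confines the combinatorial step to propositional variables.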
Despite all these different approaches to reducing the processing time required for unification, a need still exists to decrease the total time required to unify feature structures.