1. Field of the Invention
The present invention relates to a technique for analyzing natural language by applying a dependency grammar.
2. Description of the Related Art
A dependency grammar is a grammar that describes syntactic structure using, as its basic elements, the modification relation between two words and the type of that relation. A method for analyzing natural language by applying a dependency grammar is disclosed in "Dependency Grammar Based on Strength of Modification Relation--Restrictive Grammar" (hereinafter referred to as "Publication 1"), Journal of the Information Processing Society, vol. 33, no. 10, pp. 1211-1223. According to the method described in Publication 1, all possible solutions are obtained by a bottom-up depth-first analysis in which every possible dependency relation between two clauses is written into an analysis table, or chart.
A method for obtaining all possible solutions by a bottom-up depth-first analysis is also disclosed in "A New Statistical Parser Based on Bigram Lexical Dependencies" (hereinafter referred to as "Publication 2"), Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, July 1996. The method described in Publication 2, especially in Chapter 3 thereof, employs a bottom-up chart system as its analyzing algorithm. Of two local analysis results, priority is given to the one having the higher likelihood of usage, and the one having the lower likelihood of usage is discarded. The method described in Publication 2 writes entries into the chart in units of word groups that are connected by dependency relations (that is, connected word groups to which a linguistic meaning can be given) and in which application of the grammatical rules to all words other than the head word has been completed.
Another bottom-up chart system similar to that of Publication 2 is disclosed in "Bilexical Grammars And A Cubic-Time Probabilistic Parser" (hereinafter referred to as "Publication 3"), Proceedings of the International Workshop on Parsing Technologies, MIT, September 1997. The difference between Publication 2 and Publication 3 lies in the unit written into the chart. Instead of a local analysis result for which application of the grammatical rules has been completed, the method described in Publication 3 generally uses, as the unit written into the chart, a non-connected sequence of local analysis results (that is, a local analysis result to which a linguistic meaning can hardly be given). With this structure, the method of Publication 3 confines the words to which further grammatical rules must be applied to the words at the two ends of a section.
A general method for resolving the ambiguity contained in natural language is to assign priorities to the candidate interpretations and to extract the interpretation having the highest priority. With a depth-first analysis such as the technique disclosed in Publication 1, it is difficult to compare the priorities of the interpretations, because a depth-first analysis produces its results one after another in time series. It is also difficult for a depth-first analysis to reuse local analysis results in further analysis. Because of these disadvantages, a breadth-first analysis such as the techniques described in Publications 2 and 3 is often used for analyzing natural language.
A chart method is known as a technique that supports breadth-first analysis. An algorithm for the chart method is disclosed in "Fundamentals of Natural Language Processing" written by Hirosato Nomura, published by The Institute of Electronics, Information and Communication Engineers, 1988, Chapter 2, Section 3 (hereinafter referred to as "Publication 4"). The chart method has the following features: the order of analysis is controlled by a dynamic programming scheme; local analysis results registered in a chart are reused; and local analysis results that have different internal structures but exhibit the same grammatical function when grammatical rules are applied later are classified into the same group (this feature is called "packing"). Because of these features, the chart method allows a breadth-first analysis to apply an arbitrary context-free grammar with an amount of calculation on the order of the 3rd power of the number of words in the input sentence.
The amount of calculation of the chart method will now be described in detail. Basically, the chart method combines the local analysis results of two adjacent sections into one. Syntactic analysis proceeds smoothly when the chart method is applied to a context-free grammar, because the applicability of a grammatical rule to a local analysis result depends only on the non-terminal symbol of that result.
More precisely, when the word strings in one section are grouped under the same non-terminal symbol, they can be packed into one unit regardless of how complex their internal structures are, and further analysis is simplified by using the packed local analysis results. Thus, the maximum number of local analysis results in one section is bounded by a fixed number. This fixed number depends not on the number of input words but on the number of non-terminal symbols. Accordingly, the amount of calculation per basic calculation is also uniformly bounded by a fixed number. In this case, the maximum amount of calculation is on the order of the 3rd power of the number of words, because the number of basic calculations is equal to the number of combinations of two adjacent sections.
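The cubic bound described above can be illustrated with a minimal sketch of chart-based recognition for a context-free grammar in Chomsky normal form. The sketch is not taken from Publication 4; the function name cky_recognize, the toy grammar, and the example sentence are assumptions introduced only for illustration. Each chart cell holds a set of non-terminal symbols, so that analyses sharing a symbol are packed into one entry, and a counter records how many pairs of adjacent sections are examined.

```python
from itertools import product
from math import comb

# Toy grammar (hypothetical, for illustration only).
LEXICON = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"}}
RULES = {("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}, ("NP", "VP"): {"S"}}


def cky_recognize(words, lexicon, rules, start="S"):
    """Return (accepted, combos) for a grammar in Chomsky normal form.

    lexicon maps a word to its set of non-terminal symbols.
    rules maps a pair (B, C) to the set of symbols A with a rule A -> B C.
    combos counts the pairs of adjacent sections that were examined.
    """
    n = len(words)
    # chart[(i, j)] packs every analysis of words[i:j] into a set of symbols.
    chart = {(i, i + 1): set(lexicon.get(w, ())) for i, w in enumerate(words)}
    combos = 0
    for width in range(2, n + 1):          # shorter sections first
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):      # boundary between the two sections
                combos += 1
                for b, c in product(chart[(i, k)], chart[(k, j)]):
                    cell |= rules.get((b, c), set())
            chart[(i, j)] = cell
    return start in chart[(0, n)], combos


words = "the dog saw the cat".split()
accepted, combos = cky_recognize(words, LEXICON, RULES)
print(accepted, combos, comb(len(words) + 1, 3))
```

For an input of n words, the counter equals the number of ways of choosing a start position, a boundary, and an end position, which grows as the 3rd power of n. The work done per pair is bounded by the number of non-terminal symbols, not by n.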
It is known that the maximum amount of calculation is on the order of the 5th power of the number of input words when the chart method is simply applied to a dependency grammar. Here, "simply applied" means that grammatical rules are applied and packing is performed with a dependency structure as the unit of an edge. In this case, the dependency structure is a group of words in which a head word acts as the parent and for which application of the grammatical rules to the words other than the head word has been completed. This method is a direct extension of the method of analyzing a context-free grammar using the chart method.
In a dependency grammar, the state of the head word of a dependency structure determines which grammatical rules can be applied to that dependency structure later. In general, however, it is not known in advance which word in a section is the head word of the dependency structure obtained as the analysis result for that section. Therefore, the maximum number of packed local analysis results in a section may be on the order of the number of words in the section. Each of the two adjacent sections being combined thus contributes a factor on the order of the number of words, in addition to the 3rd-power number of combinations of adjacent sections. As a result, the amount of calculation is on the order of the 5th power of the number of words.
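The growth described above can be checked numerically with a short counting sketch. It assumes, for illustration only, that each section keeps one packed result per possible head word, and the function name count_head_indexed_combinations is hypothetical. The sketch counts, for every pair of adjacent sections, the number of head-word pairs that would have to be examined.

```python
from math import comb


def count_head_indexed_combinations(n):
    """Count basic steps when each packed result also records its head word.

    Positions 0..n mark the gaps between n words. A left section spans
    words i..k-1 and a right section spans words k..j-1.
    """
    total = 0
    for i in range(n):                      # start of the left section
        for k in range(i + 1, n):           # boundary between the sections
            for j in range(k + 1, n + 1):   # end of the right section
                # One candidate head per word in each section (assumption).
                total += (k - i) * (j - k)
    return total


for n in (4, 10, 20):
    # The count matches the binomial coefficient C(n + 3, 5).
    print(n, count_head_indexed_combinations(n), comb(n + 3, 5))
```

The count equals the binomial coefficient C(n+3, 5), whose leading term is the 5th power of n divided by 120, which is consistent with the order stated above.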
To avoid this problem, the method described in Publication 3 uses a generally non-connected structure, instead of a completed local structure including a head word, as the edge, that is, as the unit registered in the chart. The method defines the edges so that only the start word and the end word of a section determine the grammatical function of the structure. Edges defined in this way limit the number of functions (the number of cases) that are relevant to further application of grammatical rules to the analysis result of a section. In this case, the number of functions is a fixed number given by the product of the numbers of states, with respect to the grammatical rules, of the start word and the end word of the section. This fixed number does not depend on the number of words. As a result, the amount of calculation needed to obtain the full result by the method of Publication 3 is also on the order of the 3rd power of the number of input words, as in the case of applying a context-free grammar.
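The packing criterion described above can be sketched as follows. This is an illustrative sketch and not the data structure of Publication 3: the dictionary layout, the state labels, and the function names pack_by_end_words and max_packed_per_section are assumptions. Local analyses of the same section are grouped under a key consisting only of the section boundaries and the states of the start and end words, so analyses that differ only in their interior are packed together.

```python
from collections import defaultdict


def pack_by_end_words(analyses):
    """Group local analyses by (start, end, start_state, end_state)."""
    packed = defaultdict(list)
    for a in analyses:
        key = (a["start"], a["end"], a["start_state"], a["end_state"])
        packed[key].append(a["links"])
    return dict(packed)


def max_packed_per_section(num_states):
    """Upper bound on packed entries per section: product of end-word states."""
    return num_states * num_states


# Two hypothetical analyses of the same section with different interiors.
analyses = [
    {"start": 1, "end": 5, "start_state": "has_parent",
     "end_state": "no_parent", "links": {(2, 3), (3, 1), (4, 5)}},
    {"start": 1, "end": 5, "start_state": "has_parent",
     "end_state": "no_parent", "links": {(2, 3), (3, 1), (4, 3)}},
]
packed = pack_by_end_words(analyses)
print(len(packed), max_packed_per_section(2))
```

Because the key does not mention any interior word, the number of distinct keys per section is bounded by a constant, independent of the number of words in the section.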
The analyzing method of Publication 3 will now be described with reference to a program list shown in FIG. 35. The program list shown in FIG. 35 is quoted from section 4.3 of Publication 3. In the program list, the text enclosed in "(* . . . )" at the ends of lines 4, 9-14, and 18 represents comments.
A chart whose nodes are words is the basic data structure of the method described in Publication 3. Each edge retains its start and end words and the words therebetween, that is, full information on the dependency relations among its nodes. The edges are defined and prepared so that the nodes inside an edge have no dependency relation with nodes outside the edge; however, not all nodes inside an edge are necessarily connected by dependency relations.
Syntactic analysis by this method will now be described. In this method, edges connecting adjacent nodes (words) are first prepared as an initial chart (see lines 1-4 of the algorithm). More precisely, the first step selects a pair of adjacent words (see line 2 of the algorithm). Then, an edge that simply groups the two nodes is added to the chart by executing line 4 of the algorithm with the link type set to "NONE" for that pair of words. In addition, if a dependency relation can be established between the two adjacent words, the nodes are grouped under that dependency relation, and the dependency relation is added to the chart as an edge between the nodes. This is done by executing line 4 of the algorithm with the link type set to "←×M" or "→×M".
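The preparation of the initial chart described above can be sketched as follows. The sketch does not reproduce the program list of FIG. 35; the Edge type, the function initial_chart, the predicate may_depend, and the toy dependency rule are assumptions introduced for illustration. Each pair of adjacent words yields one edge without a link (corresponding to the link type "NONE") and, where the assumed rule permits, an additional edge carrying a dependency relation in one direction or the other.

```python
from typing import Callable, FrozenSet, List, NamedTuple, Sequence, Tuple


class Edge(NamedTuple):
    left: int                            # index of the start word
    right: int                           # index of the end word
    links: FrozenSet[Tuple[int, int]]    # dependency relations (child, parent)


def initial_chart(words: Sequence[str],
                  may_depend: Callable[[str, str], bool]) -> List[Edge]:
    """Build one or more edges for every pair of adjacent words."""
    chart = []
    for i in range(len(words) - 1):
        j = i + 1
        chart.append(Edge(i, j, frozenset()))              # no link
        if may_depend(words[i], words[j]):                 # left depends on right
            chart.append(Edge(i, j, frozenset({(i, j)})))
        if may_depend(words[j], words[i]):                 # right depends on left
            chart.append(Edge(i, j, frozenset({(j, i)})))
    return chart


# Toy rule (hypothetical): which child word may depend on which parent word.
ALLOWED = {("the", "plan"), ("of", "plan")}
chart = initial_chart(["the", "plan", "of"], lambda c, p: (c, p) in ALLOWED)
for edge in chart:
    print(edge)
```

In this toy run, each of the two adjacent pairs produces an unlinked edge and one linked edge, so the initial chart holds four edges.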
After the initial chart has been prepared in this way, two adjacent edges are grouped by executing lines 5-16. This grouping is repeated in a bottom-up manner. Hereinafter, the left edge is referred to as edge a and the right edge as edge b. The right end node (word) of edge a and the left end node (word) of edge b are the same word (the common node), and they are merged into one node in the grouped edge.
At the common node, the dependency relations defined by edge a and edge b are checked for contradiction (line 11). Further, the common node is checked to confirm that it has only one parent node (line 12). When it is determined that the dependency relations contain no contradiction and that the common node has only one parent node, a new edge c is prepared. The newly prepared edge c includes the nodes from the left end node of edge a to the right end node of edge b. The union of the dependency relations held by edge a and edge b is given to the newly prepared edge as its dependency relations (line 13). If a further dependency relation can be established between the left end node (the left end node of edge a) and the right end node (the right end node of edge b), a new edge having that dependency relation is prepared and registered in the chart (line 16).
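The grouping step described above can be sketched as follows. The sketch is a simplified reading of the description, not the program list of FIG. 35: the Edge type and the functions combine and link_ends are hypothetical, and "contradiction" is assumed here to mean that some word would receive two different parents. The word indices and links in the example are likewise assumed for illustration.

```python
from typing import FrozenSet, NamedTuple, Optional, Tuple


class Edge(NamedTuple):
    left: int
    right: int
    links: FrozenSet[Tuple[int, int]]    # dependency relations (child, parent)


def combine(a: Edge, b: Edge) -> Optional[Edge]:
    """Group edge a and edge b at their common node, or return None."""
    if a.right != b.left:
        return None                       # the edges are not adjacent
    links = a.links | b.links             # union of both sets of relations
    parents = {}
    for child, parent in links:
        parents.setdefault(child, set()).add(parent)
    # Assumed contradiction test: no word may have two different parents.
    if any(len(p) > 1 for p in parents.values()):
        return None
    # The common node must have exactly one parent.
    if len(parents.get(a.right, ())) != 1:
        return None
    return Edge(a.left, b.right, links)


def link_ends(c: Edge, child: int, parent: int) -> Edge:
    """Return a copy of c with a dependency between its two end words."""
    assert {child, parent} == {c.left, c.right}
    return Edge(c.left, c.right, c.links | {(child, parent)})


# Assumed indices: plan=0, of=1, the=2, government=3, to=4, raise=5.
a = Edge(0, 1, frozenset({(1, 0)}))                    # "plan of"
b = Edge(1, 5, frozenset({(2, 3), (3, 1), (4, 5)}))    # "of the government to raise"
c = combine(a, b)
print(c)
print(link_ends(c, child=5, parent=0))
```

In this run the common node "of" has the single parent "plan", so the grouped edge is created, and a further edge is then derived by making "raise" depend on "plan".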
These actions are repeated in a bottom-up manner until a full analysis result is obtained. The most suitable result is then extracted from the obtained results and output (lines 18 and 19).
FIGS. 36A to 36I are quoted from "FIG. 1" of Publication 3 and schematically show the principal steps of analyzing the sentence "The plan of the government to raise income tax". FIG. 36A shows the dependency structure to be output. Each arrow shown in FIG. 36A points from a child node to its parent node. FIGS. 36B to 36E show how the dependency structure shown in FIG. 36A is expressed in the chart.
The judgements "yes" and "no" shown in FIGS. 36C to 36E indicate which local dependency structures are allowed as edges. For example, FIG. 36C shows a structure in which "The" and "of", which sandwich the head node "plan", both depend on that head node. The algorithm prohibits this structure from being an edge, so its judgement is "no". On the other hand, the structure shown in FIG. 36D includes two connected parts, "of the government" and "to raise". Such a structure is allowed as an edge. The structure shown in FIG. 36E is also allowed on the same basis. Since these structures are allowed as edges, their judgement is "yes".
The analyzing steps of this conventional technique are shown in FIGS. 37A to 37D. FIG. 37A shows a step in which the edge "of the government to raise" and the edge "plan of" are grouped together using "of" as the intermediate node. In FIG. 37A, the nodes (words) in the right edge are divided into two connected parts according to their dependency relations. During this grouping, an edge having a dependency relation between "plan" and "raise", with "plan" as the head word, is prepared. Other edges having no such dependency relation are also prepared; however, they are ignored here because the edge having the dependency relation eventually proves to be the correct edge.
As shown in FIG. 37B, an edge "raise income tax ROOT" is then prepared by grouping the edge "raise income tax" and the edge "tax ROOT" with the node (word) "tax" as the intermediate node. During this grouping, no dependency relation is established between "ROOT" and "raise". "ROOT" is a special word that eventually becomes the head word of the whole sentence; it is added automatically by the analyzing system. A further edge "plan of the government to raise income tax ROOT" is prepared by grouping the edge "plan of the government to raise" and the edge "raise income tax ROOT". During this grouping, a dependency relation from "plan" to "ROOT" is established. Other edges having no such dependency relation are also prepared; however, they are ignored here because they do not become correct edges at the final stage.
In the final step of this method, an edge "the plan of the government to raise income tax ROOT" is prepared by grouping the edge "plan of the government to raise income tax ROOT" and the edge "the plan", as shown in FIG. 37C. The edge thus prepared is output as the final correct edge, as shown in FIG. 37D.
The method described in Publication 3 has the problem that a linguistically unnatural structure must be used as the unit of the local analysis result. A typical example of such an unnatural structure is a non-connected local structure. For example, in the step of preparing the initial edges from adjacent words, the words contained in an initial edge need not have any dependency relation between them, so the dependency structure of the initial edge is a non-connected structure. In that step the adjacent words are merely grouped, and it is difficult to give a structural interpretation to a group that is not itself a structure. The edges grow through repeated grouping. Since adjacent words are generally grouped without being connected, non-connected relations remain. As a result, an edge having a non-connected dependency structure, such as "of the government to raise" shown in FIG. 36D, is prepared. If such non-connected relations were not allowed, the method could not prepare the initial edges. Therefore, non-connected relations are in effect allowed.
As described above, this method employs an artificial unit for the edge, such as the non-connected structure. Such a unit runs counter to linguistic intuition. This leads to problems such as difficulty in giving a linguistic interpretation to an edge and difficulty in performing various operations on the local analysis result as a unit, for example comprehending the local analysis result or assigning a priority to the structure. For example, priorities may be assigned to local analysis results in order to prune them during analysis; however, it is not possible to judge intuitively whether the structural interpretation of a group containing two dependency structures, such as "of the government to raise", is correct. This ambiguity of interpretation makes it difficult to define a rule for checking whether a structure is proper.
In addition to Publications 1 to 4 explained above, the following patent publications also disclose natural language processing techniques:
Unexamined Japanese Patent Application KOKAI Publication No. H2-330970 (hereinafter referred to as Publication 5) discloses a natural language construction analyzing system. A feature of the system disclosed in Publication 5 is that it comprises an edge information retaining means for storing all analyzed edge information, so that arbitrary edge information can be referred to by any edge at an arbitrary point in time. That is, by virtue of the edge information retaining means, information on an edge that is not always at hand can be referred to. This realizes improved linguistic processing which can handle, for example, a relational particle in Japanese that influences other words in various ways.
Japanese Patent No. 2,546,245 (hereinafter referred to as Publication 6) discloses a method of generating natural language sentences that utilizes the coactive relation between a translation and a concept in order to select a suitable meaning for the translation of a predicative concept. In this case, the given meaning of the sentence to be generated is based on the dependency structure established between concepts. The technique disclosed in Publication 6 is one example of a context analyzing/generating method that uses a chart in connection with a dependency grammar.
Examined Japanese Patent Application KOKOKU Publication No. H7-89353 (hereinafter referred to as Publication 7) discloses a natural language analyzer which represents the priority of an edge, which is the result of an analyzed context tree, as a vector. Because it represents the priority of the edge as a vector, the analyzer can describe prior knowledge naturally. This feature makes it possible for the analyzer to coordinate prior knowledge easily, to introduce new prior knowledge easily, and to prune branches correctly and significantly.
None of Publications 5 to 7 discloses a technique for limiting the huge amount of calculation (on the order of the 5th power of the number of words) that arises when chart analysis is applied to a dependency grammar.