CPC G06F 40/169 (2020.01) [G06F 40/186 (2020.01); G06F 40/279 (2020.01)] | 17 Claims |
1. A method of relation extraction, comprising:
step S1, determining a text to be annotated, a plurality of correct seeds and a plurality of error seeds, wherein each of entities in each of sentences of the text to be annotated has been marked by tags as a first entity or a second entity, and each of the plurality of correct seeds and the plurality of error seeds is an entity pair consisting of the first entity and the second entity;
step S2, traversing each of the sentences of the text to be annotated based on the plurality of correct seeds to generate at least one first template;
step S3, traversing each of the sentences of the text to be annotated based on the at least one first template to match at least one new seed;
step S4, evaluating the at least one new seed having been matched, wherein a qualified new seed is used as a correct seed;
step S5, replacing the correct seeds in step S2 with the correct seeds acquired in step S4, and repeating steps S2 to S4 until a selected condition is met;
step S6, outputting matched correct seeds and a classification relation between the first entity and the second entity in each of the matched correct seeds;
wherein the step S2 comprises selecting the at least one first template based on the plurality of correct seeds and the plurality of error seeds;
the step S3 comprises traversing each of the sentences of the text to be annotated based on a selected first template to match at least one new seed;
and wherein the selecting the at least one first template based on the plurality of correct seeds and the plurality of error seeds comprises:
matching entity pairs in the text to be annotated by using the at least one first template;
determining whether the entity pairs matched by the at least one first template are the correct seeds or the error seeds based on the plurality of correct seeds and the plurality of error seeds;
determining a number of correct seeds and a number of error seeds in the entity pairs matched by the at least one first template;
calculating an evaluation index of each of the at least one first template based on the number of correct seeds and the number of error seeds in the entity pairs matched by the at least one first template; and
selecting the at least one first template based on the evaluation index of each of the at least one first template; and
training a deep learning model to acquire a relationship extraction model by using at least some of the sentences in the text having been annotated.
|
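The iterative loop of steps S2 to S5 in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions made only for illustration: a template is taken to be the literal text between the two tagged entities, the evaluation index is assumed to be the ratio of correct seeds to all known seeds among the matched pairs, a new seed is assumed to qualify when it is not a known error seed, and the stopping condition is assumed to be "no new qualified seeds, or a maximum number of rounds". The names `Sentence`, `generate_templates`, `score_template`, `bootstrap`, `threshold` and `max_rounds` are hypothetical. The classification relation of step S6 and the deep learning training step are omitted.

```python
from typing import NamedTuple


class Sentence(NamedTuple):
    first: str   # tagged first entity
    middle: str  # text between the two tagged entities (assumed template form)
    second: str  # tagged second entity


def generate_templates(sentences, seeds):
    """S2 (generation): collect the context of each sentence containing a seed pair."""
    return {s.middle for s in sentences if (s.first, s.second) in seeds}


def score_template(template, sentences, correct, error):
    """Assumed evaluation index: correct / (correct + error) among matched pairs."""
    pairs = {(s.first, s.second) for s in sentences if s.middle == template}
    pos = len(pairs & correct)  # matched pairs that are known correct seeds
    neg = len(pairs & error)    # matched pairs that are known error seeds
    return pos / (pos + neg) if pos + neg else 0.0


def bootstrap(sentences, correct_seeds, error_seeds, threshold=0.6, max_rounds=10):
    matched = set(correct_seeds)  # every correct seed found so far (S6 output)
    active = set(correct_seeds)   # seeds used to generate templates this round
    error = set(error_seeds)
    for _ in range(max_rounds):
        # S2: generate templates, then keep those whose index reaches the threshold.
        templates = generate_templates(sentences, active)
        selected = {
            t for t in templates
            if score_template(t, sentences, matched, error) >= threshold
        }
        # S3: match entity pairs with the selected templates.
        candidates = {(s.first, s.second) for s in sentences if s.middle in selected}
        # S4 (assumed check): drop pairs already known or listed as error seeds.
        qualified = candidates - matched - error
        if not qualified:  # assumed stopping condition
            break
        matched |= qualified
        active = qualified  # S5: the newly acquired seeds drive the next round
    return matched


SENTENCES = [
    Sentence("fever", "is a symptom of", "flu"),
    Sentence("cough", "is a symptom of", "cold"),
    Sentence("cough", "commonly indicates", "cold"),
    Sentence("wheezing", "commonly indicates", "asthma"),
    Sentence("fever", "and", "flu"),
    Sentence("fever", "and", "fracture"),
    Sentence("rash", "and", "flu"),
]
CORRECT = {("fever", "flu")}
ERROR = {("fever", "fracture")}

result = bootstrap(SENTENCES, CORRECT, ERROR)
```

In this toy data the template "and" matches one correct seed and one error seed, so under the assumed index it scores 0.5 and is not selected at the assumed threshold of 0.6; the pair ("rash", "flu") is therefore never promoted to a correct seed.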