Relation extraction from sentences can be performed as a pipeline of two separate subtasks: Entity Recognition (ER) and Relation Classification (RC). First, named entities are detected (e.g. “John Doe” --> PERSON), and then relation classification (e.g. “work_for”) is performed on the detected entity mentions. Relation classification is defined as the task of predicting the semantic relation between annotated pairs of nominals (also known as relation arguments). Such annotations, for example the named entity pairs participating in a relation, are often difficult to obtain.
Relation classification is typically treated as a sentence-level multi-class classification problem, which often assumes a single relation instance per sentence. Further, it is conventionally assumed that entity recognition affects relation classification but is not itself affected by relation classification.
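By way of illustration only, the conventional two-step pipeline described above may be sketched as follows. The sketch is a toy under stated assumptions and not an implementation of any cited method: the gazetteer, the rule used for classification, the example organization “Acme Corp” and the function names recognize_entities, classify_relation and extract are all hypothetical. It shows that entity recognition runs first, that its output is passed unchanged to relation classification, and that at most one relation instance is returned per sentence.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy gazetteer; a real system would use a trained entity tagger.
GAZETTEER: Dict[str, str] = {"John Doe": "PERSON", "Acme Corp": "ORGANIZATION"}

Entity = Tuple[str, str]  # (mention text, entity type)


def recognize_entities(sentence: str) -> List[Entity]:
    """Step 1 (ER): return known mentions in order of appearance."""
    found = [(sentence.find(m), m, t) for m, t in GAZETTEER.items() if m in sentence]
    return [(mention, etype) for _, mention, etype in sorted(found)]


def classify_relation(sentence: str, arg1: Entity, arg2: Entity) -> Optional[str]:
    """Step 2 (RC): assign one label to a given pair of relation arguments.

    A hand-written rule stands in for a trained multi-class classifier.
    """
    if arg1[1] == "PERSON" and arg2[1] == "ORGANIZATION" and "works for" in sentence:
        return "work_for"
    return None


def extract(sentence: str) -> Optional[Tuple[str, str, str]]:
    """Run ER, then RC. The RC result never feeds back into ER."""
    entities = recognize_entities(sentence)
    if len(entities) < 2:
        return None
    # Only the first pair is considered: one relation instance per sentence.
    arg1, arg2 = entities[0], entities[1]
    label = classify_relation(sentence, arg1, arg2)
    return (arg1[0], label, arg2[0]) if label else None


result = extract("John Doe works for Acme Corp.")
```

In this sketch an entity missed in the first step can never be recovered by the second step, which reflects the one-directional dependence between the two subtasks noted above.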
For example, deep learning methods such as recurrent and convolutional neural networks (Daojian Zeng, Kang Liu, Siwei Lai, Guangyou Zhou, and Jun Zhao, 2014, “Relation classification via convolutional deep neural network” in “Proceedings of COLING”; Dongxu Zhang and Dong Wang, 2015, “Relation classification via recurrent neural network” in arXiv; Thien Huu Nguyen and Ralph Grishman, 2015, “Relation extraction: Perspective from convolutional neural networks” in “Proceedings of the NAACL Workshop on Vector Space Modeling for NLP”) treat relation classification as a sentence-level multi-class classification problem and rely on the relation arguments being provided in the sentence. Therefore, these methods are incapable of handling multiple relation instances in a sentence and cannot detect the corresponding entity mention pairs participating in the detected relation.
Accordingly, it is one aspect of the invention to improve both entity and relation extraction in natural language processing.