Information Extraction (IE) is the task of extracting structured information from unstructured, machine-readable text. The ubiquity of text data in modern data sources has made IE a critical component in a wide range of applications; by way of non-restrictive example, such applications can include brand management, customer relationship management, regulatory compliance, and life sciences. To develop an IE program (also referred to herein as an “extractor”), a common practice is to construct patterns, either manually or automatically, and to apply the patterns to an input text in order to extract information from it.
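By way of illustration only, a pattern-based extractor of the kind described above can be sketched as follows. The pattern, the relation it targets (a company acquisition), and the names `ACQUISITION` and `extract_acquisitions` are hypothetical and chosen solely for this example; a practical extractor would typically rely on far richer patterns.

```python
import re
from typing import Dict, List

# Hypothetical surface pattern: "<Acquirer> acquired <Target>", where each
# name is a single capitalized word. This is an assumption for illustration.
ACQUISITION = re.compile(
    r"(?P<acquirer>[A-Z][A-Za-z]+) acquired (?P<target>[A-Z][A-Za-z]+)"
)


def extract_acquisitions(text: str) -> List[Dict[str, str]]:
    """Return one structured record per pattern match in the input text."""
    return [match.groupdict() for match in ACQUISITION.finditer(text)]


records = extract_acquisitions(
    "Acme acquired Globex. Later, Initech acquired Hooli."
)
```

Here the unstructured sentence is turned into a list of records with `acquirer` and `target` fields, which is the structured output an extractor is expected to produce.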
Generally, as dependency parsers have become faster and more reliable, deep syntactic information has gained popularity as the basis for extraction patterns. While such patterns can be produced automatically by way of machine learning, it has been shown that hand-crafted patterns often outperform machine learning alternatives. However, such crafting is generally very labor-intensive and requires developers who are sufficiently trained in natural language processing and capable of understanding and reasoning about dependency trees. Accordingly, among other problems, this can set an unreasonably high bar for the general use of hand-crafted patterns.
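To make concrete what reasoning about a dependency tree involves, the following is a minimal sketch of a hand-crafted syntactic pattern. It assumes a parse has already been produced by some dependency parser; the parse shown is written by hand, the labels `nsubj` and `obj` follow Universal Dependencies conventions as an assumption, and the names `Token`, `child`, and `match_acquire` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Token:
    text: str
    lemma: str
    head: int    # index of the governing token, -1 for the root
    deprel: str  # dependency label on the edge to the head


def child(tokens: List[Token], head_idx: int, deprel: str) -> Optional[Token]:
    """Return the first dependent of head_idx with the given label, if any."""
    for tok in tokens:
        if tok.head == head_idx and tok.deprel == deprel:
            return tok
    return None


def match_acquire(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Pattern: a verb with lemma 'acquire' that has both a subject and an object."""
    results = []
    for idx, tok in enumerate(tokens):
        if tok.lemma != "acquire":
            continue
        subj = child(tokens, idx, "nsubj")
        obj = child(tokens, idx, "obj")
        if subj is not None and obj is not None:
            results.append((subj.text, obj.text))
    return results


# Hand-written parse of "Acme quietly acquired Globex" (an assumption, not parser output).
parse = [
    Token("Acme", "Acme", 2, "nsubj"),
    Token("quietly", "quietly", 2, "advmod"),
    Token("acquired", "acquire", -1, "root"),
    Token("Globex", "Globex", 2, "obj"),
]
pairs = match_acquire(parse)
```

Because the pattern is stated over grammatical relations rather than word order, the intervening adverb does not affect the match. Writing such a pattern, however, presupposes familiarity with heads, dependents, and relation labels, which is the expertise burden noted above.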