Information extraction (i.e., the process of deriving structured information from unstructured text) is an important aspect of many enterprise applications, including semantic search, business intelligence over unstructured data, and data mashups. Systems that perform information extraction for enterprise applications use a set of information extraction rules to define the types of patterns to be identified in the text.

An information extraction system expresses the rules in a rule language, such as Java Annotation Patterns Engine (JAPE), Annotation Query Language (AQL), or XLog. JAPE is a component of the open-source General Architecture for Text Engineering (GATE) platform. AQL is an annotation rule language that specifies rules for SystemT, an information extraction system developed by International Business Machines Corporation located in Armonk, N.Y. XLog is a variant of Datalog with embedded procedural predicates.

For example, a system to identify person names in unstructured text may include a number of information extraction rules, such as the following rule, which is expressed in English for clarity: If a match of a dictionary of common first names occurs in the text, followed immediately by a capitalized word, mark the two words as a "candidate person name."

The information extraction rules are used in information extraction systems to feed structured information directly into important business processes or as the feature extraction stage of various machine learning algorithms. Since the downstream processing tends to be highly sensitive to the quality of the results that the information extraction rules produce, it is important for the extracted information to have very high precision and recall (i.e., the rules produce very few false positive and false negative results). Developing a highly accurate set of information extraction rules with known techniques requires substantial skill and considerable effort.
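The "candidate person name" rule described above can be sketched in Python. The dictionary of first names and the tokenization are assumptions for illustration; a real extractor would use a much larger dictionary and a proper tokenizer rather than a simple regular expression:

```python
import re

# Hypothetical dictionary of common first names (illustrative only;
# a production extractor would load thousands of entries).
FIRST_NAMES = {"anna", "james", "john", "maria"}

def find_candidate_person_names(text):
    """Return (first, second) word pairs where a dictionary first name
    is immediately followed by a capitalized word."""
    # Naive tokenization: sequences of letters. Punctuation and sentence
    # boundaries between tokens are ignored in this sketch.
    tokens = re.findall(r"[A-Za-z]+", text)
    candidates = []
    for cur, nxt in zip(tokens, tokens[1:]):
        # Rule: dictionary match followed by a capitalized word.
        if cur.lower() in FIRST_NAMES and re.fullmatch(r"[A-Z][a-z]+", nxt):
            candidates.append((cur, nxt))
    return candidates
```

A rule this simple illustrates why refinement is needed: it would also mark "Anna Karenina" in a sentence about the novel as a candidate person name, producing a false positive that a developer must later remove.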
Standard practice is for the developer to go through a complex iterative process: (1) build an initial set of rules; (2) run the rules over a set of test documents and identify incorrect results; (3) examine the rules and determine refinements that can be made to the rule sets to remove incorrect results; and (4) repeat the process. Of these steps, the manual task of identifying rule refinements is by far the most time-consuming. An extractor may have a significant number of rules (e.g., hundreds of rules), and the interactions between these rules may be very complex. When changing rules to remove a given incorrect result, a rule developer must be careful to minimize the effects on existing correct results. The manual work required to identify possible changes for a single false positive result and minimize the effects on existing correct results can take a significant amount of time. Thus, there exists a need to overcome at least one of the preceding deficiencies and limitations of the related art.
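Step (2) of the iterative process above, running the rules over test documents and identifying incorrect results, can be sketched as a comparison against gold-standard annotations. The `evaluate` function and the tuple-of-spans document format are assumptions for illustration, not part of any particular extraction system:

```python
def evaluate(extractor, labeled_docs):
    """Run an extractor over labeled test documents and report precision,
    recall, and the concrete false positives/negatives that drive the
    manual refinement step.

    labeled_docs: iterable of (text, gold_results) pairs, where
    gold_results is the set of results a correct extractor should return.
    """
    tp = 0
    false_positives, false_negatives = [], []
    for text, gold in labeled_docs:
        predicted = set(extractor(text))
        gold = set(gold)
        tp += len(predicted & gold)
        # Results the rules produced that the gold standard lacks.
        false_positives.extend((text, r) for r in predicted - gold)
        # Gold results the rules failed to produce.
        false_negatives.extend((text, r) for r in gold - predicted)
    fp, fn = len(false_positives), len(false_negatives)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, false_positives, false_negatives
```

The developer would inspect the returned false positives and false negatives, adjust the rule set, and rerun the evaluation, repeating until precision and recall are acceptable; it is this inspect-and-adjust step that the preceding paragraph identifies as the most time-consuming.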