1. Field of the Invention
The field of the invention is data processing, or, more specifically, methods, apparatus, and products for natural language processing (‘NLP’).
2. Description of Related Art
Technologies configured to derive knowledge from unstructured data and use that knowledge to advise on matters spanning a broad set of use cases from a number of different industries are becoming more common. These technologies generally rely on ingesting a large amount of unstructured data, gaining knowledge and understanding from that data, and subsequently using that knowledge and understanding to answer, with a level of confidence, various questions over that evidence source domain. Such technologies generally operate by generating a large number of hypotheses (candidate answers for a particular question) and then scoring the likelihood that each hypothesis is a correct answer using a variety of natural language processing techniques. Techniques for scoring candidate answers are typically an assortment of natural language processing algorithms that take into account various parameters, such as how well the terms found in an answer match those in the question, whether the candidate answer is expressed in the same logical form as the question, and whether the candidate answer is of the same lexical answer type as that expected by the question, along with a number of other techniques that involve analysis of unstructured text. These techniques work well when finding answers to questions based on alignment of a set of facts, but are less effective in dealing with unstructured data that is heavily criteria oriented. An example of such criteria-oriented unstructured data is a set of guidelines used to approve or deny reimbursement from a medical insurance company to an insured patient for a specific medical procedure. This type of unstructured data typically contains a number of conjunctions indicating either that a list of conditions must all be true for a given guideline to apply or that any member of a list of conditions is sufficient for the guideline to apply.
That is, this type of unstructured data includes various logical operators that join various conditions or criteria into a single test or criterion.
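By way of illustration only, the logical structure of such criteria-oriented text can be sketched in a few lines of Python. The sketch below is not taken from any particular system; the class names (Condition, AllOf, AnyOf), the evaluate function, and the sample guideline and its thresholds are all hypothetical and are assumed solely to show how an "all of" list and an "any of" list combine into a single criterion.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Union

Facts = Dict[str, Any]


@dataclass
class Condition:
    """A single atomic condition (hypothetical representation)."""
    description: str
    test: Callable[[Facts], bool]


@dataclass
class AllOf:
    """Every member must hold (logical AND)."""
    members: List["Criterion"]


@dataclass
class AnyOf:
    """At least one member must hold (logical OR)."""
    members: List["Criterion"]


Criterion = Union[Condition, AllOf, AnyOf]


def evaluate(criterion: Criterion, facts: Facts) -> bool:
    """Recursively evaluate a criterion tree against a set of facts."""
    if isinstance(criterion, Condition):
        return criterion.test(facts)
    if isinstance(criterion, AllOf):
        return all(evaluate(m, facts) for m in criterion.members)
    if isinstance(criterion, AnyOf):
        return any(evaluate(m, facts) for m in criterion.members)
    raise TypeError(f"unsupported criterion: {criterion!r}")


# Hypothetical guideline, invented for illustration:
# the patient is an adult AND (prior treatment failed OR symptoms
# have lasted at least six months).
guideline = AllOf([
    Condition("patient is an adult", lambda f: f.get("age", 0) >= 18),
    AnyOf([
        Condition("prior treatment failed",
                  lambda f: f.get("prior_treatment_failed", False)),
        Condition("symptoms for six months or more",
                  lambda f: f.get("symptom_months", 0) >= 6),
    ]),
])

approved = evaluate(guideline, {"age": 42, "symptom_months": 8})
denied = evaluate(guideline, {"age": 42, "symptom_months": 2})
```

In this sketch the nested AnyOf inside the AllOf mirrors a guideline in which one list of conditions must all be true while a second list requires only one satisfied member.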