In natural language, it is not uncommon to refer to entities by different descriptions. For example, pronouns are commonly used to take the place of nouns. Various other descriptions, or different forms of a reference, may also be used to refer to an entity. Consider the following portions of text as an example:
“Pablo Picasso was born in Malaga.”
“The Spanish painter became famous for his varied styles.”
“Among his paintings is the large-scale Guernica.”
“He painted this disturbing masterpiece during the Spanish Civil War.”
“Picasso died in 1973.”
A range of linguistic variation is encountered. For example, two different names are used, “Pablo Picasso” and “Picasso.” A definite description, “the Spanish painter,” and two pronouns, “his” and “he,” are all used to refer to Picasso. Two different expressions are used to refer to a painting: the name of the piece, “Guernica,” and a demonstrative description, “this disturbing masterpiece.”
Two linguistic expressions may be said to be coreferential if they have the same referent, that is, if they refer to the same entity. A second phrase can be an anaphor of a first phrase, in which case the first phrase is the antecedent of the second phrase. Knowledge of the referent of the antecedent may be necessary to determine the referent of the anaphor. The general task of finding coreferential expressions, anaphors, and their antecedents within a document can be referred to as coreference resolution. Coreference resolution is the process of establishing that two expressions refer to the same referent, without necessarily establishing what that referent is. Reference resolution is the process of establishing what the referent is.
For clusters of expressions that are coreferential, irrespective of their anaphoric relationships, the expressions can be referred to as aliases of one another. In the example above, the expressions “Pablo Picasso,” “the Spanish painter,” “his,” “he,” and “Picasso” form an alias cluster referring to Picasso.
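As an illustration only, the grouping of coreferential expressions into alias clusters might be sketched as follows. The sketch assumes that pairwise coreference links have already been found by some other means, and it merges linked expressions with a simple union-find structure. The function name `alias_clusters`, the list of links, and the use of bare strings as mentions are all hypothetical simplifications; an actual system would likely identify each mention by its position in the document, since the same word (such as “his”) may occur more than once. Note that the result states only which expressions corefer, not what entity they refer to.

```python
from collections import defaultdict


def alias_clusters(mentions, coreferent_pairs):
    """Group mentions into alias clusters from pairwise coreference links.

    Hypothetical helper: uses union-find so that links chain transitively.
    """
    # Each mention starts as the representative of its own cluster.
    parent = {m: m for m in mentions}

    def find(m):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    # Merge the clusters of every linked pair (anaphor, antecedent).
    for anaphor, antecedent in coreferent_pairs:
        parent[find(anaphor)] = find(antecedent)

    # Collect mentions by representative, preserving input order.
    clusters = defaultdict(list)
    for m in mentions:
        clusters[find(m)].append(m)
    return list(clusters.values())


# Mentions taken from the example text; the links are assumed, not computed.
mentions = [
    "Pablo Picasso", "the Spanish painter", "his", "he", "Picasso",
    "Guernica", "this disturbing masterpiece",
]
pairs = [
    ("the Spanish painter", "Pablo Picasso"),
    ("his", "the Spanish painter"),
    ("he", "his"),
    ("Picasso", "Pablo Picasso"),
    ("this disturbing masterpiece", "Guernica"),
]

clusters = alias_clusters(mentions, pairs)
for cluster in clusters:
    print(cluster)
```

Because the links are merged transitively, “he” joins the same cluster as “Pablo Picasso” even though the two are never linked directly, which mirrors the idea that aliases are defined irrespective of their individual anaphoric relationships.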
Natural language expressions often display ambiguity. Ambiguity occurs when an expression can be interpreted with more than one meaning. For example, the sentence “The duck is ready to eat” can be interpreted as asserting either that the duck is properly cooked or that the duck is hungry and needs to be fed.
Coreference resolution and ambiguity resolution are two examples of natural language processing operations that can be used to mechanically support language as commonly expressed by human users. Information processing systems, such as text indexing and querying in support of information searching, may benefit from increased application of natural language processing systems.
It is with respect to these considerations and others that the disclosure made herein is presented.