Ambiguity poses a serious challenge to the organization of information. For example, collecting information related to a particular entity is complicated by the existence of other entities with the same name. Overloading of entity names is common, whether the entity is a person (“Michael Jackson”), a place (“Paris”), or even a concept (“garbage collection”).
It is frequently useful to know the specific entity to which a document is referring. For example, if the goal is to extract, organize, and summarize information about Michael Jackson (the singer), one will want to look only at documents about Michael Jackson (the singer), and not at documents about other Michael Jacksons. The ambiguity of language, of names, and of other common properties makes it difficult to determine which entity a document is referring to. Therefore, what is needed is a method for disambiguating references to entities in a document.