1. Field of the Invention
Embodiments of the invention generally relate to processing identity records in an entity resolution system, and more particularly, to adding entities to a group of entity resolution candidates.
2. Description of the Related Art
In an entity resolution system, identity records are loaded and resolved against known identities to derive a disambiguated entity repository. An “entity” generally refers to an organizational unit used to store identity records that are resolved at a “zero-degree relationship.” That is, each identity record associated with a given entity is believed to describe the same person, place, or thing. Thus, one entity may reference multiple individual identities. This is frequently benign, e.g., in a case where an entity includes two identities, a first with identity records identifying a woman based on a familial surname and a second identity with records identifying the same woman based on a married surname. Of course, in other cases, multiple identities may be an indication of mischief or a problem, e.g., in a case where one individual is impersonating another, using a fictitious identity, or engaging in some form of identity theft.
In entity resolution systems, a single entity may have multiple attribute values for the same attribute type. This often results from multiple records each providing a value for a given attribute. For example, an entity may have multiple addresses, phone numbers, driver's license numbers, names, etc. In some cases, different values for an attribute may be appropriate (e.g., when a person changes telephone numbers or moves from one place to another). Multiple attribute values may also exist due to the variety of systems from which identity records are drawn. Moreover, different record systems may introduce typos, transpose characters, make system-specific alterations (such as truncating an address), or simply format the same information differently.
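By way of illustration only, the following minimal sketch shows one way an entity might hold several values per attribute type while collapsing values that differ only in formatting. The class name `EntityProfile`, the helper `normalize_phone`, and the digits-only normalization rule are assumptions made for this example and are not drawn from any particular entity resolution system.

```python
import re
from collections import defaultdict


def normalize_phone(raw: str) -> str:
    """Hypothetical rule: keep digits only so formatting differences vanish."""
    return re.sub(r"\D", "", raw)


class EntityProfile:
    """Illustrative container mapping each attribute type to a set of values."""

    def __init__(self, entity_id: int):
        self.entity_id = entity_id
        # attribute type -> set of distinct values seen for this entity
        self.attributes = defaultdict(set)

    def add_value(self, attr_type: str, value: str) -> None:
        # Only phone numbers are normalized in this sketch; other attribute
        # types are stored exactly as received.
        if attr_type == "phone":
            value = normalize_phone(value)
        self.attributes[attr_type].add(value)


profile = EntityProfile(1)
profile.add_value("phone", "(555) 123-4567")  # same number, first format
profile.add_value("phone", "555.123.4567")    # same number, second format
profile.add_value("phone", "555-987-6543")    # a genuinely different number
profile.add_value("address", "12 Main St")
profile.add_value("address", "98 Oak Ave")    # e.g., after a move
```

In this sketch the two differently formatted phone numbers are stored once, while the second phone number and both addresses are retained as distinct, legitimate values.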
One task performed by an entity resolution system is to resolve incoming identity records against known identities. In other words, when a new identity record is received, the entity resolution system may determine if the new identity record refers to a known entity. If so, then the new identity record may be associated with that entity. If not, then a new entity may be created for the new identity record.
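The resolve-or-create decision described above may be sketched as follows. This is a simplified illustration under stated assumptions: the `Repository` class, its `find_match` and `resolve` methods, and the rule that two records match when they share a driver's license number are all hypothetical, and an actual system would typically weigh many attributes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entity:
    entity_id: int
    records: list = field(default_factory=list)


class Repository:
    """Illustrative in-memory store of known entities."""

    def __init__(self):
        self.entities = []
        self._next_id = 1

    def find_match(self, record: dict) -> Optional[Entity]:
        # Assumed matching rule: an identical driver's license number.
        license_no = record.get("license")
        if license_no is None:
            return None
        for entity in self.entities:
            if any(r.get("license") == license_no for r in entity.records):
                return entity
        return None

    def resolve(self, record: dict) -> Entity:
        entity = self.find_match(record)
        if entity is None:
            # No known entity matches, so create a new one for this record.
            entity = Entity(self._next_id)
            self._next_id += 1
            self.entities.append(entity)
        # In either case, associate the record with the chosen entity.
        entity.records.append(record)
        return entity


repo = Repository()
first = repo.resolve({"name": "Jane Smith", "license": "D123"})
second = repo.resolve({"name": "Jane Jones", "license": "D123"})
third = repo.resolve({"name": "John Doe", "license": "X999"})
```

Here the first two records, which carry different surnames but the same license number, resolve to a single entity, while the third record causes a new entity to be created.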