1. Field of the Invention
The present invention relates to a computer program product, system, and method for using vertex self-information scores for vertices in an entity graph to determine whether to perform entity resolution on the vertices in the entity graph.
2. Description of the Related Art
Entity resolution refers to techniques for determining whether different records in a database, each having a unique identifier and containing different data, may in fact represent the same real-world entity. To determine a relationship value between data records in a database, the database server may have to perform a pairwise comparison of each possible pair of records. An entity graph may then be formed in which records determined to have a relationship value satisfying a threshold are represented as vertices linked by an edge indicating the relationship among the entities. The resulting entity graph may also have vertices that are indirectly linked through a path of multiple edges. The entity graph may be used to perform entity resolution to determine whether two vertices representing different records are in fact the same entity. For instance, if two records are determined to be related, then they may be updated to indicate the same entity. Various other techniques may be used to determine entity relationships using the graph.
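As an illustration only, the pairwise comparison and graph construction described above may be sketched as follows. The sketch is not taken from any particular implementation: the function names (relationship_value, build_entity_graph, indirectly_linked), the record fields, the threshold of 0.6, and the scoring rule (the fraction of shared fields whose values match exactly) are all assumptions chosen for brevity. A practical system would likely use a more tolerant similarity measure.

```python
from itertools import combinations


def relationship_value(a, b):
    """Hypothetical score: fraction of shared non-id fields with equal values."""
    fields = [f for f in a if f != "id" and f in b]
    if not fields:
        return 0.0
    matches = sum(
        str(a[f]).strip().lower() == str(b[f]).strip().lower() for f in fields
    )
    return matches / len(fields)


def build_entity_graph(records, threshold):
    """Compare every pair of records; link pairs whose score meets the threshold."""
    graph = {r["id"]: set() for r in records}
    for a, b in combinations(records, 2):  # n * (n - 1) / 2 comparisons
        if relationship_value(a, b) >= threshold:
            graph[a["id"]].add(b["id"])
            graph[b["id"]].add(a["id"])
    return graph


def indirectly_linked(graph, start):
    """Vertices reachable from start along edges but not directly adjacent to it."""
    seen = {start}
    stack = [start]
    while stack:
        vertex = stack.pop()
        for neighbor in graph[vertex]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen - {start} - graph[start]


# Assumed sample data: four records with unique identifiers.
records = [
    {"id": 1, "name": "Jon Smith", "email": "jon@example.com", "phone": "555-0100"},
    {"id": 2, "name": "Jon Smith", "email": "jsmith@example.com", "phone": "555-0100"},
    {"id": 3, "name": "J. Smith", "email": "jsmith@example.com", "phone": "555-0100"},
    {"id": 4, "name": "Ann Lee", "email": "ann@example.com", "phone": "555-0199"},
]

graph = build_entity_graph(records, threshold=0.6)
# graph == {1: {2}, 2: {1, 3}, 3: {2}, 4: set()}
# Records 1 and 3 share no edge but are indirectly linked through record 2.
```

In this sample, records 1 and 3 match on only one of three fields and so are not joined by an edge, yet both are linked to record 2. This is the kind of indirect link that an entity resolution step operating on the graph would then have to evaluate.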
There is a need in the art for improved techniques to perform entity resolution on an entity graph.