Enterprise database systems store vast amounts of data received from one or more sources. This data is typically stored in the form of relational database table records. Since the data may be received from different sources and/or at different times, a database table may include several records representing the same object. For example, a table storing personal addresses may include several records associated with the same person, which differ according to the source or reception time of the address information stored therein. Such records may be considered duplicate records.
Once duplicate records are identified, it may be desirable to consolidate the duplicate records into a single master record. This requires selecting one of the duplicate records as the master record, into which the “best” column values identified from the duplicate records will be stored.
Some systems support master record selection rules in which priorities for input sources are configured and one or more priority fields are chained together. In other systems, the first record of each group of duplicate records is simply chosen as the master record. Both approaches present difficulties and inefficiencies that hamper the selection of a master record suited to a user's needs.
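For illustration only, the two conventional approaches described above might be sketched as follows. The record layout, the field names (`source`, `updated`), the source names, and the `SOURCE_PRIORITY` table are assumptions introduced for this example and are not taken from any particular system.

```python
from datetime import date
from typing import Dict, List

Record = Dict[str, str]

# Hypothetical source priorities: a lower number means a more trusted source.
SOURCE_PRIORITY = {"crm": 0, "web_form": 1, "import": 2}


def select_master_by_priority(group: List[Record]) -> Record:
    """Chain two priority fields: source rank first, then most recent update."""
    def key(record: Record):
        # Unknown sources rank after all configured sources.
        rank = SOURCE_PRIORITY.get(record["source"], len(SOURCE_PRIORITY))
        # Negate the ordinal so that later dates sort first.
        recency = -date.fromisoformat(record["updated"]).toordinal()
        return (rank, recency)
    return min(group, key=key)


def select_master_first(group: List[Record]) -> Record:
    """Choose the first record of the duplicate group as the master record."""
    return group[0]


# One group of duplicate records for the same person (made-up sample data).
group = [
    {"id": "1", "source": "import", "updated": "2021-03-01", "city": "Berlin"},
    {"id": "2", "source": "crm", "updated": "2020-01-15", "city": ""},
    {"id": "3", "source": "crm", "updated": "2022-07-09", "city": "Munich"},
]

by_priority = select_master_by_priority(group)
by_position = select_master_first(group)
```

With this sample group, the priority-based rule selects record 3 and the first-record rule selects record 1. Neither rule examines the column values themselves, so either may select a master record that is not the one a user would prefer.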