A database, such as a relational database, an object-oriented database, or another type of data management system, may be used for the administration of data processed by a computer system running one or more application programs or systems. Examples of application programs or systems include an enterprise resource management system, a customer relationship management system, a human resources management system, a supply chain management system, and a financial management system.
The same record may be stored in more than one data management system. When copies of a record that should be identical differ between two or more data management systems, some of the data in one or more of those systems may be incorrect. Data may be inconsistent, for example, when a record is missing from a data management system in which the record should reside or when a record includes incorrect values.
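The two kinds of inconsistency described above, a missing record and a record with incorrect values, can be illustrated with a minimal Python sketch. The sketch assumes that each system's records are held in a dictionary keyed by a record identifier and that one system is treated as the reference copy. The function name `classify_inconsistencies` and the sample field names are hypothetical and are used only for illustration.

```python
from typing import Any, Dict, List, Tuple

# A record is modeled as a mapping of field names to values (an assumption).
Record = Dict[str, Any]


def classify_inconsistencies(
    reference: Dict[str, Record],
    target: Dict[str, Record],
) -> Tuple[List[str], List[str]]:
    """Return (missing, mismatched) record identifiers.

    missing:    identifiers present in `reference` but absent from `target`.
    mismatched: identifiers present in both whose field values differ.
    """
    missing = sorted(key for key in reference if key not in target)
    mismatched = sorted(
        key
        for key in reference
        if key in target and reference[key] != target[key]
    )
    return missing, mismatched


# Hypothetical sample data: C3 is missing from the target system,
# and C2 carries a different value for the "name" field.
reference_system = {
    "C1": {"name": "Ada"},
    "C2": {"name": "Bob"},
    "C3": {"name": "Cy"},
}
target_system = {
    "C1": {"name": "Ada"},
    "C2": {"name": "Bobby"},
}

missing_ids, mismatched_ids = classify_inconsistencies(
    reference_system, target_system
)
```

The sketch treats one system as authoritative only to keep the example short. In practice neither copy may be known to be correct, in which case a mismatch indicates only that the two copies disagree.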
Copying all of the necessary records from one data management system to a second data management system may be an impractical way to correct inconsistent data in some cases, such as when the time required to copy and load a large volume of data into a data management system would disrupt the operation of that data management system. An alternative to copying all of the records is to detect and correct only the inconsistent data.
Inconsistent data may be detected by comparing records stored in two data management systems to identify records that occur in one data management system but not in the other. One method of comparing records to identify duplicate records in a single database includes sorting the records by a field, such as a key or identifier field, that may be used to identify similar or matching records, so that records with matching field values become adjacent. The field values of two adjacent records are then compared to determine whether the field values match. If so, the records may be identified as duplicates of one another.
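The sort-and-compare method for finding duplicates in a single database might be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a definitive implementation: records are assumed to be dictionaries held in memory, the key field is assumed to contain mutually comparable values, and the function name `find_duplicates` and the sample field names are hypothetical.

```python
from operator import itemgetter
from typing import Any, Dict, List, Tuple

# A record is modeled as a mapping of field names to values (an assumption).
Record = Dict[str, Any]


def find_duplicates(
    records: List[Record], key_field: str
) -> List[Tuple[Record, Record]]:
    """Sort records by `key_field`, then compare each record with its neighbor.

    Sorting places records with equal key values next to one another, so a
    single pass over adjacent pairs is enough to find matching keys.
    """
    ordered = sorted(records, key=itemgetter(key_field))
    duplicates: List[Tuple[Record, Record]] = []
    for previous, current in zip(ordered, ordered[1:]):
        if previous[key_field] == current[key_field]:
            duplicates.append((previous, current))
    return duplicates


# Hypothetical sample data: two records share the identifier 3.
sample_records = [
    {"id": 3, "name": "c"},
    {"id": 1, "name": "a"},
    {"id": 3, "name": "c2"},
    {"id": 2, "name": "b"},
]

duplicate_pairs = find_duplicates(sample_records, "id")
```

Because Python's `sorted` is stable, records with equal key values keep their original relative order, so each reported pair lists the earlier record first. The sketch compares only the key field; whether two records with matching keys are true duplicates may depend on comparing further fields.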