The present invention relates generally to databases, and more particularly to matching new customer records to existing customer records in a large business database.
A large business database often contains duplicate records for the same customer. The duplications typically arise from spelling errors or from the use of multiple methods of entering customer records into the database. These duplications cause several problems for the end-user. One problem is that a customer whose records have been duplicated may receive multiple mailings from the end-user. Another problem is that the end-user may never have consistent information about each customer. The information becomes inconsistent because each time the customer's information has to be updated, only one of the duplicate records is updated, and there is no assurance that the record being revised is the most recently updated one. A third problem with duplicated records is that the end-user is unable to determine how much business activity has been generated by a particular customer.
Known tools for finding similar records include library-style catalogue retrieval systems. These library-style catalogue retrieval systems can search a large database of records to find matches that are similar to a query entered by an end-user. Typically, they use phonetic-based algorithms to determine the closeness of names, addresses, or word strings. A problem with these systems is that they are useful only for searching through an existing customer database and are unable to compress a large customer database having multiple repetitions of customer records.
Therefore, there is a need for a methodology that processes new customer records, checks the new records for poor quality, normalizes and validates the new records, and matches the new records to existing customer records in order to determine uniqueness. Normalizing, validating, and matching the customer records will allow an end-user to avoid wasted mailings, maintain consistent information about each customer, and determine how much business activity has been generated by a particular customer.
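By way of illustration only, the normalize, validate, and match steps named above might be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of the invention: the field names `name` and `address`, the helper names `normalize`, `is_valid`, `similarity`, and `match_record`, and the 0.85 threshold are all hypothetical, and a generic string-similarity ratio from the Python standard library stands in for whatever matching technique an actual embodiment would use.

```python
import re
from difflib import SequenceMatcher

# Hypothetical field names; a real customer record would likely carry more.
FIELDS = ("name", "address")


def normalize(record):
    """Lowercase each field, replace punctuation with spaces, collapse whitespace."""
    cleaned = {}
    for key, value in record.items():
        text = re.sub(r"[^a-z0-9 ]", " ", str(value).lower())
        cleaned[key] = " ".join(text.split())
    return cleaned


def is_valid(record):
    """Treat a record as poor quality if any compared field is missing or empty."""
    return all(record.get(field) for field in FIELDS)


def similarity(a, b):
    """Average the difflib similarity ratio (0.0 to 1.0) over the compared fields."""
    ratios = [SequenceMatcher(None, a[f], b[f]).ratio() for f in FIELDS]
    return sum(ratios) / len(ratios)


def match_record(new_record, existing_records, threshold=0.85):
    """Return the index of the best-matching existing record, or None if unique.

    The threshold is an assumed illustrative value, not one taken from the text.
    """
    new_norm = normalize(new_record)
    if not is_valid(new_norm):
        raise ValueError("new record failed validation")

    best_index, best_score = None, 0.0
    for index, existing in enumerate(existing_records):
        existing_norm = normalize(existing)
        if not is_valid(existing_norm):
            continue  # skip poor-quality existing records
        score = similarity(new_norm, existing_norm)
        if score > best_score:
            best_index, best_score = index, score

    return best_index if best_score >= threshold else None


existing = [
    {"name": "Acme Corp.", "address": "12 Main St."},
    {"name": "Globex Inc", "address": "500 Oak Avenue"},
]
duplicate = {"name": "ACME Corp", "address": "12 Main St"}
newcomer = {"name": "Initech", "address": "9 Elm Road"}

print(match_record(duplicate, existing))
print(match_record(newcomer, existing))
```

In this sketch a new record that differs from an existing record only in case and punctuation is resolved to that existing record, while a record with no close counterpart is reported as unique and could then be added to the database as a new customer.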