Systems exist for collecting information describing characteristics or behavior of separate individuals. Collecting such personal information has many applications, including in national security, law enforcement, marketing, and other fields. An action or transaction may generate data records specific to that action and the individual who performed it. For example, the major credit bureaus maintain and lawfully sell access to databases of personal financial data records for nearly every individual in the United States with a line of credit, a credit card, an auto loan, a mortgage, etc. As another example, databases describing mortgages are also lawfully available.
As technology advances, an ever-increasing amount of personal data is being digitized, and as a result, more and more personal data is becoming lawfully accessible. The increased accessibility of personal data has spawned new industries focused on lawfully mining personal data.
A personal data record may include a number of categories. A data record representing an individual mortgage may include categories such as the name of the individual, his or her city, state, and ZIP code, the individual's employer, the name of the mortgage provider, the interest rate, and the amount of the loan. Data records from different sources may comprise different categories.
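As a rough illustration of records with differing categories, the following Python sketch models two data records as dictionaries keyed by category name. The field names, the sample values, and the helper `shared_categories` are all hypothetical and chosen only for illustration; the text above does not prescribe any particular representation.

```python
# Hypothetical mortgage record using the categories named in the text above.
mortgage_record = {
    "name": "Jane Q. Example",
    "city": "Springfield",
    "state": "IL",
    "zip_code": "62701",
    "employer": "Example Corp",
    "mortgage_provider": "Example Bank",
    "interest_rate": 4.25,
    "loan_amount": 250000,
}

# Hypothetical record from a different source, with a different set of categories.
credit_record = {
    "name": "Jane Example",
    "state": "IL",
    "zip_code": "62701",
    "account_type": "credit card",
    "credit_limit": 5000,
}


def shared_categories(a, b):
    """Return the sorted category names that appear in both records."""
    return sorted(a.keys() & b.keys())


shared = shared_categories(mortgage_record, credit_record)
```

In this sketch only the categories present in both records (here `name`, `state`, and `zip_code`) are available for comparing the two records directly.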
Databases of personal data records may contain distinct records corresponding to the same individual. For example, an individual may have multiple mortgages over the course of a lifetime. Other types of lawfully available databases may maintain a single data record for an individual or social security number. Such records may be updated periodically or as events occur that affect the accuracy of an individual's data record.
Correlating, or linking, different data records that describe the same person can be challenging because contact information for the same individual can change over time. As records receive more updates from different sources, the risk of inconsistency and data-entry errors also grows. In these ways, data records that all describe the same individual can be incongruous, inconsistent, and erroneous in their content.
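The difficulty can be seen in a minimal Python sketch, given here only to illustrate the problem and not as a linking method described in this document. It assumes two hypothetical records for the same individual, one captured before a move and one after, and a hypothetical helper `exact_match` that compares categories value by value.

```python
def exact_match(a, b, categories):
    """Return True only if every listed category is identical in both records."""
    return all(a.get(c) == b.get(c) for c in categories)


# Two hypothetical records describing the same individual at different times.
older = {"name": "Jane Q. Example", "city": "Springfield", "state": "IL", "zip_code": "62701"}
newer = {"name": "Jane Example", "city": "Columbus", "state": "OH", "zip_code": "43215"}

categories = ["name", "city", "state", "zip_code"]

# A naive exact comparison fails to link the two records.
linked = exact_match(older, newer, categories)

# Categories whose values disagree between the two records.
differing = [c for c in categories if older.get(c) != newer.get(c)]
```

Here `linked` is `False` even though both records describe the same person, because the name was entered differently and the address changed.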
To link incongruous data records, improved methods and systems are needed.