Companies store and manage data in a wide variety of data sources, such as information systems and databases. For example, a company may store and manage customer-related data in a customer relationship management (CRM) system used by multiple employees, such as sales representatives. Many data quality issues arise in such information systems and databases from human data entry errors or a lack of diligence in entering or managing the data. In one case, a sales representative may enter incomplete contact information for a customer. For example, the sales representative may omit the zip code from the customer's address. In another case, another sales representative may create duplicate data by creating a second record for a customer already entered by the first sales representative. For example, the sales representative may enter a nickname or different name for the company in the CRM system, so that the new record appears to refer to a different company than the customer record entered by the other sales representative. In other cases, a sales representative may misspell a company name, a person's name, or other portions of the contact information. In yet another case, a sales representative may enter the wrong address or wrong telephone number for a customer. As a result of the various ways people enter data, data sources may contain a large amount of inaccurate, incomplete, or duplicate data to manage and correct.
Data quality issues, such as inaccurate, incomplete, and duplicate data, can be costly in time, expense, and resources. For example, a sales representative may spend hours searching through duplicate records to find the current notes and account information of a customer. As such, poor data quality may result in lost employee productivity. In another example, marketing may not be able to accurately target customers and prospects because of outdated and incorrect contact information. For example, a marketing campaign may have a high number of email or mail returns from incorrect address information, a high number of duplicate mailings, or marketing information sent to the wrong person. Thus, data quality issues may result in extra expense and in missed opportunities for sales, customer relationships, and goodwill. In some cases, information technology (IT) personnel may be pulled away from important business matters to address the data quality issues in an information system. For example, an IT resource may be used to fix or merge duplicate records, and may work on such issues one at a time as they are reported. As a result, data quality issues may lead to less efficient use of company resources and time.
Additionally, companies rely on corporate data to make many business decisions. Various company reports may be created from one or more data sources having data quality issues. For example, a company may generate a report that aggregates errors and duplicates while rolling up the data into a report view. In some cases, the company must spend time, expense, and resources to verify and correct the data quality issues to have an accurate report. In other cases, the company report may have data quality issues that are unknown or not yet detected. As such, the company may make business decisions based on inaccurate information. The consumption of cost, time, and resources related to data quality issues is further compounded by the multiple data sources and information systems that companies use, maintain, and rely on for business decisions. For example, companies may have data quality issues in their accounting systems, manufacturing systems, product data and lifecycle management systems, supply chain management systems, and enterprise resource planning systems.
Therefore, systems and methods are desired to more efficiently address and improve the quality of data in data sources, such as information systems and databases.