Today, information management has become a critical, if not essential, aspect of successfully conducting business. Customer Relationship Management (CRM) systems and similar applications are used to track information relating to customers, employees, supply chains, payroll, purchase orders, and numerous other traceable items. An important function of data collection is generating reports, analyzing the data organized by these reports, and taking action to continuously improve the productivity, cost, quality, and overall operations of the enterprise. As data collection systems grow, the amount of information collected and the central processing resources required to process this information can be substantial.
It is quite common for enterprises to utilize an on-line system for collecting information in real time, and an off-line system for extracting historical data to generate reports for analysis. During the transfer process from the on-line system to the off-line system, anomalies commonly occur, such as an error in database indexing, failed CPU processes due to limited disk space, memory allocation errors in the on-line and/or off-line systems, communication errors that sever the connection between the systems, and so on. These errors can in turn disrupt the information extraction process such that duplicate entries or records may result.
This is a common problem experienced by large enterprises. To work around this issue, human analysts are utilized to scan the historical information collected by the off-line system and remove duplicate entries. Without this function, misleading reports might be provided to management, which in turn can have an adverse effect on business operations, as decisions are made in reliance on the accuracy of such reports. Although the function of the analysts is very important, the scanning process is costly and often prone to human error.