In high-volume transaction computer systems, it is often difficult to check for duplicates while a database is being filled with records. One reason is that, when the tables are generated, the table names and at least some of the fields are commonly unknown up front. Additionally, the structure of some tables may be dynamic, making it even more difficult to identify duplicates during database loading. Furthermore, a number of records in the tables require rapid deletion, which makes it difficult to implement a duplicate check with built-in database methods such as uniqueness constraints for database indexes. As soon as the data is deleted (due to legal compliance or in order to save disk space), the duplicate check cannot operate solely at the database level.
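By way of illustration only, the following minimal sketch shows how a uniqueness constraint can lose its ability to detect a duplicate once the original record has been deleted. It uses Python's standard sqlite3 module with an in-memory database. The table name billable_items, the column item_id, and the function load_with_unique_index are hypothetical names chosen for this example and do not describe any particular system.

```python
import sqlite3


def load_with_unique_index(record_ids, delete_after_insert):
    """Load record IDs into a table guarded by a unique index.

    Returns the list of IDs that the database rejected as duplicates.
    Table and column names are hypothetical, for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE billable_items (item_id TEXT)")
    conn.execute("CREATE UNIQUE INDEX idx_item_id ON billable_items (item_id)")

    rejected = []
    for item_id in record_ids:
        try:
            conn.execute(
                "INSERT INTO billable_items (item_id) VALUES (?)", (item_id,)
            )
        except sqlite3.IntegrityError:
            # The unique index still holds this ID, so the duplicate is caught.
            rejected.append(item_id)
            continue
        if delete_after_insert:
            # Simulates rapid deletion (e.g., for compliance or disk space).
            conn.execute(
                "DELETE FROM billable_items WHERE item_id = ?", (item_id,)
            )
    conn.close()
    return rejected


if __name__ == "__main__":
    incoming = ["A", "B", "A"]  # "A" arrives twice
    print(load_with_unique_index(incoming, delete_after_insert=False))
    print(load_with_unique_index(incoming, delete_after_insert=True))
```

In this sketch, the second arrival of "A" is rejected while the first copy remains in the table, but it is accepted without complaint once the first copy has been deleted, because the index no longer holds any trace of it.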
One area in which this problem is especially prevalent is in enterprise resource planning (ERP) systems. ERP systems allow for the integration of internal and external management information across an entire organization, including financial/accounting, manufacturing, sales and service, customer relationship management, and the like. The purpose of ERP is to facilitate the flow of information between business functions inside the organization and to manage connections to outside entities. Convergent Invoicing (CI) integrates ERP and Customer Relationship Management (CRM) systems so that organizations with complex billing processes can create, change, and cancel billable accounts for customers, as well as retrieve and view invoicing data for services rendered on demand. These systems often have very high transaction counts (e.g., 100 million transactions per day), making it difficult to prevent duplicate transactions from occurring. Yet the repercussions of duplicates can be serious: invoices with duplicate entries may be sent to customers, resulting in overcharging and/or a perception of a lack of quality control.