Insurance providers need to analyze large databases of insurance account data to develop tools and services that accompany insurance products. Many types of insurance account data are confidential (e.g., social security numbers) and cannot be shared easily with third parties. For example, an insurance company may ask a developer to create a program that presents an easy-to-view dashboard of important customer account information for use when a customer calls the insurance company. The insurance company may be unable to share the confidential insurance account data with the developer to assist in the program's development, yet the developer needs a test database in the same format as a real customer database to accurately troubleshoot issues that arise during development. Hence, there is a need to scrub a database of confidential insurance account information while preserving the format of the data.
Past systems that perform this task are slow at handling the volume of records a typical insurance company stores. Further, these past systems cannot guarantee that the scrubbing tool produces a unique output for each data field scrubbed. Many types of confidential insurance account data are unique to a particular customer, and duplicate results can cause unexpected problems when developing tools that interact with confidential insurance account data.
Accordingly, there is an opportunity to create a scrubbing algorithm that can quickly scrub a database of confidential information and ensure that the resulting database contains no duplicate entries.
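The two requirements stated above, format preservation and uniqueness, can be illustrated with a minimal sketch. It is not the scrubbing algorithm of this disclosure; it assumes, purely for illustration, that a field is scrubbed by passing its digits through an affine map modulo 10^n, which is a one-to-one mapping whenever the multiplier shares no factor with 10. The function name `scrub_digits` and the constants `multiplier` and `offset` are hypothetical, and the mapping is not cryptographically secure.

```python
from math import gcd
from string import digits as ASCII_DIGITS


def scrub_digits(value: str,
                 multiplier: int = 387_420_489,  # hypothetical constant (3**18), coprime with 10
                 offset: int = 123_456_789) -> str:  # hypothetical constant
    """Replace the digits of `value`, leaving every other character in place."""
    found = [c for c in value if c in ASCII_DIGITS]
    if not found:
        return value  # nothing to scrub

    modulus = 10 ** len(found)
    if gcd(multiplier, modulus) != 1:
        raise ValueError("multiplier must be coprime with 10 to keep outputs unique")

    # An affine map with a coprime multiplier is a bijection on 0..modulus-1,
    # so two different inputs of the same length never collide.
    scrambled = (multiplier * int("".join(found)) + offset) % modulus

    # Pad to the original digit count, then pour the new digits back into
    # the original layout (hyphens, spaces, and so on stay where they were).
    new_digits = iter(str(scrambled).zfill(len(found)))
    return "".join(next(new_digits) if c in ASCII_DIGITS else c for c in value)


if __name__ == "__main__":
    print(scrub_digits("123-45-6789"))
```

Because the mapping is a bijection only among values with the same number of digits, the uniqueness guarantee in this sketch applies per field format, for example across all nine-digit social security numbers.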