As more business transactions occur electronically, organizations are forced to retain a growing volume of sensitive data every year. The ease with which data can be collected automatically, stored in databases and queried efficiently over the Internet has paradoxically worsened the overall privacy situation, raising numerous ethical and legal concerns.
Problems arising from private data falling into malicious hands may include identity theft, online stalking and spam. Legislation such as HIPAA in the United States of America and PIPEDA in Canada has made it legally mandatory for service providers covered by such legislation to ensure the privacy and security of the data entrusted to them. Violations of the legislation may attract a heavy penalty. In addition, a security breach carries the looming danger of losing customers' trust.
Previously, data masking could be performed only by buying dedicated licenses and/or entire product suites. These licenses and product suites come with a large array of data masking options, not all of which may be needed for a customer's immediate requirement.
A worldwide movement towards data privacy legislation has increased pressure on organizations to improve their information privacy and security standards. Data privacy research indicates that more than 70% of all security incidents come from internal threats. Moreover, data breaches originating inside an organization are more than 50 times as costly as external breaches.
Thus, there is a need for technological solutions that achieve privacy while maintaining a tradeoff between data privacy and data utility. Techniques are required for publishing data while preserving the right balance between individual privacy and data utility. Some techniques for data privacy are anonymization, randomization, perturbation, privacy policy languages and data masking.
The process whereby information in a database is masked and/or ‘de-identified’ may be referred to as data masking. Data masking enables the creation of realistic data in non-production environments, which avoids the risk of exposing sensitive information to unauthorized users. Data masking protects the sensitive information from a multitude of threats posed both outside and inside the perimeter of an organization.
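As an illustration only, the following minimal Python sketch shows one possible form of data masking: each letter or digit in a sensitive field is replaced with a pseudo-random character of the same class, so that the masked value keeps the length and punctuation of the original. The function names `mask_value` and `mask_record`, the keyed-hash approach and the sample field names are assumptions made for this example and are not taken from any particular product. The sketch is one-way; it does not support recovering the original value.

```python
import hashlib
import hmac


def mask_value(value: str, secret: bytes) -> str:
    """Mask one string while keeping its length, punctuation and character classes.

    Hypothetical helper: a keyed hash (HMAC-SHA256) of the value supplies the
    pseudo-random bytes, so the same input and secret always give the same mask.
    """
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]  # reuse digest bytes cyclically for long values
        if ch.isascii() and ch.isdigit():
            masked.append(str(b % 10))  # digit stays a digit
        elif ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            masked.append(chr(base + b % 26))  # letter stays a letter, same case
        else:
            masked.append(ch)  # separators and other characters are kept as-is
    return "".join(masked)


def mask_record(record: dict, sensitive_fields: set, secret: bytes) -> dict:
    """Return a copy of the record with only the listed fields masked."""
    return {
        key: mask_value(str(val), secret) if key in sensitive_fields else val
        for key, val in record.items()
    }


if __name__ == "__main__":
    # Sample data and key are made up for the illustration.
    row = {"id": 17, "name": "Jane Doe", "ssn": "123-45-6789"}
    print(mask_record(row, {"name", "ssn"}, secret=b"demo-key"))
```

Because the masked value retains the format of the original, a copy of the data masked in this way can still exercise format checks in a non-production environment without exposing the real values.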
Most of the available software tools, solutions and systems that implement data masking techniques have some major drawbacks.
Standalone utilities are tightly integrated with other processes, so regular updates and/or bug fixes become cumbersome. Moreover, the existing products work only on specific formats, which restricts them to a particular type of data and a particular underlying environment. For vendors, a standalone data masking product also presents piracy concerns, deployment concerns such as the creation of installers, and integration concerns.
A standalone data masking product also presents versioning issues. A new version may not be compatible with the existing OS and hardware, and a new release needs to be tested for integration with all the services to which the standalone data masking product caters. A standalone data masking product further incurs manufacturing and distribution costs.
Existing products may be OS, platform or language dependent. They may also be non-reversible and may change the original look and feel of the data, which limits their usage to a particular environment.