Data masking is the practice of obscuring source data while maintaining some identifying characteristics of the source data. Data masking may be applied to source data prior to testing processes or applications which require “actual” source data, in order to alleviate potential privacy concerns. Conventional techniques for data masking include substitution, shuffling, numeric variance, and cloaking.
Substitution involves replacing words, numbers, or character strings with a value of the same type, randomly selected from a list of values of the same type. For example, the given name “John” may be replaced with another given name “Greg”. Shuffling replaces words, numbers, or character strings with a value randomly selected from a different record in the same data column. This is similar to the substitution technique, except that the replacement value is selected from the actual data source instead of from an external list of values.
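The two techniques described above can be illustrated with a minimal Python sketch. The names `GIVEN_NAMES`, `substitute`, and `shuffle_column` are hypothetical and chosen only for illustration, and the shuffle here is a simple permutation of the column, which is an assumption about how "a different record" might be selected.

```python
import random

# Hypothetical external list of replacement given names (an assumption).
GIVEN_NAMES = ["Greg", "Alice", "Maria", "Tom", "Priya"]


def substitute(values, replacements, rng):
    """Substitution: replace each value with one drawn at random
    from an external list of values of the same type."""
    return [rng.choice(replacements) for _ in values]


def shuffle_column(values, rng):
    """Shuffling: replace each value with a value taken from the same
    column. A plain permutation is used, so a value may occasionally
    remain in its original position."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


rng = random.Random(0)  # seeded only to make the sketch repeatable
column = ["John", "Susan", "Ahmed", "Li"]

masked_sub = substitute(column, GIVEN_NAMES, rng)
masked_shuf = shuffle_column(column, rng)
```

In this sketch the substituted column contains only values from the external list, whereas the shuffled column contains exactly the original values in a different arrangement.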
Numeric variance replaces numbers or dates with a value that is within a specified numeric difference or percentage difference of the original value. For example, a sales value “$6,150,239” may be replaced with “$6,148,705”, or the birthdate “Feb. 19, 1982” may be replaced with “Dec. 5, 1981”. Cloaking replaces a portion (or all) of a data value with a wildcard character. For example, the Social Security Number “481-56-1029” may be replaced with “***-**-1029”, or the credit card number “4081-2625-4900-7216” may be replaced with “XXXX-XXXX-XXXX-7216”.
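Numeric variance and cloaking can likewise be sketched in a few lines of Python. The function names, the percentage and day bounds, and the choice to keep the last four characters are all assumptions made for illustration, not details taken from any particular masking tool.

```python
import random
from datetime import date, timedelta


def vary_number(value, max_pct, rng):
    """Numeric variance: shift value by a random amount within
    +/- max_pct percent, rounded to the nearest integer."""
    factor = 1 + rng.uniform(-max_pct, max_pct) / 100
    return round(value * factor)


def vary_date(d, max_days, rng):
    """Numeric variance for dates: shift d by up to max_days either way."""
    return d + timedelta(days=rng.randint(-max_days, max_days))


def cloak(value, keep_last=4, mask_char="*"):
    """Cloaking: replace every letter or digit except the last
    keep_last with mask_char, leaving separators untouched."""
    total = sum(ch.isalnum() for ch in value)
    to_mask = max(total - keep_last, 0)
    out = []
    for ch in value:
        if ch.isalnum() and to_mask > 0:
            out.append(mask_char)
            to_mask -= 1
        else:
            out.append(ch)
    return "".join(out)


rng = random.Random(0)  # seeded only to make the sketch repeatable
masked_sales = vary_number(6150239, 1, rng)          # within 1 percent
masked_birth = vary_date(date(1982, 2, 19), 90, rng)  # within 90 days

print(cloak("481-56-1029"))                         # ***-**-1029
print(cloak("4081-2625-4900-7216", mask_char="X"))  # XXXX-XXXX-XXXX-7216
```

The varied values remain plausible because they stay close to the originals, while the cloaked values keep only their format and trailing digits.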
The above techniques are often unsuitable for masking name data. Numeric variance applies only to numeric data, and cloaking does not simulate actual data. Substitution and shuffling fail to maintain certain characteristics or relevancy that can be inherent in some name data. This failure breaks the links of data integrity, so processes or applications are unable to thoroughly test all scenarios on data masked in this way. If such masked data is used in a demonstration to a prospective customer, the masked data may appear odd or inappropriate, thereby discouraging the customer.