Some embodiments relate to anonymizing personally identifiable information. In particular, but not by way of limitation, some embodiments relate to systems and methods for substituting aliased information for personally identifiable information.
Personally identifiable information (“PII”) is used in many areas, including marketing and government analysis. In some instances it is desirable for PII to be anonymized. Currently, the available solutions suffer from significant shortfalls.
One known option is to simply not use the PII data. An evident shortfall of this option is that the data is not available for use. This option can have serious repercussions because in many cases, the information is not available in any other form. Despite the repercussions, however, this option is often taken to ensure protection of PII.
A second known option is to redact enough PII to ensure that a subject cannot be identified. Redaction involves removing significant portions of the information, which are then no longer available for analysis or use. While the remaining data can be analyzed, many other useful functions cannot be performed. For example, a user analyzing data that has redacted name information cannot identify potentially significant patterns because the name information is completely unavailable. If the same name would have appeared in eight different places, no way exists for the analyst to recognize that pattern. An additional shortfall of redaction is that the PII cannot be retrieved: once the information is redacted, it is irretrievable.
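The loss of patterns under redaction can be illustrated with a minimal sketch. The records, field names, and helper functions below are hypothetical and exist only for illustration; they are not taken from any particular system.

```python
from collections import Counter

# Hypothetical sample records; the names and cities are made up for illustration.
records = [
    {"name": "John Smith", "city": "Denver"},
    {"name": "Jane Doe", "city": "Boulder"},
    {"name": "John Smith", "city": "Aurora"},
]


def redact(record, field="name"):
    """Return a copy of the record with the given field removed (set to None)."""
    return {**record, field: None}


def repeated_values(rows, field):
    """Return the values of `field` that occur more than once, with their counts."""
    counts = Counter(row[field] for row in rows if row[field] is not None)
    return {value: n for value, n in counts.items() if n > 1}


# Before redaction, the repeated name is visible to the analyst.
before = repeated_values(records, "name")

# After redaction, no name information remains, so the pattern is lost.
redacted_records = [redact(r) for r in records]
after = repeated_values(redacted_records, "name")
```

In this sketch, `before` reports that one name occurs twice, while `after` is empty, and nothing in the redacted records allows the original names to be restored.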
A third known option is to encrypt the PII. With encrypted PII, analysis can be performed, patterns can be identified, and PII can be retrieved. An issue with identifying patterns is that encrypted data is unrecognizable to a human. For example, the name “John Smith” may be encrypted into “S6!FG09Q.” It is difficult for a human to recognize patterns when the patterns are seemingly random sequences of characters. Also, encryption can be broken, and it is particularly easy to recover short pieces of information. For instance, PII that is only 4 characters long (e.g., the last four digits of a telephone number) cannot be securely encrypted. With a typical hashing system, a 4-character value can be recovered relatively easily by trying every possible value, because so few candidate values exist.
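The weakness of short values can be shown with a minimal sketch. It assumes, purely for illustration, that the last four digits of a telephone number are protected with an unsalted SHA-256 hash. The function names and the sample value are hypothetical.

```python
import hashlib


def hash_value(value: str) -> str:
    """Hypothetical protection scheme (an assumption): unsalted SHA-256 of the raw value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def recover_four_digits(target_hash: str):
    """Try all 10,000 four-digit strings and return the one whose hash matches."""
    for n in range(10_000):
        candidate = f"{n:04d}"  # zero-padded, e.g. "0007"
        if hash_value(candidate) == target_hash:
            return candidate
    return None  # the protected value was not a four-digit string


# Made-up sample value standing in for the last four digits of a telephone number.
protected = hash_value("4821")
recovered = recover_four_digits(protected)
```

Because only 10,000 candidates exist, the search completes almost immediately on ordinary hardware. The difficulty comes from the small input space, not from the particular hash function chosen for the sketch.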
Although present devices are functional, they are not sufficiently accurate or otherwise satisfactory. Accordingly, a system and method are needed to address the shortfalls of the present technology and to provide other new and innovative features.