Databases containing sensitive information, whether on internet-accessible systems or on electronic media, have proliferated globally. Protection of sensitive data within a given domain has traditionally been managed by controlling access to the data. This approach fails once an attacker gains access to the internal system or once the data is moved outside the protected enclosure, for example, when a laptop or disk holding the data is stolen, as many widely publicized incidents have demonstrated. There have been numerous documented computer break-ins that compromised sensitive data such as credit card numbers, personal identification and social security numbers, financing and loan information, and medical information.
One way to protect this sensitive data is to encrypt it. However, the data, contained in databases or other persistent mechanisms, is served by processes that make assumptions about the format of various data items: for example, credit card numbers and social security numbers are strings of decimal digits in a certain format, dollar amounts fall in a certain range, and other items are alphabetic strings, dates, or zip codes. In addition, copies of the data can reside in multiple locations, and a given process may require that the data match across these locations before the process can be performed. Because it is not feasible to revise every existing process that uses the data, any data protection method, for example, an encryption method applied to data in a database or other persistent mechanism, must preserve the format of the data sufficiently that existing processes still function and that any validity and cross-system checks they perform can still be carried out and passed.
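The format-preservation requirement described above can be illustrated with a minimal sketch. It is not the method of this document and not a standardized scheme. It is a toy alternating Feistel construction over decimal digits, using HMAC-SHA256 as the round function, chosen only to show that a string of digits can be encrypted to a string of digits of the same length. The names `fpe_encrypt_digits`, `fpe_decrypt_digits`, `transform_formatted`, and the round count are assumptions introduced for illustration, and the sketch has not been analyzed for security.

```python
import hashlib
import hmac

ROUNDS = 8  # assumed value; must be even so the two halves end in their original order
DIGITS = "0123456789"


def _round_value(key: bytes, round_no: int, data: str, modulus: int) -> int:
    """Keyed pseudo-random value in [0, modulus) derived from one half of the state."""
    msg = bytes([round_no]) + data.encode("ascii")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def _check(digits: str) -> None:
    if len(digits) < 2 or any(c not in DIGITS for c in digits):
        raise ValueError("expected a string of at least two decimal digits")


def fpe_encrypt_digits(key: bytes, digits: str) -> str:
    """Map a digit string to another digit string of the same length."""
    _check(digits)
    mid = len(digits) // 2
    left, right = digits[:mid], digits[mid:]
    for r in range(ROUNDS):
        modulus = 10 ** len(left)
        mixed = (int(left) + _round_value(key, r, right, modulus)) % modulus
        # Swap halves; zero-pad so leading zeros (and the length) are kept.
        left, right = right, str(mixed).zfill(len(left))
    return left + right


def fpe_decrypt_digits(key: bytes, digits: str) -> str:
    """Invert fpe_encrypt_digits by running the rounds in reverse."""
    _check(digits)
    mid = len(digits) // 2
    left, right = digits[:mid], digits[mid:]
    for r in reversed(range(ROUNDS)):
        modulus = 10 ** len(right)
        original = (int(right) - _round_value(key, r, left, modulus)) % modulus
        left, right = str(original).zfill(len(right)), left
    return left + right


def transform_formatted(cipher, key: bytes, value: str) -> str:
    """Apply cipher to the digits of value, leaving separators in place."""
    digits = "".join(c for c in value if c in DIGITS)
    out = iter(cipher(key, digits))
    return "".join(next(out) if c in DIGITS else c for c in value)


if __name__ == "__main__":
    key = b"demo-key-not-for-production"
    ssn = "123-45-6789"
    protected = transform_formatted(fpe_encrypt_digits, key, ssn)
    print(protected)  # same layout: three digits, dash, two digits, dash, four digits
    print(transform_formatted(fpe_decrypt_digits, key, protected))
```

Because the mapping is deterministic for a given key, copies of the same value protected in different locations still match, which is what a cross-system check needs. The sketch does not maintain check digits or range constraints, so a validity check such as a card-number checksum would require additional handling.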