Nearly every organization acquires, processes, and stores highly sensitive records. These records include confidential information such as personally identifiable information (PII) and business secrets. Both legal requirements and consumer expectations call for encrypting sensitive data that is stored, for example, in relational databases. Organizations are expected to closely guard this private data and manage access appropriately.
Data security is typically achieved through encryption. Data encryption achieves confidentiality by translating information from its original form (plaintext) into an encoded, unintelligible form (ciphertext). An encryption scheme is a tuple of three algorithms: key generation (KeyGen), encryption, and decryption. A key is a piece of information that determines the functional output of a cryptographic algorithm. For encryption algorithms, the key specifies the transformation of plaintext into ciphertext, and for decryption algorithms, the transformation of ciphertext back into plaintext. KeyGen is a randomized algorithm that takes as input a security parameter and outputs an encryption key. The security parameter specifies the security strength of the generated key (e.g., as a bit-length of the key). The encryption algorithm takes as input a message (e.g., in plaintext) and a key, and outputs an encrypted message (e.g., ciphertext). The decryption algorithm takes as input the encrypted message and a key, and outputs the original message (e.g., in plaintext). In the case of symmetric encryption, the same cryptographic key (e.g., a symmetric key) is used for both the encryption of plaintext and the decryption of ciphertext.
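The (KeyGen, encryption, decryption) tuple described above can be sketched in a few lines of Python. This is only an illustration of the interface: the function names and the SHA-256 based keystream are assumptions made for this sketch, not a vetted cipher and not part of the original description. A real system would use a standard algorithm such as AES from a reviewed library.

```python
import hashlib
import secrets


def keygen(security_parameter_bits: int = 128) -> bytes:
    """Randomized KeyGen: the security parameter is taken as the key bit-length."""
    if security_parameter_bits <= 0 or security_parameter_bits % 8 != 0:
        raise ValueError("security parameter must be a positive multiple of 8")
    return secrets.token_bytes(security_parameter_bits // 8)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, message: bytes) -> bytes:
    """Takes a plaintext message and a key, returns nonce followed by ciphertext."""
    nonce = secrets.token_bytes(16)  # fresh randomness for every message
    stream = _keystream(key, nonce, len(message))
    return nonce + bytes(m ^ s for m, s in zip(message, stream))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Takes the encrypted message and the same key, returns the original message."""
    nonce, body = ciphertext[:16], ciphertext[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))
```

Because the scheme is symmetric, the key returned by `keygen` is passed unchanged to both `encrypt` and `decrypt`.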
However, significant work is required to add encryption features to systems (e.g., legacy systems) that were not designed to support them. In most cases, adding encryption to a legacy system presents technical challenges: database schemas need to be changed and back-end validation rules need to be reimplemented. In the case of a “Date” field, for example, the original “Date” value will be encrypted with a block cipher into a 128-bit randomized string, so the data type for the “Date” field will need to change from date to blob (e.g., a sequence of bits).
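The effect on validation rules can be illustrated with a small sketch. The validator below and the ISO date format are hypothetical, and the 16 random bytes merely stand in for the output of a block cipher; no actual encryption is performed here.

```python
import datetime
import secrets


def is_valid_date_field(value) -> bool:
    """Hypothetical legacy rule: the field must be an ISO-formatted date string."""
    if not isinstance(value, str):
        return False
    try:
        datetime.date.fromisoformat(value)
    except ValueError:
        return False
    return True


original_value = "2024-01-31"

# Stand-in for a 128-bit block cipher output (random bytes, not real ciphertext).
stand_in_ciphertext = secrets.token_bytes(16)

print(is_valid_date_field(original_value))       # the plaintext date passes
print(is_valid_date_field(stand_in_ciphertext))  # the 16-byte blob is rejected
```

The blob is no longer a date in any sense the legacy rule understands, which is why both the column type and the validation logic would have to change.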
The typical approach to encryption is to serialize plaintext data into an array of bytes, split the byte array into blocks of the size specified by the encryption algorithm (e.g., 8-byte blocks for DES and 3DES, 16-byte blocks for AES, etc.), and encrypt the blocks in a secure mode of operation (e.g., Cipher Block Chaining (CBC), Galois/Counter Mode (GCM), etc.). Block ciphers such as the Advanced Encryption Standard (AES) or the Data Encryption Standard (DES) operate on fixed-length groups of bits (e.g., 64-bit blocks for DES and 128-bit blocks for AES).
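The serialize-and-split step can be sketched as follows. The helper names are hypothetical, and the use of UTF-8 serialization and PKCS#7 padding is an assumption of this sketch: the text above does not name a padding scheme, and some modes (such as GCM) do not require one. The blocks produced here would then be passed to the cipher in the chosen mode of operation.

```python
from typing import List

# Block sizes in bytes, as given in the text above.
BLOCK_SIZES = {"DES": 8, "3DES": 8, "AES": 16}


def pkcs7_pad(data: bytes, block_size: int) -> bytes:
    """Pad to a whole number of blocks; always adds between 1 and block_size bytes."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Split an already padded byte array into fixed-length blocks."""
    if len(data) % block_size != 0:
        raise ValueError("data length must be a multiple of the block size")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def prepare_blocks(plaintext: str, cipher: str) -> List[bytes]:
    """Serialize the plaintext to bytes, pad it, and split it for the given cipher."""
    block_size = BLOCK_SIZES[cipher]
    serialized = plaintext.encode("utf-8")
    return split_into_blocks(pkcs7_pad(serialized, block_size), block_size)
```

For instance, `prepare_blocks("2024-01-31", "AES")` yields a single 16-byte block, while the same 10-byte value yields two 8-byte blocks for DES.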
For example, a block cipher encryption algorithm might take a 128-bit block of plaintext as input and, using a secret key, output a corresponding 128-bit block of ciphertext. Consequently, if an original field holds a short 1-byte message, the data type for the field will need to change to accommodate a 128-bit block. The resulting ciphertext is of a binary blob data type whose format cannot be related back to that of the plaintext. The database schema must therefore change and, in most cases, expand in size. This wastes storage space and also requires widespread changes to the legacy system to accommodate the new data type.
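The storage expansion described above is simple arithmetic, sketched below with hypothetical helper names. The sketch assumes the ciphertext occupies the smallest whole number of blocks that holds the plaintext; a padding scheme, an initialization vector, or an authentication tag would add further bytes and are not counted here.

```python
def padded_ciphertext_bytes(plaintext_bytes: int, block_bytes: int = 16) -> int:
    """Bytes needed to hold the plaintext in whole blocks (at least one block)."""
    if plaintext_bytes < 0 or block_bytes <= 0:
        raise ValueError("sizes must be non-negative, with a positive block size")
    blocks = max(1, -(-plaintext_bytes // block_bytes))  # ceiling division
    return blocks * block_bytes


def expansion_factor(plaintext_bytes: int, block_bytes: int = 16) -> float:
    """Ratio of stored ciphertext size to original plaintext size."""
    if plaintext_bytes < 1:
        raise ValueError("plaintext must be at least one byte")
    return padded_ciphertext_bytes(plaintext_bytes, block_bytes) / plaintext_bytes
```

Under these assumptions, the 1-byte field from the example occupies 16 bytes after encryption with a 128-bit block cipher, a sixteen-fold increase for that column.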
Throughout the drawings and the detailed description, unless otherwise described, the same drawing reference numerals will be understood to refer to the same elements, features, and structures. The relative size and depiction of these elements may be exaggerated or adjusted for clarity, illustration, and/or convenience.