The need for effective and efficient data encryption/decryption is widespread throughout today's world. Whether the data is maintained by a governmental agency and pertains to national security, or is maintained by a private company and pertains to the company's trade secrets and/or confidential information, the importance of effective and efficient encryption/decryption cannot be overstated.
Effective encryption/decryption is needed to preserve the integrity of the subject data. Efficient encryption/decryption is needed to prevent the act of encrypting/decrypting the subject data from becoming an overwhelming burden on the party that maintains the subject data. These needs exist in connection with both “data at rest” (e.g., data stored in nonvolatile memory) and “data in flight” (e.g., data in transit from one point to another such as packet data transmitted over the Internet).
A number of data encryption/decryption techniques are known in the art. Many of these encryption techniques utilize a block cipher (see, e.g., block cipher 100 in FIG. 1). A block cipher is a cryptographic mechanism that operates on fixed length blocks of plaintext and produces fixed length blocks of ciphertext (see, e.g., blocks 108, 110 and 112 in FIG. 1). Plaintext refers to data needing encryption and ciphertext refers to data that has been encrypted. A block cipher encrypts each plaintext block using a key as per well-known key-based encryption algorithms (see, e.g., key 114 in FIG. 1). The key is typically (but need not be) the same size as the plaintext block. Using different keys to encrypt the same block of plaintext typically (but need not) produces different blocks of ciphertext. Block ciphers 100 can operate on data blocks of varying sizes, with typical data block sizes ranging between 64 bits and 512 bits. For example, the Advanced Encryption Standard (AES) block cipher operates on blocks of 128 bits (16 bytes). Encrypting large segments of plaintext requires a mode of encryption operation that defines the flow of a sequence of plaintext data blocks through one or more block ciphers. Likewise, decrypting large segments of ciphertext requires a mode of decryption operation that defines the flow of a sequence of ciphertext data blocks through one or more block ciphers.
As an example of one such known mode of encryption/decryption, the electronic codebook (ECB) mode of encryption/decryption is commonly used due to its simplicity and high data throughput. Examples of the ECB mode of encryption/decryption are shown in FIG. 1. With the ECB mode, a data segment needing encryption is divided into a plurality of data blocks, each data block comprising a plurality of data bits (see data blocks 102, 104 and 106 in FIG. 1). Each block cipher 100 then encrypts each data block independently using key 114. At time t=t0, plaintext data block 102 is encrypted by the block cipher 100 using key 114 to produce ciphertext data block 108. Subsequently, at time t=t1, plaintext data block 104 is encrypted by the block cipher 100 using key 114 to produce ciphertext data block 110. Then, at time t=t2, plaintext data block 106 is encrypted by the block cipher 100 using key 114 to produce ciphertext data block 112. To later decrypt the ciphertext data blocks 108, 110 and 112, these steps can then be repeated to reconstruct the original plaintext data blocks 102, 104, and 106. It is worth noting that the same block cipher 100 can be used to both encrypt and decrypt data using a key.
With ECB, the lack of sequential blockwise dependency in the encryption/decryption (i.e., feedback loops where the encryption of a given plaintext block depends on the result of encryption of a previous plaintext data block) allows implementations of the ECB mode to achieve high data throughput via pipelining and parallel processing techniques. While ECB exhibits these favorable performance characteristics, the security of ECB's encryption is susceptible to penetration because of the propagation of inter-segment and intra-segment uniformity in the plaintext to the ciphertext blocks.
For example, a 256-bit segment of plaintext containing all zeros that is to be encrypted with a 64-bit block cipher using ECB will be broken down into four 64-bit blocks of plaintext, each 64-bit plaintext block containing all zeros. When operating on these plaintext blocks, ECB will produce a segment of ciphertext containing four identical blocks. This is an example of intra-segment uniformity. Furthermore, if another such 256-bit all-zero segment is encrypted by ECB using the same key, then both of the resulting ciphertext segments will be identical. This is an example of inter-segment uniformity. In instances where intra-segment and/or inter-segment uniformity is propagated through to ciphertext, the security of the ciphertext can be compromised because the ciphertext will still preserve some aspects of the plaintext's structure. This can be a particularly acute problem for applications such as image encryption.
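The uniformity described above can be illustrated with a short Python sketch. Because the Python standard library does not include a block cipher, the function toy_block_encrypt below is a hypothetical 64-bit Feistel construction built on SHA-256 that merely stands in for block cipher 100. It is for illustration only and is not a secure cipher. The key value and all function names are assumptions made for this example.

```python
import hashlib

BLOCK_BYTES = 8          # 64-bit blocks, matching the example in the text
HALF = BLOCK_BYTES // 2
ROUNDS = 4

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def round_fn(key: bytes, rnd: int, half: bytes) -> bytes:
    # Hypothetical round function; any keyed function works in a Feistel network.
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:HALF]

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Toy stand-in for block cipher 100 (NOT secure).
    left, right = block[:HALF], block[HALF:]
    for rnd in range(ROUNDS):
        left, right = right, xor_bytes(left, round_fn(key, rnd, right))
    return left + right

def ecb_encrypt(key: bytes, segment: bytes) -> list:
    # ECB: split the segment into blocks and encrypt each one independently.
    blocks = [segment[i:i + BLOCK_BYTES] for i in range(0, len(segment), BLOCK_BYTES)]
    return [toy_block_encrypt(key, block) for block in blocks]

key = b"example key"
zero_segment = bytes(32)                 # 256 bits of zeros

ct_first = ecb_encrypt(key, zero_segment)
ct_second = ecb_encrypt(key, zero_segment)

intra_segment_uniform = len(set(ct_first)) == 1   # four identical ciphertext blocks
inter_segment_uniform = ct_first == ct_second     # identical ciphertext segments
print(len(ct_first), intra_segment_uniform, inter_segment_uniform)
```

Both flags come out true for any deterministic block cipher used in ECB mode, which is why the structure of the plaintext remains visible in the ciphertext.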
To address intra-segment and inter-segment uniformity issues, there are two commonly-used approaches. One approach is known as cipher block chaining (CBC). An example of the CBC mode of encryption/decryption is shown in FIG. 2. The CBC mode combines the most recent ciphertext output from the block cipher with the next input block of plaintext. The first plaintext block to be encrypted is combined with an initialization vector that is a bit string whose bits have random values, thereby providing the CBC mode with inter-segment randomness.
As shown in FIG. 2, at time t=t0, the first plaintext data block 102 is combined with a random initialization vector (IV) 200 using a reversible combinatorial operation 210, to thereby create a block-vector combination. This block-vector combination is then encrypted by block cipher 100 using key 114 to thereby generate ciphertext block 202. Next, at time t=t1, the ciphertext block 202 is fed back to be combined with the second plaintext block 104 via the reversible combinatorial operation 210. The resultant block-vector combination is key encrypted by block cipher 100 to produce ciphertext block 204, which is in turn fed back for combination with the next plaintext block at time t=t2 to eventually produce ciphertext block 206. Thus, as can be seen, when the CBC mode is used to encrypt a data segment comprising a plurality of data blocks, the bit vectors that are used for the reversible combinatorial operations with the plaintext data blocks that follow the first plaintext data block are bit vectors that are dependent upon the encryption operation(s) performed on each previously encrypted plaintext data block.
Preferably, the reversible combinatorial operation 210 is an XOR operation performed between the bits of the vector 200 and the block 102. The truth table for an XOR operation between bits X and Y to produce output Z is as follows:
X  Y  Z
0  0  0
0  1  1
1  0  1
1  1  0
As is well known, the XOR operation is reversible in that either of the inputs X or Y can be reconstructed by performing an XOR operation between Z and the other of the inputs X or Y. That is, if one XORs X with Y, the result will be Z. If one thereafter XORs Z with Y, then X will be reconstructed. Similarly, if one thereafter XORs Z with X, then Y will be reconstructed.
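The reversibility of the XOR operation can be checked with a few lines of Python. This is a minimal sketch. The helper name xor_bytes and the sample values are chosen only for illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(p ^ q for p, q in zip(a, b))

# Single-bit truth table as rows of (X, Y, Z).
truth_table = [(x, y, x ^ y) for x in (0, 1) for y in (0, 1)]

# The same property holds bit by bit across a whole block.
x = bytes.fromhex("0123456789abcdef")
y = bytes.fromhex("fedcba9876543210")
z = xor_bytes(x, y)

recovered_x = xor_bytes(z, y)   # XORing Z with Y reconstructs X
recovered_y = xor_bytes(z, x)   # XORing Z with X reconstructs Y
print(truth_table, z.hex(), recovered_x == x, recovered_y == y)
```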
Thus, on the decryption side, the CBC mode operates to decrypt ciphertext block 202 with the block cipher 100 using key 114 to thereby reconstruct the XOR combination of plaintext data block 102 and the initialization vector 200. Thereafter, this reconstructed combination can be XORed with the initialization vector 200 to reconstruct plaintext block 102. Next, at time t=t1, the process is repeated for the next ciphertext block 204, although this time the XOR operation will be performed using ciphertext block 202 (rather than initialization vector 200) to reconstruct plaintext data block 104. Ciphertext block 202 is used in this XOR operation because it was ciphertext block 202 that was used in the XOR operation when plaintext block 104 was encrypted. Then, once again this process is repeated at time t=t2, albeit with ciphertext block 204 being used for the XOR combination operation with the output from block cipher 100.
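The CBC encryption and decryption flows described above can be sketched in Python as follows. The functions toy_encrypt and toy_decrypt are a hypothetical 64-bit Feistel construction built on SHA-256 that stands in for block cipher 100. They are not a secure cipher, and the key, the fixed initialization vector, and all names are assumptions made for this example (a real initialization vector would be random).

```python
import hashlib

BLOCK_BYTES = 8
HALF = BLOCK_BYTES // 2
ROUNDS = 4

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def round_fn(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:HALF]

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    # Toy stand-in for block cipher 100 (NOT secure).
    left, right = block[:HALF], block[HALF:]
    for rnd in range(ROUNDS):
        left, right = right, xor_bytes(left, round_fn(key, rnd, right))
    return left + right

def toy_decrypt(key: bytes, block: bytes) -> bytes:
    # Runs the Feistel rounds in reverse order to invert toy_encrypt.
    left, right = block[:HALF], block[HALF:]
    for rnd in reversed(range(ROUNDS)):
        left, right = xor_bytes(right, round_fn(key, rnd, left)), left
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, plaintext_blocks: list) -> list:
    ciphertext_blocks, feedback = [], iv
    for block in plaintext_blocks:
        # Each block is XORed with the previous ciphertext block (or the IV).
        feedback = toy_encrypt(key, xor_bytes(block, feedback))
        ciphertext_blocks.append(feedback)
    return ciphertext_blocks

def cbc_decrypt(key: bytes, iv: bytes, ciphertext_blocks: list) -> list:
    plaintext_blocks, previous = [], iv
    for block in ciphertext_blocks:
        plaintext_blocks.append(xor_bytes(toy_decrypt(key, block), previous))
        previous = block
    return plaintext_blocks

key = b"example key"
iv = bytes.fromhex("0011223344556677")      # fixed here only for repeatability
plaintext = [bytes(BLOCK_BYTES)] * 4        # four identical all-zero blocks

ciphertext = cbc_encrypt(key, iv, plaintext)
round_trip = cbc_decrypt(key, iv, ciphertext)
print(len(set(ciphertext)), round_trip == plaintext)
```

Note that the loop in cbc_encrypt cannot start work on a block until the previous call to toy_encrypt has returned, which is the sequential dependency that the feedback introduces.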
While the use of feedback by the CBC mode addresses the issue of inter-segment and intra-segment uniformity, such feedback imposes a sequential processing flow on the encryption that significantly limits the achievable throughput of the encryption engine. As such, the CBC mode cannot make ready use of pipelining because one of the inputs for the reversible combinatorial operation stage 210 of the encryption for a given data block depends upon the output of the block cipher stage 100 of the encryption performed on the previous data block. That is, because of the feedback, the reversible combinatorial operation stage in a CBC encryption engine must wait for the block cipher to complete its encryption of a given data block-bit vector combination before it can begin to process the next data block.
Furthermore, on the decryption side, the CBC mode's dependence on the sequential order of data block encryption can raise problems when one wants to retrieve only a portion of the encrypted data segment. For example, for a data segment that comprises data blocks DB1 through DB20, when that data segment is encrypted and stored for subsequent retrieval in its encrypted form, an instance may arise where there is a need to retrieve data blocks DB6 through DB10, wherein the other data blocks of the data segment are not needed. However, to be able to successfully decrypt data blocks DB6 through DB10, the retrieval operation and decryption operation will nevertheless need to operate on data blocks DB1 through DB5 so that decryption can be performed for data blocks DB6 through DB10.
Furthermore, when used for disk encryption, the CBC mode may be vulnerable to a “watermark attack” if the initialization vector 200 is not kept secret (such as may be the case when the initialization vector is derived from a quantity such as a disk volume number). With such an attack, an adversary can determine from the output ciphertext whether or not a specially crafted file is stored. While there are solutions to such an attack (such as using hashing to derive the initialization vector from the data blocks in the sector), these solutions add to the computational complexity of the encryption operation and thus further degrade the throughput and/or increase the computational resources required for the encryption.
A second approach is known as the Segmented Integer Counter (SIC) mode, or more succinctly the counter (CTR) mode. FIG. 3 depicts an example of the SIC/CTR mode of encryption/decryption. The SIC/CTR mode key encrypts a block comprising a combination of a random value (or nonce) and a counter value. This random value-counter combination can be achieved in any of a variety of ways (e.g., concatenation, XOR, etc.). The counter values may be any sequence of values that do not repeat over a long duration, but a simple incremental counter is believed to be the most commonly-used approach. The output of the block cipher 100 is then combined with the plaintext block using a reversible combinatorial operation 210 (e.g., XOR), with the output of the operation 210 being the ciphertext block. The SIC/CTR mode belongs to the general class of encryption modes known as stream ciphers.
As shown in FIG. 3, at time t=t0, the random value 300 is combined with a counter value 308 in some manner to create a random value-counter combination block 302. This block 302 is then encrypted by block cipher 100 using key 114, and the output therefrom is then XORed with plaintext block 102 to generate ciphertext block 322. Next, at time t=t1, the random value 300 is combined with a next counter value 310 in some manner to create the random value-counter combination block 304. This block 304 is then encrypted by block cipher 100 using key 114, and the output therefrom is then XORed with plaintext block 104 to generate ciphertext block 324. Finally, at time t=t2, the random value 300 is combined with a next counter value 312 in some manner to create the random value-counter combination block 306. This block 306 is then encrypted by block cipher 100 using key 114, and the output therefrom is then XORed with plaintext block 106 to generate ciphertext block 326.
On the decryption side, the same keystream is regenerated: the combination blocks 302, 304 and 306 are again encrypted by block cipher 100 using key 114, with the respective outputs therefrom being XORed with the ciphertext blocks 322, 324 and 326 respectively to reconstruct plaintext blocks 102, 104 and 106.
The SIC/CTR mode of encryption/decryption also suffers from a security issue if data segments are always encrypted with the same random value 300. If an adversary is able to gather several versions of the encrypted data segment, it would be possible to derive information about the plaintext because the ciphertext (C) is simply the XOR of the plaintext (P) and a variable (V) that is based on the random number, i.e., C=P⊕V. For two versions encrypted with the same V, the variable cancels out, and thus C⊕C′=P⊕P′.
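The relation C⊕C′=P⊕P′ can be demonstrated with a small Python sketch of the SIC/CTR mode. The function toy_encrypt is a hypothetical 64-bit Feistel construction built on SHA-256 that stands in for block cipher 100 and is not a secure cipher. Combining the random value and counter by concatenation is only one of the options mentioned above, and the key, nonce, and messages are assumptions made for this example.

```python
import hashlib

BLOCK_BYTES = 8
HALF = BLOCK_BYTES // 2
ROUNDS = 4

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def round_fn(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:HALF]

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    # Toy stand-in for block cipher 100 (NOT secure).
    left, right = block[:HALF], block[HALF:]
    for rnd in range(ROUNDS):
        left, right = right, xor_bytes(left, round_fn(key, rnd, right))
    return left + right

def ctr_process(key: bytes, nonce: bytes, blocks: list) -> list:
    # The same routine encrypts and decrypts: each data block is XORed with
    # the encryption of (nonce || counter).
    output = []
    for counter, block in enumerate(blocks):
        keystream = toy_encrypt(key, nonce + counter.to_bytes(4, "big"))
        output.append(xor_bytes(block, keystream))
    return output

key = b"example key"
nonce = bytes.fromhex("a1b2c3d4")            # reused for both segments on purpose

p_first = [b"ATTACK!!", b"AT DAWN."]
p_second = [b"RETREAT!", b"AT DUSK."]

c_first = ctr_process(key, nonce, p_first)
c_second = ctr_process(key, nonce, p_second)

# With the same nonce and key, the keystream cancels out of C xor C'.
ciphertext_xor = [xor_bytes(a, b) for a, b in zip(c_first, c_second)]
plaintext_xor = [xor_bytes(a, b) for a, b in zip(p_first, p_second)]
print(ciphertext_xor == plaintext_xor, ctr_process(key, nonce, c_first) == p_first)
```

An adversary holding only the two ciphertext segments can therefore compute the XOR of the two plaintexts without knowing the key.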
Therefore, the inventors herein believe that a need exists in the art for a robust encryption/decryption technique that is capable of reducing both inter-segment and intra-segment uniformity while still retaining high throughput and exhibiting blockwise independence. As used herein, an encryption operation for a data segment is said to be “blockwise independent” when the encryption operations for each data block of that data segment do not rely on the encryption operation for any of the other data blocks in that data segment. Likewise, a decryption operation for a data segment is said to be “blockwise independent” when the decryption operations for each encrypted data block of that data segment do not rely on the decryption operation for any of the other data blocks in that data segment.
Toward this end, in one embodiment, the inventors herein disclose a technique for encryption wherein prior to key encryption, the plaintext data block is combined with a blockwise independent bit vector using a reversible combinatorial operation to thereby create a plaintext block-vector combination. This plaintext block-vector combination is then key encrypted to generate a ciphertext block. This process is repeated for all data blocks of a data segment needing encryption. For decryption of the ciphertext blocks produced by such encryption, the inventors herein further disclose an embodiment wherein each ciphertext data block is key decrypted to reconstruct each plaintext block-vector combination. These reconstructed plaintext block-vector combinations can then be combined (using the reversible combinatorial operation) with the corresponding blockwise independent bit vectors that were used for encryption to thereby reconstruct the plaintext blocks.
As an improvement relative to the CBC mode of encryption/decryption, each bit vector is blockwise independent. A bit vector is said to be blockwise independent when the value of that bit vector does not depend on any results of an encryption/decryption operation that was performed on a different data block of the data segment. Because of this blockwise independence, this embodiment is amenable to implementations that take advantage of the power of pipelined processing and/or parallel processing.
Moreover, because of the blockwise independent nature of the encryption performed by the present invention, a subset of the encrypted data segment can be decrypted without requiring decryption of the entire data segment (or at least without requiring decryption of the encrypted data blocks of the data segment that were encrypted prior to the encrypted data blocks within the subset). Thus, for a data segment that comprises data blocks DB1 through DB20, when that data segment is encrypted and stored for subsequent retrieval in its encrypted form using the present invention, a need may arise to retrieve plaintext versions of encrypted data blocks DB6 through DB10 and DB15, wherein the other data blocks of the data segment are not needed in their plaintext forms. A preferred embodiment of the present invention supports successful decryption of a subset of data blocks within the encrypted data segment (e.g., data blocks DB6 through DB10 and DB15) without requiring the decryption of the data segment's data blocks that are not members of the subset (e.g., data blocks DB1 through DB5, data blocks DB11 through DB14 and data blocks DB16 through DB20). Accordingly, the present invention supports the decryption of any arbitrary subset of the encrypted data blocks of a data segment without requiring decryption of any data blocks that are non-members of the arbitrary subset even if those non-member data blocks were encrypted prior to the encryption of the data blocks within the arbitrary subset.
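The combine-then-encrypt flow and the subset decryption described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions and not the disclosed implementation: toy_encrypt and toy_decrypt are a hypothetical 64-bit Feistel construction built on SHA-256 standing in for the block cipher (not secure), and bit_vector derives each vector by hashing a data tag together with the block position, which is a stand-in chosen for this example rather than one of the bit vector generators disclosed herein. The key and tag values are likewise assumptions.

```python
import hashlib

BLOCK_BYTES = 8
HALF = BLOCK_BYTES // 2
ROUNDS = 4

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def round_fn(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:HALF]

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    # Toy stand-in for the block cipher (NOT secure).
    left, right = block[:HALF], block[HALF:]
    for rnd in range(ROUNDS):
        left, right = right, xor_bytes(left, round_fn(key, rnd, right))
    return left + right

def toy_decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for rnd in reversed(range(ROUNDS)):
        left, right = xor_bytes(right, round_fn(key, rnd, left)), left
    return left + right

def bit_vector(data_tag: int, index: int) -> bytes:
    # Hypothetical generator: depends only on the tag and the block position,
    # never on the result of encrypting any other block.
    material = data_tag.to_bytes(8, "big") + index.to_bytes(8, "big")
    return hashlib.sha256(material).digest()[:BLOCK_BYTES]

def encrypt_block(key: bytes, data_tag: int, index: int, plaintext: bytes) -> bytes:
    return toy_encrypt(key, xor_bytes(plaintext, bit_vector(data_tag, index)))

def decrypt_block(key: bytes, data_tag: int, index: int, ciphertext: bytes) -> bytes:
    return xor_bytes(toy_decrypt(key, ciphertext), bit_vector(data_tag, index))

key = b"example key"
data_tag = 2048                               # e.g., an LBA; value is arbitrary
segment = [bytes(BLOCK_BYTES)] * 20           # DB1..DB20, all zeros

encrypted = [encrypt_block(key, data_tag, i, blk) for i, blk in enumerate(segment)]

# Decrypt only DB6..DB10 and DB15 (zero-based indices 5..9 and 14).
wanted = [5, 6, 7, 8, 9, 14]
recovered = {i: decrypt_block(key, data_tag, i, encrypted[i]) for i in wanted}
print(len(set(encrypted)), all(recovered[i] == segment[i] for i in wanted))
```

Because each call to encrypt_block or decrypt_block uses only its own block, the tag, and the block position, the calls can be issued in any order or in parallel, and the all-zero blocks no longer map to identical ciphertext blocks.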
Similarly, even if an entire encrypted data segment is to be decrypted, the present invention supports the decryption of the encrypted data blocks in a block order independent manner. Further still, the present invention supports the encryption of data blocks in a block order independent manner as well as supports limiting the encryption to only a defined subset of a data segment's data blocks (wherein such a subset can be any arbitrary subset of the data segment's data blocks).
Furthermore, as an improvement relative to the SIC/CTR mode of encryption/decryption, a greater degree of security is provided by this embodiment because the data that is subjected to key encryption includes the plaintext data (whereas the SIC/CTR mode does not subject the plaintext data to key encryption and instead subjects only its randomized bit vector to key encryption).
Preferably, the blockwise independent bit vector is a blockwise independent randomized (BIR) bit vector. As is understood by those having ordinary skill in the art, randomization in this context refers to reproducible randomization in that the same randomized bit vectors can be reproduced by a bit vector sequence generator given the same inputs. Further still, the blockwise independent randomized bit vector is preferably generated from a data tag that is associated with the data segment needing encryption/decryption. Preferably, this data tag uniquely identifies the data segment. In a disk encryption/decryption embodiment, this data tag is preferably the logical block address (LBA) for the data segment. However, it should be noted that virtually any unique identifier that can be associated with a data segment can be used as the data tag for that data segment. It should also be noted that rather than using a single data tag associated with the data segment, it is also possible to use a plurality of data tags that are associated with the data segment, wherein each data tag uniquely identifies a different one of the data segment's constituent data blocks.
A bit vector generation operation preferably operates on a data tag to generate a sequence of blockwise independent bit vectors, each blockwise independent bit vector for reversible combination with a corresponding data block. Disclosed herein are a plurality of embodiments for such a bit vector generation operation. As examples, bit vectors can be derived from the pseudo-random outputs of a pseudo-random number generator that has been seeded with the data tag, including derivations that employ some form of feedback to enhance the randomness of the bit vectors. Also, linear feedback shift registers and adders can be employed to derive the bit vectors from the data tag in a blockwise independent manner.
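As a simple illustration of reproducible randomization, the following Python sketch seeds a pseudo-random number generator with a data tag and draws one bit vector per data block. It uses the Mersenne Twister generator from Python's standard random module purely for convenience. That generator is not cryptographically secure, and the function name, the 64-bit block size, and the tag values are assumptions made for this example.

```python
import random

def bit_vectors_from_tag(data_tag: int, count: int, block_bits: int = 64) -> list:
    # Seed a PRNG with the data tag, then take one pseudo-random value per block.
    # The vectors depend only on the tag and the block position, not on any
    # encryption or decryption result.
    prng = random.Random(data_tag)
    return [
        prng.getrandbits(block_bits).to_bytes(block_bits // 8, "big")
        for _ in range(count)
    ]

vectors_for_encryption = bit_vectors_from_tag(2048, 4)
vectors_for_decryption = bit_vectors_from_tag(2048, 4)   # same tag, same sequence
vectors_other_segment = bit_vectors_from_tag(2049, 4)    # different tag

print(vectors_for_encryption == vectors_for_decryption)
print(vectors_for_encryption != vectors_other_segment)
```

The decryption side can thus regenerate the vectors from the data tag alone, so the vectors themselves never need to be stored alongside the ciphertext.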
The inventors also disclose a symmetrical embodiment of the invention wherein the same sequence of operations is performed on data in both encryption and decryption modes.
One exemplary application for the present invention is to secure data at rest in non-volatile storage, including the storage of data placed on tape, magnetic and optical disks, and redundant array of independent disks (RAID) systems. However, it should be noted that the present invention can also be applied to data in flight such as network data traffic.
These and other features and advantages of the present invention will be apparent to those having ordinary skill in the art upon review of the following description and figures.