Encryption techniques are used to provide confidentiality of sensitive data that are to be transmitted over insecure communication channels or stored in insecure computer systems or publicly accessible databases. An encryption algorithm reversibly transforms plaintext into ciphertext, which can be transformed back into its original form by a decryption algorithm only by authorized entities in possession of the corresponding cryptographic key, which must be kept secret. A symmetric-key encryption algorithm, such as a block or stream cipher, uses the same key for encryption and decryption.
As is known, block ciphers operate on fixed-size blocks of data symbols on a block-by-block basis, whereas stream ciphers operate on variable-length sequences of data symbols on a symbol-by-symbol basis. According to prior art techniques, block ciphers can be used in the so-called Electronic Code Book (ECB) encryption mode, in which the plaintext data are encrypted repeatedly with the same key.
Conventional stream ciphers do not provide satisfactory security if they are used in the ECB mode and hence require a new Initialization Vector (IV) for each new encryption with the same key. The secret key and the IV are combined by an initialization algorithm prior to encryption. Such IVs need to be transmitted or stored together with the encrypted data, but do not have to be kept secret.
In common data processing systems, to avoid compatibility problems related to various applications, it is necessary to preserve the data format when all or selected data are encrypted. For the purposes of the present invention, for data expressed as a sequence of data symbols, the data format is defined in terms of the sequence length and the (finite) alphabets to which the individual symbols belong. As an example, for alphanumeric data, a particular symbol, depending on its position in the data sequence, may be numeric, may be a letter of the alphabet, or may be of either kind. Preserving the data format thus means that the output and input data sequences have the same number of symbols and that, for each position in the data sequence, the ranges of values (i.e., the alphabets) of the output and input symbols are the same, as specified.
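The definition above can be illustrated by a minimal sketch, under the assumption that a format is given as a list of per-position alphabets. The names `matches_format` and `EXAMPLE_FORMAT`, and the example format itself (two uppercase letters followed by four digits), are hypothetical and serve only as an illustration.

```python
import string


def matches_format(sequence, alphabets):
    """Return True if the sequence has the required length and every
    symbol belongs to the alphabet specified for its position."""
    if len(sequence) != len(alphabets):
        return False
    return all(symbol in alphabet for symbol, alphabet in zip(sequence, alphabets))


# Hypothetical format: two uppercase letters followed by four digits.
EXAMPLE_FORMAT = [set(string.ascii_uppercase)] * 2 + [set(string.digits)] * 4
```

Under this sketch, an encryption preserves the data format if both the plaintext and the resulting ciphertext pass `matches_format` for the same list of alphabets.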
In some data processing systems, the range of symbol values for a given position in the data sequence may depend not only on the position, but also on the values of other, typically surrounding, symbols. For example, in the JPEG 2000 image coding standard, a byte may not assume a value in the range from 90 to FF (in hexadecimal notation) if the preceding byte is FF, and may not assume the value FF if it is the last byte in a sequence. In such systems, it is important to ensure that the output data sequence obtained after applying the encryption algorithm to all or only selected data is compliant with the same syntax rules as the input data sequence, provided that the syntax rules can be algorithmically verified. In particular, the syntax rules may relate to the data format depending on the symbol position and on the values of other symbols in the data sequence as well.
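As an illustration of a syntax rule that can be algorithmically verified, the byte constraint quoted above may be checked as in the following sketch. The function name `is_compliant` is hypothetical, and the check covers only the rule as stated in this paragraph, not the full JPEG 2000 codestream syntax.

```python
def is_compliant(data: bytes) -> bool:
    """Check the rule stated above: no byte in 0x90..0xFF may follow a
    0xFF byte, and the last byte of the sequence may not be 0xFF."""
    if data and data[-1] == 0xFF:
        return False
    return all(
        not (previous == 0xFF and current >= 0x90)
        for previous, current in zip(data, data[1:])
    )
```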
With further reference to the ECB mode, the basic security requirement for this mode of operation of a symmetric-key encryption algorithm is that, without knowledge of the secret key, it should be computationally infeasible to compute the encryption and decryption functions from any number of plaintext-ciphertext pairs available in the known plaintext-ciphertext scenario. In particular, it should be computationally infeasible to reconstruct the secret key from any number of known plaintext-ciphertext pairs generated with the same secret key. In the related-key scenario, known plaintext-ciphertext pairs generated with keys related to the given secret key are also available to the attacker. In order to satisfy the basic security requirement, each ciphertext symbol should depend on all plaintext symbols and all secret key symbols in a sufficiently complicated way that is not vulnerable to algebraic and/or probabilistic cryptanalytic attacks. Nevertheless, provided that the secret key is known, the encryption and decryption functions should admit a relatively simple representation that is suitable for software and/or hardware implementation.
Document US-A-2008-0170693 describes an encryption method aimed at preserving the data format, which consists in using the well-known Feistel construction with at least three rounds and a round function based on conventional hash functions or block ciphers. For each symbol, the data format is controlled by using a combining operation in the round function that is based on modular arithmetic, where the modulus determines the alphabet size to be achieved for that symbol.
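The general idea of a Feistel construction with modular combining can be sketched as follows. This is a toy illustration, not the method of the cited document: the function names, the use of HMAC-SHA256 as the round function, the alternating treatment of the two halves, and the default of four rounds are all assumptions. Symbols are assumed to be integers smaller than their modulus, with moduli not exceeding 256, and the slight bias caused by reducing a hash value modulo the alphabet size is ignored.

```python
import hashlib
import hmac


def _round_values(key, round_index, source_symbols, target_moduli):
    """Derive one value per target symbol from the other half (toy round function)."""
    values = []
    for position, modulus in enumerate(target_moduli):
        message = bytes([round_index, position]) + bytes(source_symbols)
        digest = hmac.new(key, message, hashlib.sha256).digest()
        values.append(int.from_bytes(digest, "big") % modulus)
    return values


def feistel(key, symbols, moduli, rounds=4, decrypt=False):
    """Encrypt or decrypt a symbol sequence; symbol i stays in range(moduli[i])."""
    n = len(symbols)
    half = n // 2
    state = list(symbols)
    order = range(rounds - 1, -1, -1) if decrypt else range(rounds)
    sign = -1 if decrypt else 1
    for round_index in order:
        # Even rounds update the left half from the right half, odd rounds the reverse.
        if round_index % 2 == 0:
            target, source = range(0, half), range(half, n)
        else:
            target, source = range(half, n), range(0, half)
        values = _round_values(
            key,
            round_index,
            [state[j] for j in source],
            [moduli[j] for j in target],
        )
        for j, value in zip(target, values):
            # Modular addition (or subtraction) keeps the symbol in its alphabet.
            state[j] = (state[j] + sign * value) % moduli[j]
    return state
```

Because each round changes one half using a value computed only from the other half, decryption can recompute the same values in reverse order and subtract them. For instance, with `moduli = [26, 26, 10, 10, 10, 10]` the first two output symbols remain letter indices and the last four remain digits.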
Document US-A-2006/0227965 discloses an encryption method consisting of dividing the plaintext sequence into parts and repeatedly encrypting each part, one at a time, until the intermediate data sequence composed of all the current parts, unencrypted or encrypted, satisfies the specified syntax rules. Accordingly, initially, the intermediate data sequence coincides with the plaintext sequence and at the end, when all the parts get encrypted, it becomes the ciphertext sequence. The decryption is performed in the opposite direction, by decrypting the parts in the reverse order, each time repeating the decryption of each part until the intermediate data sequence composed of all the current parts, undecrypted or decrypted, satisfies the specified syntax rules. The encryption functions used for encrypting the individual parts may be arbitrary as long as their inputs and outputs are compliant with the lengths of the parts.
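The part-by-part procedure described above may be sketched as follows, under several assumptions that are not taken from the cited document: each part is a single byte, the per-part encryption function is a toy keyed permutation of byte values built with Python's `random` module (a placeholder for a real cipher, with no security), and the syntax rule is the byte constraint mentioned earlier for JPEG 2000. All function names are hypothetical.

```python
import random


def make_byte_permutation(key: int):
    """Toy keyed permutation of byte values and its inverse (not secure)."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    inverse = [0] * 256
    for value, image in enumerate(table):
        inverse[image] = value
    return table, inverse


def syntax_ok(data) -> bool:
    """No byte in 0x90..0xFF after 0xFF, and no trailing 0xFF."""
    if data and data[-1] == 0xFF:
        return False
    return all(not (p == 0xFF and c >= 0x90) for p, c in zip(data, data[1:]))


def _walk(data, table, indices):
    """Transform each part in turn, repeating until the whole sequence is compliant."""
    if not syntax_ok(data):
        raise ValueError("input sequence must satisfy the syntax rules")
    state = list(data)
    for i in indices:
        state[i] = table[state[i]]
        while not syntax_ok(state):
            state[i] = table[state[i]]
    return bytes(state)


def cycle_encrypt(data: bytes, key: int) -> bytes:
    table, _ = make_byte_permutation(key)
    return _walk(data, table, range(len(data)))


def cycle_decrypt(data: bytes, key: int) -> bytes:
    _, inverse = make_byte_permutation(key)
    return _walk(data, inverse, reversed(range(len(data))))
```

The repetition terminates because the per-part function is a permutation and the sequence is compliant before each part is processed, so repeated application eventually reaches a compliant value again. Decryption meets the same intermediate sequences in reverse order and therefore stops at the original value of each part.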
The paper of J. Golić, “Modes of operation of stream ciphers” Proceedings of Selected Areas in Cryptography—SAC 2000, Lecture Notes in Computer Science, vol. 2012, pp. 233-247, 2001, describes several generic constructions for converting conventional stream ciphers as keystream generators into block ciphers, keyed hash functions, and hash functions. The starting point, common to all the constructions, is to modify any conventional keystream generator, which produces a keystream sequence independently of the plaintext sequence, by introducing the current plaintext symbol into the next-state function in order to produce the next keystream symbol to be combined with the next plaintext symbol into the next ciphertext symbol. The decryption is performed in the same direction, but the reconstructed next keystream symbol is inversely combined with the next ciphertext symbol into the next plaintext symbol, and so on. In this way, the keystream sequence becomes plaintext dependent and hence potentially useful for obtaining (keyed) hash functions, to be used for message authentication, and block ciphers, to be used for message encryption, also in the ECB mode of operation. Such an unconventional stream cipher is called a stream cipher with plaintext memory.
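The common starting point described above, a keystream generator whose next state depends on the current plaintext symbol, can be illustrated by the following toy sketch. It is not one of the constructions of the cited paper: the use of SHA-256 as the next-state function, the byte-wise addition modulo 256 as the combining operation, and all function names are assumptions made only for illustration.

```python
import hashlib


def _initial_state(key: bytes) -> bytes:
    """Toy initialization of the internal state from the secret key."""
    return hashlib.sha256(b"init" + key).digest()


def _next_state(state: bytes, plaintext_symbol: int) -> bytes:
    """Next-state function that takes the current plaintext symbol as input."""
    return hashlib.sha256(state + bytes([plaintext_symbol])).digest()


def pm_encrypt(key: bytes, plaintext: bytes) -> bytes:
    state = _initial_state(key)
    ciphertext = bytearray()
    for p in plaintext:
        keystream_symbol = state[0]
        ciphertext.append((p + keystream_symbol) % 256)
        state = _next_state(state, p)  # plaintext enters the state
    return bytes(ciphertext)


def pm_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    state = _initial_state(key)
    plaintext = bytearray()
    for c in ciphertext:
        keystream_symbol = state[0]
        p = (c - keystream_symbol) % 256  # inverse combining
        plaintext.append(p)
        state = _next_state(state, p)  # same update as in encryption
    return bytes(plaintext)
```

In this single pass, each ciphertext symbol depends only on the key and on the plaintext symbols up to its own position, so the sketch shows the plaintext-dependent keystream but does not by itself make every ciphertext symbol depend on all plaintext symbols.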