Field of the Invention
This invention, generally, relates to compressing encrypted data, and more specifically, to compressing encrypted data without using or requiring knowledge of the encryption key.
Background Art
Traditionally in communication systems, data from a source is first compressed and then encrypted before it is transmitted over a channel to the receiver. While in many cases this approach is befitting, there exist scenarios where there is a need to reverse the order in which data encryption and compression are performed. Consider for instance a network of low-cost sensor nodes that transmit sensitive information over the internet to a recipient.
The sensor nodes need to encrypt data to hide it from potential eavesdroppers, but they do not necessarily want to compress it as that would require additional hardware and thus higher implementation cost. On the other hand, the network operator that is responsible for transferring the data to the recipient wants to compress the data to maximize the utilization of its resources. It is important to note that the network operator is not trusted and hence does not have access to the key used for encryption and decryption of data. If it had the key, it could simply decrypt data, compress and encrypt again.
Related work in the area of compression and encryption can be classified into three main categories. The first category includes systems and methods for compressing encrypted data i.e. systems in which compression is performed prior to encryption. This category includes the systems/methods described in U.S. Pat. No. 6,122,378 (‘Data compression/encryption method and system’), U.S. Patent Application Publication No. 2007/0263876A1 (‘In-memory compression and encryption’) and U.S. Pat. No. 7,295,673 (‘Method and system for securing compressed digital video’). The second category includes systems and methods for simultaneously performing compression and encryption, wherein the encryption key (or a constant value, repeating cipher-text) is assumed known during compression. This category includes the systems/methods described in U.S. Patent Application Publication No. 2004/0136566A1 (‘Method and apparatus for encrypting and compressing multimedia data’), U.S. Pat. No. 6,122,379 (‘Method and apparatus for performing simultaneous data compression and encryption’), and U.S. Patent Application Publication No. 2008/0162521 (′Compression of encrypted data in database management systems). The main shortcoming of the systems in these two categories is that they do not allow encryption after compression and without knowledge of the encryption key.
The third category includes the systems/methods described in the papers ‘On Compressing Encrypted Data,’ M. Johnson, P. Ishwar, V. Prabhakaran, D. Schonberg and K. Ramchandran, IEEE Transactions on Signal Processing, October 2004 (Johnson et al. I), and ‘On Compressing Encrypted Data without the Encryption Key’, M. Johnson, D. Wagner and K. Ramchandran, Theory of Cryptography Conference 2004. In these methods Slepian-Wolf coding principles are used to compress data encrypted with a one-time pad and with a stream cipher. These procedures, however, do not do compression of data encrypted with block ciphers in chaining modes, which are commonly used for most data.
Block ciphers with a fixed key are a bijection, therefore the entropy of an input is the same as that of the output. It follows that it is theoretically possible to compress the source to the same level as before encryption. However, in practice, encrypted data appears to be random and the conventional compression techniques do not yield desirable results. Consequently, it was long believed that encrypted data is practically incompressible. In the above-mentioned Johnson et al. I paper, the authors break that paradigm and show that the problem of compressing one-time pad encrypted data translates to the problem of compressing correlated sources, which was solved by Slepian and Wolf (see D. Slepian and J. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Info. Theory, vol. 19, pp. 471-480, July 1973) and for which practical and efficient codes are known. Compression is practically achievable due to a simple symbol-wise correlation between the key (one-time pad) and the encrypted message. However, when such correlation is more complex, as is the case with block ciphers, the approach to Slepian-Wolf coding utilized in Slepian et al. is not directly applicable.
Therefore, a need exists for a method for compressing encrypted data without knowledge of the compression key, wherein the encryption of the data has been performed by one of the popularly used block ciphers.