The present invention relates to compression and encryption of data and more particularly to a data compression/encryption method and a system therefor which are designed to enhance the processing efficiency while reducing power consumption in performing compression processing as well as encryption processing on data.
With the rapid increase in the utilization of communication facilities, there is a growing trend toward compressing and encrypting (hereinafter referred to as compression/encryption or compressing/encrypting for short) data with a view to enhancing the efficiency of communications and preventing unauthorized acquisition and falsification of the data being communicated. In particular, in the case of wireless communications, where the band is narrow and interception is easy, compression/encryption is indispensable. On the other hand, in the portable computers that have become highly popular in recent years, battery life is an important problem, giving rise to a demand for data processing methods suited to saving electric energy. For these reasons, compression/encryption techniques that can ensure high efficiency and low power consumption are in demand.
In the conventional compression/encryption techniques known heretofore, data is first compressed and then written into a secondary storage such as a hard disk drive, whereupon the compressed data is read out from the secondary storage to be encrypted. In the encryption processing, the same encryption processing as that for non-compressed data is adopted without taking into consideration the fact that the data is compressed. Further, in the processing for restoring the original data from the compressed/encrypted data (this process will hereinafter be referred to as the decryption/decompression processing), the data is first decrypted and then written into the secondary storage, whereupon the decrypted data is read out from the secondary storage to be restored or decompressed.
In the conventional compression/encryption technique described above, the compression processing and the encryption processing are executed independently of each other, without exploiting the possibility that the processing efficiency can be enhanced by integrating or combining the compression processing and the encryption processing. Besides, because the compression processing and the encryption processing on one hand and the decryption processing and the decompression processing on the other hand are interlocked by way of the secondary storage, not only is time taken for writing/reading data to/from the secondary storage, but a large amount of electric energy is also consumed by the read/write operations.
An object of the present invention is to provide a data compression/encryption method and a system therefor which are capable of enhancing the efficiency of the processing involved in the compression/encryption as well as the decryption/decompression while reducing the power consumption, by combining the compression processing and the encryption processing and thereby making it unnecessary to interlock the compression processing and the encryption processing on one hand and the decryption processing and the decompression processing on the other hand by way of the secondary storage.
In order to make unnecessary the interlock between the compression processing and the encryption processing on one hand and between the decryption processing and the decompression processing on the other hand by way of the secondary storage, an amount of data processed at a time is so limited that a memory capacity required for performing the compression/encryption processings and the decryption/decompression processings does not exceed the capacity of a main storage of a relevant information processing system. In other words, the amount of the data processed at a time is partitioned such that a memory capacity required for performing the compression/encryption processings and the decryption/decompression processings does not exceed the capacity of the main storage of the relevant information processing system, and then, the data compression/encryption processings are executed sequentially and repetitively on a partitioned-data basis to thereby compress and encrypt one set of data wholly.
More specifically, according to a first method of the present invention, an encrypting means and a decrypting means are provided in an information processing system which includes a compressing means and a decompressing means for the data, wherein the amount of data to be compressed and encrypted or the amount of data to be decrypted and decompressed is so set that the memory capacity required upon application of the compressing means and the encrypting means to the data as well as the memory capacity required upon application of a decrypting means and the decompressing means to the data does not exceed the capacity of the main storage incorporated in the relevant information processing system, and the application of the compressing means and the encrypting means or the application of the decrypting means and the decompressing means to the data of the amount as set is repeated to thereby compress and encrypt or decrypt and decompress the whole data.
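The first method can be illustrated by the following minimal sketch, which is not the claimed implementation. It assumes that the standard zlib module stands in for the compressing means, that a toy XOR cipher (not secure) stands in for the encrypting means, and that the hypothetical parameter chunk_size has been chosen so that one partition and its intermediate results fit in the main storage.

```python
import zlib

def toy_encrypt(block: bytes, key: bytes) -> bytes:
    """Placeholder XOR cipher (NOT secure); stands in for the encrypting means."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(block))

# XOR with the same key is its own inverse.
toy_decrypt = toy_encrypt

def compress_encrypt(data: bytes, key: bytes, chunk_size: int) -> list[bytes]:
    """Compress and encrypt the data one partition at a time, entirely in memory."""
    blocks = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # The compressed partition is handed straight to the cipher;
        # no interim result is written to secondary storage.
        blocks.append(toy_encrypt(zlib.compress(chunk), key))
    return blocks

def decrypt_decompress(blocks: list[bytes], key: bytes) -> bytes:
    """Reverse the processing partition by partition."""
    return b"".join(zlib.decompress(toy_decrypt(b, key)) for b in blocks)
```

In this sketch each partition is compressed independently, so the partition size trades compression ratio against peak memory use.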
Next, description will be directed to the enhancement of the processing efficiency owing to the combination of the compression processing and the encryption processing. As is described in “DATA COMPRESSION HANDBOOK” published by TOPPAN-PUB. (1994), pp. 21-247, there are known undermentioned methods (1) to (3) as the conventional compressing methods. In practical data compression programs, combination of the methods (1) and (3) or alternatively combination of the methods (2) and (3) is adopted in many cases.
(1) Method based on a fixed statistical model such as e.g. fixed Huffman coding.
(2) Method based on an adaptive statistical model such as e.g. adaptive Huffman coding.
(3) Dictionary base method such as e.g. LZ77 and LZ78.
The compression/encryption method according to the present invention is applicable to the above-mentioned methods (1) and (2). Furthermore, although the present invention cannot be applied directly to the method (3), it is applicable to the combination of the methods (1) and (3) or the combination of the methods (2) and (3).
In the first place, description will be made of incorporation of the encryption processing in the compression processing based on the fixed statistical model. According to the compression method based on the fixed statistical model, the occurrence probabilities of the individual symbols are determined by checking the frequencies at which the individual symbols appear in the data. On the basis of the occurrence probabilities of the symbols, correspondence information between each symbol and a bit string corresponding thereto (a Huffman tree in the case of Huffman coding) is generated. In that case, a symbol appearing with higher probability is assigned a shorter bit string. Subsequently, the symbols contained in the data are translated into the corresponding bit strings, thereby compressing the data. In the decompression processing, the compressed data and the symbol-bit string correspondence information (or the occurrence frequencies or occurrence probabilities of the symbols for generating the symbol-bit string correspondence information) are received, and the original data is restored through reverse translation of the bit strings into the symbols.
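The compression based on the fixed statistical model described above can be sketched as follows for the case of Huffman coding. This is an illustrative sketch only; the function names are hypothetical, and bit strings are represented as Python strings of "0" and "1" for readability rather than packed into bytes.

```python
import heapq
from collections import Counter

def huffman_code(data: bytes) -> dict[int, str]:
    """Build symbol-to-bit-string correspondence information from symbol frequencies."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: partial bit string}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        # Merge the two least probable subtrees, prefixing "0" and "1".
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]

def encode(data: bytes, code: dict[int, str]) -> str:
    """Translate each symbol into its bit string."""
    return "".join(code[s] for s in data)

def decode(bits: str, code: dict[int, str]) -> bytes:
    """Reverse-translate bit strings into symbols using the same correspondence."""
    inverse = {b: s for s, b in code.items()}
    out, cur = bytearray(), ""
    for bit in bits:
        cur += bit
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return bytes(out)
```

For the input b"abracadabra", the most frequent symbol "a" receives a one-bit string and the remaining symbols receive three-bit strings, so the eleven symbols occupy 23 bits in total.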
In the compression method based on the fixed statistical model as mentioned above, the data cannot be restored unless the symbol-bit string correspondence information can be made use of. In this connection, it is noted that by encrypting only the symbol-bit string correspondence information, the intrinsic aim of the encryption can be achieved. Because the amount of the symbol-bit string correspondence information is extremely small when compared with that of the compressed data, overheads involved in the encryption processing as well as the corresponding decryption processing can be reduced remarkably when compared with the conventional method of encrypting the compressed data themselves. Thus, according to a second method taught by the present invention, the symbol-bit string correspondence information is encrypted in the compression processing based on the fixed statistical model.
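The idea of the second method, in which only the small correspondence information is encrypted, may be sketched as below. The sketch assumes that the occurrence frequencies are what is transmitted for regenerating the correspondence information, uses a hypothetical 5-byte record layout (1-byte symbol, 4-byte count), and uses a toy XOR cipher (not secure) in place of a real encrypting means.

```python
import struct
from collections import Counter

def toy_encrypt(block: bytes, key: bytes) -> bytes:
    """Placeholder XOR cipher (NOT secure); applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(block))

def protect_table(data: bytes, key: bytes) -> bytes:
    """Serialize the symbol frequency table and encrypt only that table."""
    freq = Counter(data)
    # Hypothetical layout: one record per distinct symbol, 1-byte symbol + 4-byte count.
    table = b"".join(struct.pack(">BI", s, n) for s, n in sorted(freq.items()))
    return toy_encrypt(table, key)

def recover_table(blob: bytes, key: bytes) -> dict[int, int]:
    """Decrypt the table and rebuild the frequencies needed to regenerate the code."""
    table = toy_encrypt(blob, key)
    return {s: n for s, n in struct.iter_unpack(">BI", table)}
```

With this layout the encrypted table never exceeds 256 records of 5 bytes each, regardless of how large the compressed data is.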
With only the encryption of the symbol-bit string correspondence information, immunity of the encrypted data (degree of difficulty in cryptanalysis) may be insufficient in some cases. In other words, for the encryption described above, there may be conceived unauthorized cryptanalysis methods mentioned below.
(a) By acquiring a plurality of similar data and corresponding compressed data and comparing them, the trend underlying the correspondences established between the symbols and the bit strings according to the above-mentioned scheme is estimated.
(b) When a same symbol and/or a same symbol string occur repeatedly in one data, correspondence between the symbol and the bit string can be estimated by analyzing the repeating pattern contained in the corresponding compressed data.
To prevent the cryptanalysis mentioned in paragraph (a) above, measures may be taken such that the correspondences between the symbols and the bit strings become utterly different even for data differing only a little. In the case of the compression based on the fixed statistical model, the length of a bit string corresponding to a symbol is determined in dependence on the occurrence probability of the symbol, as described previously. However, a bit string of a given length has a degree of freedom with respect to the arrangement of “0” and “1”. By way of example, assuming that the length of the bit string corresponding to the symbol a has been determined to be “4”, there exists a degree of freedom in the bit arrangement such as “0000”, “0101”, “1101”, etc. Accordingly, such an arrangement is adopted that the bit string corresponding to the symbol is selected from a plurality of possible candidates such as those mentioned above in dependence on an accidental or probabilistic factor such as a random number. By adopting this method, the correspondences between the symbols and the bit strings become utterly different upon every computation for the data compression, even for data differing only a little (and even for identical input data). Consequently, it becomes difficult to estimate the trend in establishing the correspondence. Thus, according to a third method of the present invention, correspondence between the symbol and the bit string is realized by resorting to computation based on accident or probability in the second method described hereinbefore.
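One conceivable way of exploiting the degree of freedom described above is to decide at random, at every merge step of the Huffman construction, which subtree receives “0” and which receives “1”. The sketch below illustrates this under that assumption; the function name is hypothetical, and random.Random is used only so that the example is reproducible (a practical system would presumably draw from a cryptographically secure source such as random.SystemRandom).

```python
import heapq
import random
from collections import Counter

def randomized_huffman_code(data: bytes, rng: random.Random) -> dict[int, str]:
    """Huffman code whose bit-string lengths follow the symbol frequencies
    but whose 0/1 labelling is chosen by a random number at each merge."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): rng.choice("01")}
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        # Coin flip: swap which subtree is labelled "0" and which "1".
        if rng.random() < 0.5:
            c0, c1 = c1, c0
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]
```

Because only the labelling is randomized, the length of every bit string, and hence the compression ratio, is the same as for the non-randomized code.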
The cryptanalysis mentioned previously in the paragraph (b) can be prevented by changing a method of establishing the correspondences between the symbols and the bit strings in the course of the data processing. Thus, according to a fourth method of the present invention, procedure for establishing correspondences between the symbols and the bit strings is changed in the course of the data processing in the second method described hereinbefore.
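The fourth method may be illustrated by the sketch below, which is one possible realization and not necessarily the one intended. It assumes a small illustrative prefix code and assumes that the correspondence is changed before each symbol by complementing every bit string whenever a pseudo-random bit, derived from a seed shared by both sides, is 1. Complementing all bit strings of a prefix code yields another prefix code of the same lengths, so decoding remains unambiguous.

```python
import random

# Illustrative prefix code (an assumption, not taken from the source).
CODE = {"a": "0", "b": "10", "c": "11"}
FLIP = str.maketrans("01", "10")

def encode(text: str, seed: int) -> str:
    """Emit each bit string either as is or complemented, per a shared pseudo-random bit."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        word = CODE[ch]
        out.append(word.translate(FLIP) if rng.getrandbits(1) else word)
    return "".join(out)

def decode(bits: str, seed: int) -> str:
    """Replay the same pseudo-random sequence to undo the changes symbol by symbol."""
    rng = random.Random(seed)
    inverse = {b: s for s, b in CODE.items()}
    out, pos = [], 0
    while pos < len(bits):
        inverted = rng.getrandbits(1)
        cur = ""
        while cur not in inverse:
            bit = bits[pos]
            pos += 1
            cur += bit.translate(FLIP) if inverted else bit
        out.append(inverse[cur])
    return "".join(out)
```

With this arrangement a run of identical symbols no longer produces a run of identical bit strings in the output.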
In order to intensify the immunity of the encryption described above, a second encrypting means may be provided for encrypting further the encrypted data. To this end, however, it is sufficient to implement the second encrypting means such that more simplified processing as compared with that of the conventional encrypting means for the simple compressed data can be performed for the reasons mentioned below.
(a) The second encrypting means is destined for further encrypting the already encrypted data.
(b) Measures for coping with the cryptanalysis method such as comparison of the similar data, analysis of the repeating pattern and the like have already been taken.
Thus, according to a fifth method of the present invention, a simplified encryption is additionally carried out for the encrypted data in the second to fourth methods, respectively.
Next, description will turn to incorporation of the encryption processing in the aforementioned method (2) based on the adaptive statistical model. In the data compression based on the adaptive statistical model, predicted values of the occurrence probabilities of the symbols are employed instead of determining the occurrence probabilities of symbols by previously examining or checking the data, differing from the data compression based on the fixed statistical model. As the initial predicted values of the occurrence probabilities, it is presumed, for example, that the occurrence probabilities of all the symbols are equal. On the basis of such predicted value, the symbols are translated into bit strings, respectively, while the predicted value is modified (by incrementing the predicted value of the occurrence probability for the symbol processed while decrementing the predicted values of the occurrence probabilities for the other symbols). By repeating the translation of the symbol into the bit string and the modification of the predicted value of the occurrence probability, the whole data is compressed. In the decompression processing, the compressed data is received, whereon the data is decompressed or restored by repeating the reverse translation of the bit strings to the symbols by using the same initial predicted value for the occurrence probability as used in the compression processing and the modification of the predicted value of the occurrence probability.
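The adaptive procedure described above can be sketched as follows. For brevity the sketch keeps a count per symbol (all counts starting at 1, i.e. equal predicted probabilities) and rebuilds a Huffman code from the counts before every symbol; this is a simplification and not an incremental adaptive Huffman algorithm. The function names are hypothetical, and the alphabet is assumed to contain at least two symbols.

```python
import heapq

def code_from_counts(counts: dict[str, int]) -> dict[str, str]:
    """Build a Huffman code from the current predicted frequencies."""
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]

def adaptive_encode(text: str, alphabet: str) -> str:
    counts = {s: 1 for s in alphabet}  # initial prediction: all symbols equally likely
    out = []
    for ch in text:
        out.append(code_from_counts(counts)[ch])
        counts[ch] += 1  # raise the prediction for the symbol just processed
    return "".join(out)

def adaptive_decode(bits: str, alphabet: str) -> str:
    counts = {s: 1 for s in alphabet}  # same initial prediction as the encoder
    out, pos = [], 0
    while pos < len(bits):
        inverse = {b: s for s, b in code_from_counts(counts).items()}
        cur = ""
        while cur not in inverse:
            cur += bits[pos]
            pos += 1
        out.append(inverse[cur])
        counts[inverse[cur]] += 1  # mirror the encoder's update
    return "".join(out)
```

Incrementing one count lowers the relative predicted probability of every other symbol, which corresponds to the modification of the predicted values described above. No correspondence information is transmitted; the decoder reconstructs it step by step from the data it has already decoded.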
In the case of the above-mentioned compression method based on the adaptive statistical model, when a certain portion of the compressed data is to be decompressed, it is necessary to decompress all the data portions preceding that portion in order to determine the predicted values of the occurrence probabilities at that point. Accordingly, encryption of a leading portion of the compressed data is equivalent to the encryption of all the compressed data even when the succeeding data portion is not encrypted. Thus, according to a sixth method taught by the present invention, a leading portion of the compressed data is encrypted in the compression processing based on the adaptive statistical model.
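A minimal sketch of the sixth method is given below. It assumes that the compressed data is available as a byte string produced by an adaptive compressor, that the hypothetical parameter n gives the length of the leading portion to be encrypted, and that a toy XOR cipher (not secure) stands in for the encrypting means.

```python
def toy_encrypt(block: bytes, key: bytes) -> bytes:
    """Placeholder XOR cipher (NOT secure); applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(block))

def encrypt_leading(compressed: bytes, key: bytes, n: int) -> bytes:
    """Encrypt only the first n bytes of the compressed data; the rest is left as is."""
    return toy_encrypt(compressed[:n], key) + compressed[n:]

# Because the toy cipher is its own inverse, decryption is the same operation.
decrypt_leading = encrypt_leading
```

The cost of the cipher is thus proportional to n rather than to the length of the whole compressed data.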
Similarly to the case of the fixed statistical model, there may be conceived for the method described above such unauthorized cryptanalysis methods as mentioned below.
(a) By acquiring a plurality of similar data and corresponding compressed data and comparing them, the trend of correspondence established between the symbols and the bit strings according to the scheme described just above is estimated.
(b) In general, because a predicted value of occurrence probability for the symbol does not change sharply in the course of the data processing, the correspondence as established between the symbol and the bit string will not change sharply either. Consequently, when a same symbol and/or a same symbol string occur repeatedly in one data, the similar patterns make appearance repeatedly in the compressed data. By analyzing the repetitive pattern, the correspondence between the symbol and the bit string is estimated.
The unauthorized cryptanalysis mentioned above in the paragraph (a) can be coped with by the third method mentioned previously, that is, by establishing the correspondence between the symbol and the bit string by the computation based on accident or probability, similarly to the case of the method based on the fixed statistical model. The cryptanalysis mentioned above in the paragraph (b) can be prevented by changing the procedure of establishing the correspondences between the symbols and the bit strings in the course of the data processing on the basis of information other than the predicted value of the occurrence probability for the symbol. The length of a bit string corresponding to a symbol is determined in dependence on the predicted value of the occurrence probability for the symbol, as in the case of the method based on the fixed statistical model. However, in a bit string of a given length, there is a degree of freedom with respect to the arrangement of “0” and “1”. Accordingly, the procedure for setting correspondences between the symbols and the bit strings may be changed independently of the occurrence probability of the symbol in the course of the data processing. Thus, according to a seventh method taught by the present invention, the procedure for establishing correspondences between the symbols and the bit strings is changed on the basis of information other than the occurrence probabilities of the symbols.
According to the first method of the present invention described hereinbefore, the amount of the data to be compressed and encrypted or the amount of the data to be decrypted and decompressed is so set that the memory capacity required for application of the compressing means and the encrypting means to the data as well as the memory capacity required for application of the decrypting means and the decompressing means to the data does not exceed the capacity of the main storage incorporated in the relevant information processing system. Thus, the compression/encryption processing as well as the decryption/decompression processing can be performed within the main storage, whereby the write/read operations of the interim results to/from the secondary storage are rendered unnecessary, which in turn means that improvement of the efficiency and reduction of the power consumption can be achieved.
According to the second method, the symbol-bit string correspondence information is encrypted in the compression processing based on the fixed statistical model, whereby the compressed data is rendered unable to be restored, which brings about the same effect as the encryption of the compressed data itself. The amount of the compressed data is usually several kilobytes to several megabytes. By contrast, the amount of the symbol-bit string correspondence information is on the order of the number of different symbols multiplied by one byte, which is negligibly small. Thus, according to this method, efficiency of the encryption as well as the corresponding decryption can be enhanced.
According to the third method, the correspondence between the symbol and the bit string is determined by the computation which is based on accident or probability. Consequently, the correspondences between the symbols and the bit strings become utterly different every time the processing according to the instant method is activated. Thus, difficulty will be encountered in attempting to estimate the trend of the correspondences established between the symbols and the bit strings, which means that the degree of immunity of the encrypted data is intensified.
According to the fourth method, the procedure of setting the correspondences between the symbols and the bit strings is changed in the course of the data processing. Consequently, no repetitive pattern can make appearance in the encrypted data. Thus, the immunity of the encrypted data can be intensified.
According to the fifth method, the data encrypted by the method(s) mentioned above undergoes additional encryption by the second encrypting means, whereby the immunity of the encrypted data is intensified. Owing to such duplicate encryption, it is sufficient to implement the second encrypting means in a simple structure, whereby the encryption efficiency can be enhanced. Furthermore, because the decryption becomes simple as the corresponding encryption is simple, efficiency of the decryption can be enhanced as well.
According to the sixth method, the leading portion or part of the compressed data is encrypted in the compression processing based on the adaptive statistical model, whereby prediction of the occurrence probabilities of the symbols in the trailing data portion is rendered difficult, which means equivalently that the whole compressed data are eventually encrypted. Owing to the encryption only of the leading data portion, enhanced efficiency can be realized when compared with the conventional method of encrypting the whole compressed data with the efficiency of the corresponding decryption being improved as well.
According to the seventh method, the procedure of establishing correspondences between the symbols and the bit strings is abruptly changed on the basis of information other than the predicted values of the occurrence probabilities of the symbols. Thus, the repetitive pattern in the data can not make appearance in the encrypted data. Consequently, immunity of the encrypted data is intensified.