The present invention relates to a technique for ensuring security of digital signature, data encryption, etc. in a computer network, and particularly to a method of converting a message to a hash value which is difficult to inversely convert.
A public key cipher system has been known as an encryption system for data such as electronic mail which is sent and received through a network. The processing flow based on the public key cipher system is as follows:
(1) A user distributes in advance, to prospective transmitters, a public key for encrypting electronic mail to be sent to the user.
(2) A transmitter who wishes to send the electronic mail to the user encrypts the electronic mail by using the public key which is distributed from the user who is the intended recipient of the electronic mail, and then transmits the encrypted electronic mail to the destination of the electronic mail.
(3) The user decrypts the encrypted electronic mail by using the user's own secret key (having a numeric value different from the public key) when receiving the encrypted electronic mail which is encrypted by the public key distributed by himself/herself.
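Steps (1) to (3) can be illustrated with a minimal sketch in Python. It assumes textbook RSA with very small primes purely for illustration; the specification does not name a particular public key cipher, and the names `encrypt`, `decrypt`, `public_key` and `secret_key` are hypothetical. Such small numbers offer no security.

```python
# Toy textbook RSA with tiny primes (an assumption for illustration only;
# the text does not prescribe any particular public key cipher).
p, q = 61, 53
n = p * q                    # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # secret exponent (requires Python 3.8+)

public_key = (e, n)          # step (1): distributed to transmitters in advance
secret_key = (d, n)          # kept by the user; differs from the public key


def encrypt(m, key):
    """Step (2): the transmitter encrypts with the recipient's public key."""
    exponent, modulus = key
    return pow(m, exponent, modulus)


def decrypt(c, key):
    """Step (3): the recipient decrypts with his/her own secret key."""
    exponent, modulus = key
    return pow(c, exponent, modulus)


message = 65                 # a message encoded as an integer smaller than n
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, secret_key)
```

Here `recovered` equals `message`, while `ciphertext` can be produced by anyone who holds only the public key.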
This public key cipher system has been applied not only to a data encryption technique, but also to a digital signature technique, which is a technique for electronically verifying the legitimacy of a contract or the like in electronic commerce using a network.
However, a lot of time is needed if a digital signature for a long message is generated by using only the public key cipher in the digital signature technique. Therefore, there has been proposed a method of temporarily compressing a message to shortened data and then generating a digital signature for the compressed data.
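The idea of compressing first and signing afterwards can be sketched as follows. This is only an illustration under stated assumptions: SHA-1 from Python's `hashlib` stands in for the compression step, the signature is textbook RSA with toy parameters, and the names `compress`, `sign` and `verify` are hypothetical.

```python
import hashlib

# Toy RSA parameters (illustrative assumption; far too small to be secure).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # secret exponent (Python 3.8+)


def compress(message: bytes) -> int:
    """Compress a message of any length to a short value smaller than n."""
    digest = hashlib.sha1(message).digest()
    return int.from_bytes(digest, "big") % n


def sign(message: bytes) -> int:
    """Sign the compressed value instead of the whole message."""
    return pow(compress(message), d, n)


def verify(message: bytes, signature: int) -> bool:
    """Recompute the compressed value and compare it with the opened signature."""
    return pow(signature, e, n) == compress(message)


long_message = b"contract clause " * 10000
signature = sign(long_message)
is_valid = verify(long_message, signature)
```

However long the message is, the public key operation is applied only once, to the short compressed value.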
Here, unlike normal data compression, this type of data compression need not allow the original message to be restored from the compressed data. It is necessary, however, to compress the data so that the compressed data has a kind of encryption characteristic. A hash function has been proposed to implement such compression.
A message for an electronic commerce document or the like, for example, Document A: "To Taro & Co. Esq., I will purchase a car (catalog No. 1443) at one million and forty thousand yen. Mar. 10, 1996 Yoshiura" is input data to the hash function. There is no upper limit to the length of the input data.
The hash function subjects the input data to processing like encryption conversion to compress the input data to data having a fixed short length. For example, hash value: 283AC9081E83D5B28977 is an output of the hash function.
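The fixed-length behaviour can be observed with a short Python sketch. It assumes SHA-1 from the standard `hashlib` module as a stand-in hash function, so its 160-bit output will not reproduce the illustrative hash value quoted above; `digest_hex` is a hypothetical helper name.

```python
import hashlib

document_a = (
    "To Taro & Co. Esq., I will purchase a car (catalog No. 1443) "
    "at one million and forty thousand yen. Mar. 10, 1996 Yoshiura"
)


def digest_hex(message: str) -> str:
    """Return the SHA-1 hash value of a message as upper-case hexadecimal."""
    return hashlib.sha1(message.encode("utf-8")).hexdigest().upper()


hash_of_document = digest_hex(document_a)
# A message 1000 times longer still yields a hash value of the same length.
hash_of_long_input = digest_hex(document_a * 1000)
```

Both results consist of 40 hexadecimal digits (160 bits), whatever the length of the input.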
This hash value is called a message digest or a fingerprint, and ideally substantially only one hash value exists for one input data (message) in the world. In order to guarantee that "substantially only one exists in the world", it is generally recognized that the length of the hash value must be set to at least about 128 bits. More specifically, the hash function must have the following characteristics.
(1) One-way Property
When an output value of a hash function is given, it must be computationally difficult to determine another message which brings the same output value as the above output value.
For example, it is assumed that the birthday of Kazuo is February 22nd. In order to search for another person whose birthday is coincident with Kazuo's birthday, it is statistically sufficient to investigate the birthdays of about 183 (365/2) persons.
The same holds when the person is replaced by a message and the birthday is replaced by a hash value. That is, if the length of the hash value is set to 160 bits, the hash value can take any one of 2.sup.160 possible values (i.e., the total number of possible hash values is equal to 2.sup.160). In order to find another message having the same hash value as a given message, it is necessary to investigate about 2.sup.160 /2 (=2.sup.159) messages on average, and this is computationally difficult.
(2) Collision Free Property
The message and the hash value may be any values (i.e., no limitation is imposed on the message and the hash value). At any rate, it must be computationally difficult to find out two different messages which have the same hash value.
For example, to find any two persons having the same birthday, the birthdays of only about 24 persons (on the order of 365.sup.1/2) need to be investigated on average.
This also holds when the person is replaced by the message and the birthday is replaced by the hash value. That is, if the length of the hash value is set to 160 bits, in order to find two different messages (any messages are possible) having the same hash value, it is necessary to investigate a set of about 2.sup.160/2 =2.sup.80 messages on average. This number is smaller than that in the case of the one-way property, but the search is still computationally difficult.
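The square-root effect can be observed on a deliberately shortened hash value. The sketch below assumes SHA-1 truncated to 24 bits, a length chosen only so that the search finishes quickly; `truncated_hash` and `find_collision` are hypothetical names.

```python
import hashlib


def truncated_hash(message: bytes, bits: int) -> int:
    """Keep only the leading `bits` bits of the 160-bit SHA-1 hash value."""
    digest = hashlib.sha1(message).digest()
    return int.from_bytes(digest, "big") >> (160 - bits)


def find_collision(bits: int):
    """Try messages one by one until two of them share a hash value."""
    seen = {}                        # hash value -> first message producing it
    counter = 0
    while True:
        message = b"message-%d" % counter
        value = truncated_hash(message, bits)
        if value in seen:
            return seen[value], message, counter + 1
        seen[value] = message
        counter += 1


first, second, tried = find_collision(24)
```

With a 24-bit hash value, `tried` is typically a few thousand, i.e. on the order of 2.sup.12 rather than 2.sup.24. For a full 160-bit hash value the same search would require about 2.sup.80 messages.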
Various methods have been proposed to implement a hash function having the above characteristics, and at present a method of repeating character-substitution and transposition to obtain hash values has mainly been used. The following paper 1 discloses the processing principle of the method:
ISO/IEC 10118-2, "Information technology--Security Techniques--Hash-functions: Part 2: Hash-functions using an n-bit block encryption algorithm" (1994)
The hash function as disclosed in the paper 1 will be described with reference to FIG. 27.
The left side of FIG. 27 is a diagram showing the processing flow of a general hash function, and the right side of FIG. 27 is a diagram showing the processing flow when an encryption function such as DES (Data Encryption Standard) is used for character-substitution/transposition repeating processing 3005 shown in the left side of FIG. 27.
As shown at the left side of FIG. 27, a message 3001 to be compressed is divided into a first section P.sub.1 3002, a second section P.sub.2 3003, . . . , for every predetermined length, and these sections are successively input to the hash function 3007.
The hash function 3007 subjects the first section P.sub.1 3002 to the character-substitution/transposition repeating processing 3005 by using an initial value 3004 as a parameter, thereby calculating a first intermediate output.
Subsequently, the hash function subjects the second section P.sub.2 3003 to the character-substitution/transposition repeating processing 3005 by using the first intermediate output as a parameter (in place of the initial value 3004), thereby calculating a second intermediate output.
The above processing is repeated until the data of the final section is input, and the finally calculated intermediate output is used as a hash value Hash 3006.
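The chaining shown at the left side of FIG. 27 can be sketched as follows. The sketch makes several assumptions that are not taken from paper 1: sections of 8 bytes, zero padding of the final section, and a truncated SHA-1 call as a stand-in for the character-substitution/transposition repeating processing 3005. The names `mix`, `split_sections` and `iterated_hash` are hypothetical.

```python
import hashlib

SECTION_LEN = 8   # bytes per section P_i (assumed value)
OUTPUT_LEN = 8    # bytes per intermediate output (assumed value)


def mix(parameter: bytes, section: bytes) -> bytes:
    """Stand-in for processing 3005: combine the parameter with one section."""
    return hashlib.sha1(parameter + section).digest()[:OUTPUT_LEN]


def split_sections(message: bytes) -> list:
    """Divide the message into sections P_1, P_2, ... of a fixed length.

    The final section is zero-padded, a simplification that lets messages
    differing only in trailing zero bytes collide.
    """
    padded = message + b"\x00" * (-len(message) % SECTION_LEN)
    return [padded[i:i + SECTION_LEN] for i in range(0, len(padded), SECTION_LEN)]


def iterated_hash(message: bytes, initial_value: bytes = bytes(OUTPUT_LEN)) -> bytes:
    """Feed each intermediate output back as the parameter for the next section."""
    intermediate = initial_value           # initial value 3004
    for section in split_sections(message):
        intermediate = mix(intermediate, section)
    return intermediate                    # final intermediate output = Hash 3006


hash_value = iterated_hash(b"I will purchase a car (catalog No. 1443)")
```

Only the loop structure is the point here: each section is processed with the previous intermediate output as its parameter.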
Here, in the paper 1, an encryption function (block encryption) such as DES, the encryption standard of the USA, is used for the character-substitution/transposition repeating processing 3005. Such a hash function is called a "hash function using block encryption", and it has been standardized in ISO (International Organization for Standardization).
The "hash function using block encryption" will be described below.
As shown at the right side of FIG. 27, the first section P.sub.1 3002 is input to the encryption function 3009 with a parameter which is obtained by converting the initial value 3004 with a conversion function 3008. Exclusive OR 3010 is conducted between the encryption result based on the encryption function 3009 and the first section P.sub.1 3002 bit by bit, thereby calculating the first intermediate output based on the character-substitution/transposition repeating processing 3005.
Subsequently, the first intermediate output is fed back and then converted with the conversion function 3008. Thereafter, by using the first intermediate output thus converted as a parameter, the second section P.sub.2 3003 is input to the encryption function 3009. The exclusive OR 3010 is conducted between the encryption result based on the encryption function 3009 and the second section P.sub.2 3003 bit by bit, thereby calculating the second intermediate output based on the character-substitution/transposition repeating processing 3005.
The above processing is repeated until the data of the final section is input, and the finally-calculated intermediate output is used as the hash value Hash 3006.
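The construction at the right side of FIG. 27 can be sketched as follows. Because DES is not available in the Python standard library, a toy 64-bit Feistel cipher stands in for the encryption function 3009, and the conversion function 3008 is assumed to be the identity. Both are assumptions for illustration, and the names `toy_encrypt`, `convert` and `block_cipher_hash` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def round_function(half: int, key: int, round_no: int) -> int:
    """Toy round function (not DES): mix one 32-bit half with part of the key."""
    subkey = (key >> (8 * (round_no % 5))) & MASK32
    x = (half + subkey + round_no) & MASK32
    x = (x * 2654435761) & MASK32
    return x ^ (x >> 15)


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy 64-bit Feistel cipher standing in for encryption function 3009."""
    k = int.from_bytes(key, "big")
    left = int.from_bytes(block[:4], "big")
    right = int.from_bytes(block[4:], "big")
    for round_no in range(8):
        left, right = right, left ^ round_function(right, k, round_no)
    return left.to_bytes(4, "big") + right.to_bytes(4, "big")


def convert(value: bytes) -> bytes:
    """Stand-in for conversion function 3008 (assumed to be the identity)."""
    return value


def block_cipher_hash(sections, initial_value: bytes = bytes(8)) -> bytes:
    """Encrypt each 64-bit section under the converted previous output,
    then XOR the encryption result with the section itself (3010)."""
    intermediate = initial_value
    for section in sections:
        encrypted = toy_encrypt(convert(intermediate), section)
        intermediate = bytes(a ^ b for a, b in zip(encrypted, section))
    return intermediate


hash_value = block_cipher_hash([b"ABCDEFGH", b"12345678"])
```

Each section and each intermediate output is 8 bytes long, so the resulting hash value is 64 bits.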
When DES or the like is used for the encryption function 3009 in the "hash function using block encryption" shown at the right side of FIG. 27, the length of each section of the first section P.sub.1 3002, the second section P.sub.2 3003, . . . , and the length of the output of the character-substitution/transposition repeating processing 3005 are respectively equal to 64 bits, and thus the length of the hash value Hash 3006 is equal to 64 bits.
The feature of the "hash function using block encryption" resides in that the length of each section P.sub.1 3002, P.sub.2 3003, . . . of the message is equal to the length of the output of the character-substitution/transposition repeating processing 3005.
A hash function which does not use any encryption function such as DES in the character-substitution/transposition repeating processing 3005 has also been proposed. Such a hash function is called a "special-purpose hash function", and known examples include MD5, which is an internet standard, and SHA-1 and RIPEMD-160, which are being standardized in ISO.
Of these special-purpose hash functions, MD5 is disclosed in the following paper 2:
R. Rivest, "The MD5 Message-Digest Algorithm," IETF RFC 1321 (1992)
The processing flow of MD5 itself is the same as shown at the left side of FIG. 27, and it will be described with reference to the left side of FIG. 27.
First, a message 3001 to be compressed is divided into a first section P.sub.1 3002, a second section P.sub.2 3003, . . . every 512 bits, and these sections are successively input to the hash function 3007.
The hash function 3007 subjects the first section P.sub.1 3002 to simple character-substitution/transposition repeating processing 3005 by using an initial value 3004 of 128 bits as a parameter, thereby calculating a first intermediate output of 128 bits.
Subsequently, by using the first intermediate output as a parameter (in place of the initial value 3004), the hash function 3007 subjects the second section P.sub.2 3003 to the simple character-substitution/transposition repeating processing 3005, thereby calculating a second intermediate output of 128 bits.
The above processing is repeated until the data of the final section is input, and the finally-calculated 128-bit intermediate output is used as a hash value Hash 3006.
The feature of the "special-purpose hash function" resides in that the length of the output of the character-substitution/transposition repeating processing 3005 is shorter than the length of each section P.sub.1 3002, P.sub.2 3003, . . . of the message.
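The section length and output length of MD5 can be read from Python's standard `hashlib` module, as in the short sketch below. It assumes an environment in which MD5 is enabled (some restricted builds disable it), and the variable names are illustrative.

```python
import hashlib

md5 = hashlib.md5()
section_bits = md5.block_size * 8    # length of each section P_i
output_bits = md5.digest_size * 8    # length of each intermediate output

# "abc" is one of the test inputs listed in RFC 1321.
hash_value = hashlib.md5(b"abc").hexdigest()
```

Here `section_bits` is 512 and `output_bits` is 128, so each step takes in a section four times as long as the output it produces.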
The above prior arts have the following problems.
(1) Problem of Hash function which has been hitherto proposed
1. Problem of "hash function using block encryption"
As described above, the "hash function using block encryption" uses an encryption function (block encryption) such as DES. In the block encryption, the data length of each of the input data and the output data is set to 64 bits. Therefore, the length of the hash value is equal to 64 bits. Further, in order to guarantee that "substantially only one hash value exists in the world" for one input data (message), it is believed that the length of the hash value must be set to about 128 bits or more as described above.
Accordingly, when a hash value of 128 bits is obtained in the "hash function using block encryption", it is necessary to perform the block encryption processing on each input data (64 bits) to the block encryption twice while varying the initial value or the like. That is, it is necessary to calculate the output (64 bits) twice for each input data (64 bits) to the block encryption. This reduces the processing speed of generating hash values.
2. Problem of "special-purpose hash function"
According to the "special-purpose hash function", unlike the "hash function using block encryption", a hash value of 128 bits can be obtained without performing the character-substitution/transposition repeating processing twice for each data into which the message is divided.
However, in the "special-purpose hash function", each data into which the message is divided is subjected to the simple character-substitution/transposition repeating processing to obtain hash values as described above. Here, the length of the output value of the character-substitution/transposition repeating processing (128 bits in the above case) is shorter than the length of the input value (512 bits in the above case). That is, the compression is performed in the character-substitution/transposition repeating processing.
Therefore, in the case where the message is divided into sections of 512 bits each, consider two messages which differ only in the data of their final sections. In the process of compressing the data (512 bits) of the final section to the output of 128 bits through the character-substitution/transposition repeating processing, the outputs (i.e., hash values) of the two messages are coincident with each other with high probability. This deteriorates the collision free property.
3. The problems of 1, 2 also occur not only in the case where the hash function is applied to the digital signature, but also in other cases. For example, the same problems occur in a case where the hash function is applied to a data encryption system.
(2) Problem of public key cipher system
1. A lot of processing time is needed when long data are encrypted by using the public key cipher.
2. In the case where the public key cipher system is applied to the data encryption for electronic mail, etc., when the same electronic mail is transmitted to plural destinations with encryption, a transmitter must carry out the encryption processing on the electronic mail for every destination by using the public keys which are distributed from the plural destinations in advance. That is, the encryption processing of the electronic mail must be repeated a number of times equal to the number of destinations.
On the other hand, when a recipient loses a secret key, for example by erroneously erasing the secret key from a file, the recipient cannot decrypt an encrypted electronic mail which was encrypted with the public key distributed to the sender by the recipient.