The present invention relates to a technique for assuring security in computer network applications such as a software library and a transaction system between companies.
Hitherto, for example, when an electronic signature (digital signature) for a long message is generated by using only a public key cipher, the generation takes a long time. Consequently, a method is used in which the message is first compressed into short data and the electronic signature is generated for the compressed data. Unlike ordinary data compression, this compression does not need to allow the original message to be reconstructed from the compressed data; instead, the data must be compressed so as to have certain cryptographic characteristics. A hash function is used for this purpose.
A message of a business transaction document or the like, for example, a document A "Mar. 10, 1996, To Susaki Company, I will purchase a car (catalog No. 1443) at 1,040,000 yen. Yoshiura" is input data to the hash function. The input data can have any length. The hash function applies a process similar to encipherment to the input data, thereby compressing it into short data having a predetermined length. For example, a hash value 283AC9081E83D5B28977 is an output of the hash function. The hash value is also called a message digest or a fingerprint. Ideally, the hash value of a message is substantially unique in the world. To assure this, it is said that the hash value must be at least about 128 bits long. Specifically, the hash function has to have the following characteristics.
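As an illustration of the fixed-length property described above, the following minimal sketch hashes document A and a much longer message. It assumes SHA-1 from Python's standard hashlib module as a stand-in hash function; the helper name digest_hex is hypothetical, and the sketch does not reproduce the example hash value quoted in the text.

```python
import hashlib

def digest_hex(message: str) -> str:
    """Return the SHA-1 digest of a text message as uppercase hexadecimal."""
    return hashlib.sha1(message.encode("utf-8")).hexdigest().upper()

document_a = ("Mar. 10, 1996, To Susaki Company, I will purchase a car "
              "(catalog No. 1443) at 1,040,000 yen. Yoshiura")

short_digest = digest_hex(document_a)          # short input
long_digest = digest_hex(document_a * 1000)    # input 1,000 times longer
altered_digest = digest_hex(document_a.replace("1,040,000", "1,050,000"))

# Each hex digit carries 4 bits, so both digests are 160 bits long.
print(len(short_digest) * 4, len(long_digest) * 4)
# A small change in the message yields a different digest.
print(short_digest != altered_digest)
```

The digest length depends only on the hash function chosen, not on the length of the message.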
(1) One-way property: Given the output value M' of the hash function for a message M, it should be computationally difficult to obtain another message M(X) that has the same output value M'.
e.g. Assume that the birthday of Kazuo is February 22nd. To find another person whose birthday is the same as Kazuo's, it is sufficient to check the birthdays of about 365/2 ≈ 183 people on the average. When the people are replaced by messages and the birthdays by hash values, a similar calculation applies. That is, when the length of the hash value is 160 bits, the total number of hash values is 2^160. When 2^160/2 = 2^159 messages are checked on the average, another message having the same hash value as that of a certain message can be found. Checking this many messages is computationally infeasible, so it is difficult to find another message having the same hash value.
(2) Collision-free property: It should be computationally difficult to find any two different messages M and M(X) that have the same hash value.
e.g. To find any two persons whose birthdays are the same, it is sufficient to check the birthdays of about 24 persons on the average. Similarly, when the length of the hash value is 160 bits and two different messages having the same hash value are searched for, it is sufficient to check about 2^(160/2) = 2^80 messages on the average. Although this number is much smaller than that in the case of the one-way property, the search is still computationally difficult.
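The birthday arithmetic in the two examples above can be checked with a short sketch. It assumes the usual approximation sqrt(pi*n/2) for the expected number of samples before two of them coincide among n equally likely values, and it uses SHA-1 truncated to a small number of bits as a stand-in for a short hash value; the function names are hypothetical.

```python
import hashlib
import math

def expected_samples_until_collision(n: int) -> float:
    """Approximate expected number of samples before two coincide (n possible values)."""
    return math.sqrt(math.pi * n / 2)

def truncated_hash(message: bytes, bits: int) -> int:
    """Return the leading `bits` bits of the 160-bit SHA-1 digest as an integer."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big") >> (160 - bits)

def find_collision(bits: int):
    """Search for two different messages whose truncated hash values are equal."""
    seen = {}
    counter = 0
    while True:
        message = str(counter).encode("ascii")
        value = truncated_hash(message, bits)
        if value in seen:
            return seen[value], message, counter + 1
        seen[value] = message
        counter += 1

# About 24 birthdays are needed on the average for 365 possible days.
print(round(expected_samples_until_collision(365)))

# For a 24-bit hash value, a collision appears after roughly 2^12 messages,
# far fewer than the 2^24 possible hash values.
first, second, tries = find_collision(24)
print(first != second, tries)
```

The same square-root behaviour is why a 160-bit hash value offers collision resistance on the order of 2^80 operations rather than 2^159.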
Various methods of realizing the hash function have been disclosed. Methods that repeat substitution and transposition are the mainstream because their processing speed is far higher than that of methods using a public key cipher. A conventional technique showing such a method is disclosed in the following literature.
ISO/IEC 10118-2, "Information technology -- Security techniques -- Hash-functions -- Part 2: Hash-functions using an n-bit block cipher algorithm" (1994).
In the conventional technique, as shown in FIG. 21, a message 2501 which is desired to be compressed is first divided into a first division M_1 2502, a second division M_2 2503, . . . each having a predetermined length, and the resultant data is input to a hash function 2507. In the hash function 2507, a repetitive processing 2505 of substitution and transposition is performed on the first division M_1 2502 by using an initial value 2504 as a parameter, thereby obtaining a first intermediate output. Subsequently, by performing the repetitive processing 2505 of substitution and transposition on the second division M_2 2503 by using the first intermediate output as a parameter, a second intermediate output is obtained. Such processes are repeated and the final intermediate output is used as the hash value H 2506 to be obtained.
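The chained structure of FIG. 21 described above can be sketched as follows. This is only an illustration under stated assumptions: the division length of 64 bits, the zero padding of the last division, and the function round_function (a truncated SHA-256 used as a stand-in for the repetitive processing 2505) are all hypothetical choices and are not taken from the cited literature.

```python
import hashlib

BLOCK_BYTES = 8   # assumed length of each division M_1, M_2, ... (64 bits)
STATE_BYTES = 8   # assumed length of the initial value and intermediate outputs

def split_message(message: bytes) -> list:
    """Divide the message into fixed-length divisions, zero-padding the last one.

    Zero padding is an assumption for illustration; messages that differ only
    in trailing zero bytes would collide under it.
    """
    padded = message + b"\x00" * (-len(message) % BLOCK_BYTES)
    return [padded[i:i + BLOCK_BYTES] for i in range(0, len(padded), BLOCK_BYTES)]

def round_function(parameter: bytes, division: bytes) -> bytes:
    """Stand-in for the repetitive processing 2505 (NOT the actual processing)."""
    return hashlib.sha256(parameter + division).digest()[:STATE_BYTES]

def iterated_hash(message: bytes, initial_value: bytes = bytes(STATE_BYTES)) -> bytes:
    """Feed each division through the round function, chaining intermediate outputs."""
    state = initial_value                    # initial value 2504
    for division in split_message(message):
        state = round_function(state, division)
    return state                             # final intermediate output = hash value H

print(iterated_hash(b"Mar. 10, 1996, To Susaki Company").hex())
```

Each intermediate output depends on all preceding divisions, so a change in any division propagates to the final hash value.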
In the repetitive processing 2505 of substitution and transposition, an enciphering function such as DES (Data Encryption Standard) is used. Such a function is called "a hash function using a block cipher", is standardized by ISO, and is disclosed in the above-mentioned literature of the conventional technique. The details of the method are as follows.
An enciphering function 2509 is applied to the first division M_1 2502 by using, as the key, data obtained by transforming the initial value 2504 with a transforming function 2508, thereby enciphering the first division M_1 2502. The bitwise exclusive OR 2510 between the result of encipherment and the first division M_1 2502 is obtained and is used as an intermediate output of the repetitive processing 2505 of substitution and transposition. The intermediate output is fed back to the repetitive processing 2505 of substitution and transposition, and the enciphering function 2509 is applied to the second division M_2 2503 by using data obtained by transforming the fed-back intermediate output with the transforming function 2508, thereby performing the enciphering process. The bitwise exclusive OR 2510 between the enciphered data and the second division M_2 2503 is obtained and is used as the next intermediate output of the repetitive processing 2505 of substitution and transposition. Such processes are repeated and the final intermediate output is used as the hash value H 2506 to be derived.
When DES is used as the enciphering function 2509, the length of each of the first division M_1 2502, the second division M_2 2503, . . . is 64 bits, the length of the output of the repetitive processing 2505 of substitution and transposition is 64 bits, and the length of the hash value H 2506 is also 64 bits. The feature of the "hash function using the block cipher" is that the length of each of the divisions M_1 2502, M_2 2503, . . . is equal to the length of the output of the repetitive processing 2505 of substitution and transposition.
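The "hash function using a block cipher" described above computes each intermediate output as the exclusive OR of the enciphered division and the division itself. The sketch below illustrates that structure with 64-bit divisions. Because Python's standard library contains no DES, it assumes a toy 4-round Feistel cipher (toy_encrypt) as a stand-in for the enciphering function 2509; the identity function transform standing in for the transforming function 2508 and the zero padding are likewise assumptions. The toy cipher is not secure.

```python
import hashlib

BLOCK = 8  # 64-bit divisions, matching the block length of DES

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise exclusive OR 2510 of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel cipher on 64-bit blocks (stand-in for DES, NOT secure)."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:4]
        left, right = right, xor_bytes(left, f)
    return left + right

def toy_decrypt(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt, included only to show the stand-in is a real cipher."""
    left, right = block[:4], block[4:]
    for rnd in reversed(range(4)):
        f = hashlib.sha256(key + bytes([rnd]) + left).digest()[:4]
        left, right = xor_bytes(right, f), left
    return left + right

def transform(state: bytes) -> bytes:
    """Placeholder for the transforming function 2508 (identity is an assumption)."""
    return state

def compress(state: bytes, division: bytes) -> bytes:
    """One step: encipher the division under the transformed state, then XOR with it."""
    return xor_bytes(toy_encrypt(transform(state), division), division)

def block_cipher_hash(message: bytes, initial_value: bytes = bytes(BLOCK)) -> bytes:
    """Chain the compress step over all 64-bit divisions of the zero-padded message."""
    padded = message + b"\x00" * (-len(message) % BLOCK)
    state = initial_value
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state

print(block_cipher_hash(b"To Susaki Company, I will purchase a car").hex())
```

As in the text, the division length, the intermediate output length, and the hash value length are all equal to the 64-bit block length of the cipher.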
Methods in which the repetitive processing 2505 of substitution and transposition does not use an enciphering function 2509 such as DES are called "dedicated hash functions". Examples include MD5, which is an Internet standard, and SHA-1, RIPEMD-160, etc., which are being standardized by ISO.
Among them, MD5 is disclosed in "The MD5 Message Digest Algorithm", by R. Rivest, IETF RFC 1321 (1992).
In MD5, the message 2501 is divided into parts each having a length of 512 bits, thereby obtaining the first division M_1 2502 of 512 bits, the second division M_2 2503 of 512 bits, . . . . The resultant data is input to the hash function 2507. In the hash function 2507, the repetitive processing 2505 of substitution and transposition is performed on the first division M_1 2502 of 512 bits by using the initial value 2504 of 128 bits as a parameter, thereby obtaining an intermediate output of 128 bits. Subsequently, the repetitive processing 2505 of substitution and transposition is performed on the second division M_2 2503 of 512 bits by using the derived intermediate output as a parameter, thereby obtaining the next intermediate output of 128 bits. Such processes are repeated and the final intermediate output of 128 bits is used as the hash value H 2506.
The feature of the "dedicated hash function" is that the length of the output of the repetitive processing 2505 of substitution and transposition is shorter than the length of each of the divisions M_1 2502, M_2 2503, . . . of the message.
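The MD5 lengths mentioned above (512-bit divisions, 128-bit output) can be confirmed with the MD5 implementation in Python's standard hashlib module. This is a minimal sketch; the variable names are hypothetical, and the digest for the message "abc" is the one listed in the test suite of RFC 1321.

```python
import hashlib

md5 = hashlib.md5()
division_bits = md5.block_size * 8    # length of each division of the message
output_bits = md5.digest_size * 8     # length of each intermediate output and of H

print(division_bits, output_bits)     # 512 128

# The hash value is 128 bits regardless of the message length.
abc_digest = hashlib.md5(b"abc").hexdigest()
long_digest = hashlib.md5(b"abc" * 100000).hexdigest()
print(abc_digest)                     # 900150983cd24fb0d6963f7d28e17f72
print(len(long_digest) * 4)           # 128
```

The output of each step is one quarter the length of the division it consumes, which is the characteristic of the "dedicated hash function" noted in the text.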
The processing speed of MD5 is high, 1,000 times as fast as that of a public key cipher. For example, software on a personal computer with a 90 MHz Pentium processor compresses data of about 100,000 bits in about 1 millisecond (1/1000 second). Consequently, an electronic signature can be generated at high speed even for a relatively long text or a figure.
The above known technique has, however, the following drawbacks.
(1) Generation of the hash value by the "hash function using the block cipher" is slow.
The input and the output of an enciphering function (block cipher) such as DES that can be used in the hash function are each only 64 bits long. The hash value obtained is consequently 64 bits, which is not long enough to realize the collision-free property. A hash value of 128 bits is therefore generated by executing the calculation of the hash function twice while changing the initial value and the like, but this slows the processing.
(2) It is feared that the hash value generated by the "dedicated hash function" is insufficient from the viewpoint of security.
As mentioned above, in the "dedicated hash function", the simple repetitive processing 2505 of substitution and transposition is performed on the data obtained by dividing the message. In this case, the length (for example, 128 bits) of the output value is shorter than the length (for example, 512 bits) of the input value. Consequently, an input collision, in which the same output is obtained from two different inputs, can be caused relatively easily by exploiting the relation between the message division data input of 512 bits and the compressed data output of 128 bits. That is, the hash function can be broken relatively easily.