1. Field of the Invention
The present invention relates to authenticating the source and integrity of transmitted or stored information. In particular, the present invention is a method and device for generating approximate message authentication codes (AMAC). An AMAC provides absolute authentication of the source or origin of a received message and permits verifying approximate integrity between the original message and the received message.
2. Discussion of Related Art
It is often desirable to ensure that the source or origin of a message or other communication is who it is represented as being and that the received message is the same as the original message. One well-known way to provide this type of authentication is a Message Authentication Code (MAC). A MAC is generated for an original message M and is sent with the message. This allows the recipient to verify that the received message M′ was actually sent by the purported sender and that it has not been altered from the originally transmitted message M. To generate the MAC, the sender applies a one-way hash function (described below) to a secret key (also described below) and the message M. The result is a MAC. The recipient receives the message M′ and the MAC. If the recipient has the secret key, she can apply the same hash function to the key and the received message M′ and compare the result with the received MAC. If the two MAC values are the same, the messages are identical. Because only a holder of the secret key could have computed the correct MAC, the message originated from the purported sender. Because the MAC values are the same, the recipient has also verified that the received message M′ has not been altered from the original message M.
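The generate-and-verify exchange described above can be sketched as follows. This is a minimal illustration only: it assumes HMAC with SHA-256 from the Python standard library as the keyed hash, which is one common construction and is not named in this description. The key, the messages, and the function names `make_mac` and `verify_mac` are hypothetical.

```python
import hashlib
import hmac


def make_mac(key: bytes, message: bytes) -> bytes:
    """Sender side: compute a MAC over the message under the shared secret key."""
    # Assumption: HMAC-SHA-256 stands in for "a one-way hash applied to key and message".
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_mac(key: bytes, received: bytes, mac: bytes) -> bool:
    """Recipient side: recompute the MAC on the received message and compare."""
    # compare_digest performs a constant-time comparison of the two MAC values.
    return hmac.compare_digest(make_mac(key, received), mac)


key = b"shared-secret-key"           # hypothetical secret known to both parties
M = b"pay 100 to account 42"         # original message M
tag = make_mac(key, M)               # MAC sent along with M

ok_same = verify_mac(key, M, tag)                            # M' identical to M
ok_altered = verify_mac(key, b"pay 900 to account 42", tag)  # M' altered in transit
ok_wrong_key = verify_mac(b"other-key", M, tag)              # verifier lacks the right key
```

In this sketch only the unaltered message checked under the correct key verifies; an altered message or a different key causes the comparison to fail.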
A hash function is a function which takes an input string of any length (often called a pre-image) and computes a fixed-length output string (often called a hash value). In the example above, the pre-image is the original message M. A one-way hash function is a hash function that is also a one-way function, that is, a function that is easy to compute but hard to invert on an overwhelming fraction of its range. In a good one-way hash function, given a hash value, it is computationally infeasible to determine any pre-image that hashes to that value. Another type of hash function is a collision resistant hash function. The important feature of a collision resistant hash function is that it is computationally intractable to generate two pre-images which hash to the same hash value. In a typical collision-free, one-way hash function, a change of one bit in the pre-image gives each bit of the hash value about a 50% chance of changing. Therefore, even a single bit difference results in an entirely different hash value.
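The effect of a one-bit change on the hash value can be observed with a short experiment. The sketch below assumes SHA-256 as a representative hash function (it is not named in this description) and uses a hypothetical helper, `bit_difference`, to count differing bits.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


m1 = bytearray(b"The quick brown fox")   # example pre-image
m2 = bytearray(m1)
m2[0] ^= 0x01                            # flip a single bit of the pre-image

h1 = hashlib.sha256(bytes(m1)).digest()  # 256-bit hash value of the original
h2 = hashlib.sha256(bytes(m2)).digest()  # 256-bit hash value after the one-bit change

changed = bit_difference(h1, h2)         # number of hash bits that changed
fraction = changed / (len(h1) * 8)       # expected to be near 0.5

print(f"{changed} of {len(h1) * 8} hash bits changed ({fraction:.0%})")
```

For a hash function with this property, the printed fraction should be close to one half, so the two hash values bear no visible relationship to each other.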
A secret key is typically a large number that is known only to certain users, thus the term “secret.” “Secret key” as used here refers to a secret key in a MAC or symmetric encryption algorithm (symmetric cryptosystem). In a typical symmetric cryptosystem, the users, for example the sender and the recipient, agree on a cryptosystem and agree on the secret key. In the case of a MAC, the sender uses the same secret key to generate the MAC as the recipient uses to verify the MAC.
FIG. 1 is a block diagram of a typical cryptography device 100, such as may be used in a symmetric cryptosystem or MAC. The device 100 has one or more processors 102 including one or more CPUs, a main memory 104, a disk memory 106, an input/output (I/O) device 108, and a network interface 110. The devices 102–110 are connected to a bus 120 which transfers data, i.e., instructions and information, between each of these devices 102–110. The processor 102 may use instructions in the memories 104, 106 to perform functions on data, which data may be found in the memories 104, 106 and/or received via the I/O 108 or the network interface 110.
For example, a plain text message M may be input via the I/O 108 or received via the network interface 110. The plain text message may then be hashed using the processor 102 and a key stored in some memory (such as main memory 104 or disk memory 106). The result of this hash (i.e., the MAC) may be transmitted (along with the plain text message M) to another party via the network interface 110 connected to a local area network (LAN) or wide area network (WAN). Similarly, a MAC may be received via the network interface 110 and verified using the processor 102 and a key stored in some memory (such as main memory 104 or disk memory 106) and perhaps software stored in the main memory 104 or the disk memory 106.
FIG. 2 illustrates a network 200 over which cryptography devices 100 may communicate. Two or more cryptography devices 100, 100′ may be connected to a communications network 202, such as a WAN which may be the Internet, a telephone network, or leased lines; or a LAN, such as an Ethernet network or a token ring network. Each cryptography device 100 may include a modem, network interface card, or other network communication device 204 to send encrypted messages and/or message authentication codes over the communications network 202. A cryptography device 100 may be a gateway to a sub-network 206. That is, the device 100 may be an interface between a wide area network 202 and a local area (sub) network 206 (or it may be an interface to a storage device, e.g., a disk controller).
In certain situations, even the slightest change in the message is unacceptable, such as in electronic payments or precise target coordinates. In such applications, the strict detection of even a one-bit change can be critical. In some applications, however, such as voice or imagery, this strict requirement is not needed and is not desirable, for the reasons discussed below. The message may be slightly altered after the sender generates the MAC. This may happen, for example, if the message is a still image (i.e., a picture) and, after the hash value is generated, “hidden text” is added to the image. Hidden text may be a “digital watermark” or “fingerprint” added to an image to identify the origin of the image. A content provider may include hidden data in an image it posts on the Internet. The hidden data may be used as evidence of ownership and copying if another party misappropriates the image. Although the hidden data involves no illicit “tampering” that should cause the recipient to reject the image, some of the information has been changed. This change causes the hash value of the received message M′ (which contains the hidden data) to be entirely different from the hash value of the original message M (which does not contain the hidden data). This leads the recipient to conclude that the received message M′ has been forged or altered and is unreliable. The same problem arises for images and voice if noise is introduced into the message during transmission.
“Lossy compression” is another application where information may be lost or altered in a way that should be acceptable. For example, a still image (a picture) may be compressed using a lossy compression technique such as JPEG after the MAC is generated. JPEG is a data compression technique that eliminates redundant information from a still image. As a result, some of the information in the original image (message M) may be lost when it is compressed and later decompressed. Nevertheless, the changes in the received message (decompressed image M′) are not illicit tampering, nor is the image a forgery. Therefore, this compressed-decompressed image M′ should have sufficient integrity to be accepted. However, because there has been some change in the data, a MAC using a hash function will show that the integrity of the image has been compromised, and the image will be rejected.
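The rejection of a slightly altered message by a conventional MAC can be illustrated as follows. This is a sketch under stated assumptions: HMAC with SHA-256 stands in for the conventional MAC, the "image" is synthetic stand-in pixel data, and the alteration (flipping the least significant bit of every 512th byte) is a hypothetical stand-in for hidden data or noise rather than an actual watermarking or compression scheme.

```python
import hashlib
import hmac

key = b"shared-secret-key"        # hypothetical shared secret
image = bytes(range(256)) * 16    # 4096 bytes of stand-in pixel data (message M)

# Sender computes a conventional MAC over the original image.
tag = hmac.new(key, image, hashlib.sha256).digest()

# After the MAC is generated, a few least significant bits are changed,
# loosely imitating hidden data or transmission noise (message M').
altered = bytearray(image)
for i in range(0, len(altered), 512):
    altered[i] ^= 0x01
altered = bytes(altered)

# Measure how little of the message actually changed.
changed_bits = sum(bin(x ^ y).count("1") for x, y in zip(image, altered))
fraction_changed = changed_bits / (len(image) * 8)

# Recipient recomputes the MAC over M' and compares it with the received MAC.
recomputed = hmac.new(key, altered, hashlib.sha256).digest()
accepted = hmac.compare_digest(recomputed, tag)

print(f"{changed_bits} bits changed ({fraction_changed:.4%}), accepted={accepted}")
```

Here only 8 of 32,768 bits differ between M and M′, yet the conventional MAC comparison fails and the message is rejected outright.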
There is a need to provide a message authentication code that permits absolute authentication (i.e., the sender is the party identified as the purported sender) and approximate integrity (i.e., the message has undergone no more than some acceptable amount of modification). For example, the recipient should be able to determine that the differences between the original message M and the received message M′ are only slight. This permits some integrity loss due to hidden data, noise, some instances of lossy compression, or other change, but prevents outright forgeries, substantial changes in content, or “cut-and-paste” attacks.
Therefore, it is an object of the present invention to provide a method and device for generating an approximate message authentication code which provides absolute authentication and approximate integrity.
It is another object of the present invention to provide a method and device which permits a recipient to accept certain messages as authentic and sufficiently unaltered, even if there is a slight change in the message.