A serious risk associated with the exchange of electronic information on open and unsecured networks, particularly on the Internet, concerns the modification of data during transfer. As a consequence, it is important to authenticate files received over a network to verify that they have neither been corrupted nor altered, and that they have not been sent by an impostor.
For example, when a user receives a file attached to an e-mail, such authentication must be performed when the user clicks on the file attachment icon. The attached files may include computer programs, text documents, graphics, pictures, audio, video, or other information that is suitable for use within a computer system. Likewise, if a document includes a link to an executable file or a software program, the user may wish to ensure that the received file has been sent by a trustworthy party prior to exposing his or her computer system to a program file that might include a “Trojan Horse” or a virus. As a result, the demand for secured transfer increases.
To improve data transmission security over computer networks and prevent digital forgery, a digital signature is commonly used to provide document and signer authentication, i.e. to verify the source of a received file, and to verify document integrity. Digital signatures are based upon cryptographic algorithms wherein security is provided through one or more keys independently of the algorithm, which may be freely published or analyzed. Two general types of key-based authentication algorithms for authenticating digital documents are well known in the art: symmetric and public-key.
In a symmetric algorithm, the encryption key and the decryption key are the same, and must be kept secret by both parties, the sender and the receiver. The standard solution is to add a Message Authentication Code (MAC) to the transmitted documents. The MAC is computed with a one-way hash function over the document and depends on the secret key known by the sender and the receiver. The MAC allows the receiver to check that the received document has been sent by someone who shares the same secret key and that the document has not been altered.
For example, the Secure Hash Algorithm (SHA) specified by the National Institute of Standards and Technology (NIST), FIPS PUB 180-1, “Secure Hash Standard”, US Department of Commerce, May 1993, produces a 160-bit hash value. It may be combined with a key, e.g. through the use of a mechanism referred to as Keyed-Hashing for Message Authentication (HMAC), which is the subject of Request for Comments (RFC) 2104 of the Internet Engineering Task Force (IETF). HMAC is devised so that it can be used with any iterative cryptographic hash function, including SHA. Therefore, a MAC can be appended to the transmitted document so that the whole document can be checked by the receiver.
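By way of illustration only, appending and checking such a MAC may be sketched as follows using the HMAC and SHA-1 implementations of the Python standard library. The function names `append_mac` and `check_mac` and the layout (document followed directly by the 20-byte MAC) are assumptions made for this sketch and are not taken from any cited standard.

```python
import hashlib
import hmac

# SHA-1 yields a 160-bit (20-byte) hash value, so the appended MAC is 20 bytes long.
MAC_LEN = hashlib.sha1().digest_size


def append_mac(key: bytes, document: bytes) -> bytes:
    """Sender side: return the document followed by its HMAC-SHA1 value."""
    tag = hmac.new(key, document, hashlib.sha1).digest()
    return document + tag


def check_mac(key: bytes, message: bytes) -> bytes:
    """Receiver side: return the document if the MAC is valid, else raise ValueError."""
    if len(message) < MAC_LEN:
        raise ValueError("message too short to carry a MAC")
    document, tag = message[:-MAC_LEN], message[-MAC_LEN:]
    expected = hmac.new(key, document, hashlib.sha1).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC check failed")
    return document
```

A receiver holding the same secret key recovers the document with `check_mac(key, append_mac(key, document))`; any change to the document or to the appended MAC causes the check to fail.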
Public key algorithms, also known as asymmetric algorithms, use two different keys. One key is used for signing, and the other for verification. These algorithms are called “public-key” algorithms because the verification key can be made public. In contrast, the signature key needs to be kept secret by its owner, the signer.
Using digital signatures involves two processes, one performed by the signer to generate the signature and the other by the receiver to verify the signature. The signer creates a digital signature for a particular document by using his or her private key, and transmits both the document and the digital signature to the receiver. The verification process checks the digital signature received with the document using the public verification key. Properties of cryptographic digital signatures are such that they prevent extraction of someone's digital signature from one document and reattachment to another. Likewise, any changes in the signed document are detected, since any change will cause the signature verification process to fail. Furthermore, the signing key cannot be calculated from the verification key in a reasonable time.
In practical implementations, public-key algorithms are generally not used to provide signatures for long documents. To save time, signature protocols like the Rivest-Shamir-Adleman algorithm (RSA) or Digital Signature Algorithm (DSA) are often implemented with secure (one-way) hash functions. Basically, instead of signing a complete document, the signer computes a hash value of the document and signs the computed hash value.
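The hash-then-sign principle may be sketched, purely for illustration, with textbook RSA arithmetic. The key below is an assumption of this sketch: it is built from two well-known Mersenne primes, is far too small for real use, and omits the padding that practical RSA signatures require. SHA-256 is substituted for the hash functions named above, and the names `sign` and `verify` are hypothetical.

```python
import hashlib

# Toy key material (assumption for illustration): two known Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public verification exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private signing exponent (Python 3.8+)


def digest_as_int(document: bytes) -> int:
    """Hash the document and map the hash value into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Signer: sign the hash value of the document, not the document itself."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Receiver: recompute the hash and compare it with the value recovered from the signature."""
    return pow(signature, E, N) == digest_as_int(document)
```

Only the short hash value passes through the comparatively slow public-key operation, whatever the length of the document; a signature produced for one document does not verify against a different document.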
Several signature algorithms are in use today. One popular signature algorithm is a combination of a hashing algorithm and an RSA encryption algorithm, e.g. Message-Digest-5 (MD5) with RSA, and SHA with RSA. Another popular signature algorithm is DSA, which may be used for limited purposes as a signature algorithm by private parties. Applied Cryptography, Second Edition, 1996, by Bruce Schneier, which is available from John Wiley & Sons, Inc., New York City, N.Y., presents a detailed description of signature and hashing algorithms and related encryption operations.
Once the digital signature of a file has been computed, it must be associated with the signed file. Digital signatures authenticating a file can be appended to the file they authenticate, e.g. as part of a file wrapper structure, embedded within the file or transmitted as separate files. Each of these methods has advantages and drawbacks.
Wrapping a file with delimiters and appending the digital signature at the end of the file is convenient, since both the signature and content travel together. Algorithms to sign and check signatures are simple and efficient. However, the wrapper and signature will typically need to be removed before the file can be used. Thus, signature validation only occurs when the document is retrieved. If the document is later passed on or moved, it may be difficult to check again. Furthermore, the method is not compatible with standard file formats such as image, video, audio or executable files that cannot be recognized prior to authentication.
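A wrapper structure of this kind may be sketched as follows. The delimiter strings, the function names `wrap` and `unwrap`, and the use of a keyed hash (HMAC-SHA256) in place of a public-key signature are all assumptions chosen to keep the sketch self-contained.

```python
import hashlib
import hmac

# Hypothetical delimiters; real wrapper formats define their own.
BEGIN = b"-----BEGIN SIGNED CONTENT-----\n"
END = b"\n-----END SIGNED CONTENT-----\n"


def wrap(key: bytes, content: bytes) -> bytes:
    """Enclose the content in delimiters and append its check value after them."""
    tag = hmac.new(key, content, hashlib.sha256).hexdigest().encode("ascii")
    return BEGIN + content + END + tag + b"\n"


def unwrap(key: bytes, wrapped: bytes) -> bytes:
    """Check the appended value and return the bare content, or raise ValueError."""
    if not wrapped.startswith(BEGIN):
        raise ValueError("missing opening delimiter")
    # The hex check value cannot contain the delimiter, so the last match is the real one.
    body_end = wrapped.rfind(END)
    if body_end < len(BEGIN):
        raise ValueError("missing closing delimiter")
    content = wrapped[len(BEGIN):body_end]
    tag = wrapped[body_end + len(END):].strip()
    expected = hmac.new(key, content, hashlib.sha256).hexdigest().encode("ascii")
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature check failed")
    return content
```

The wrapped bytes no longer begin with the original file header, so an image viewer or program loader would not recognize them until `unwrap` has stripped the wrapper, which illustrates the compatibility drawback noted above.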
Embedding digital signatures into files has received considerable attention to protect copyrights attached to digital multimedia materials that can be easily copied and distributed through the Internet and through networks in general. A review of data embedding and data hiding techniques is given in “Techniques for data hiding” by W. Bender, et al., IBM Systems Journal, Vol. 35, Nos. 3&4, 1996. The most common form of high bit-rate encoding on images, as reported by Bender, is the replacement of the least significant luminance bits of image data with the embedded data so that the alteration of the image is imperceptible. This method is used for watermarking or tamper-proofing to detect image alterations. However, a first drawback lies in the lack of standardization of how and where to integrate signatures into the different file formats, particularly on image, video, audio or executable files, and the added complexity of authenticating algorithms. Another important drawback is that merging the checking information and the file content affects the readability and quality of documents, e.g. digital images.
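Least-significant-bit replacement may be sketched as follows, assuming for simplicity that the image is available as a flat sequence of 8-bit luminance samples. The function names `embed_lsb` and `extract_lsb` and the bit ordering (most significant payload bit first) are assumptions of this sketch rather than details taken from the cited review.

```python
def embed_lsb(pixels: bytes, payload: bytes) -> bytes:
    """Replace the least significant bit of successive samples with the payload bits."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload does not fit in the image")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit   # clear the lowest bit, then set it to the payload bit
    return bytes(out)


def extract_lsb(pixels: bytes, payload_len: int) -> bytes:
    """Rebuild payload_len bytes from the least significant bits of the samples."""
    if payload_len * 8 > len(pixels):
        raise ValueError("image too small for the requested payload length")
    out = bytearray()
    for start in range(0, payload_len * 8, 8):
        value = 0
        for sample in pixels[start:start + 8]:
            value = (value << 1) | (sample & 1)
        out.append(value)
    return bytes(out)
```

Each sample changes by at most one luminance level, which is why the alteration is visually imperceptible; the content is nevertheless modified, and the receiver must know where and how the bits were embedded.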
Maintaining signatures and data in separate files, e.g. signature files that may be stored on a server, has the advantage of supporting file authentication at any time in a simple and well understood way. However, the signature can be lost, accidentally removed, or intentionally removed in an attempt to cheat.
A more complex situation arises when authentication concerns a group of files, e.g. a document including attachments or links to other files. To deal with these frequent cases, a standard solution is to aggregate the files and generate a single MAC by applying a cryptographic hashing algorithm to the aggregation. But such a solution has a significant drawback, since the receiver must authenticate all the files that are aggregated, which is time consuming. To remedy this problem, other methods provide a separate signature file or MAC file along with the group of files. This MAC file includes individual check-values for the files, e.g. hash-values, as well as a digital signature or a MAC value for the group of files. Check-values of the signature file are compared with the corresponding values computed from the received files, and the digital signature of the group of files is verified. A classical method for generating a separate signature file for groups of data files is described in U.S. Pat. No. 5,958,051, “Implementing digital signatures for data streams and data archives,” to Renaud, et al. However, the method of using a separate signature file has several drawbacks as described above. Furthermore, if a file linked to the group has been withdrawn or is no longer accessible, none of the files of the group may be authenticated.
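A separate MAC file of the kind just described, holding individual check-values together with a MAC value for the group, may be sketched as follows. The JSON layout, the field names `check_values` and `mac`, the function names, and the choice of SHA-256 and HMAC-SHA256 are assumptions of this sketch and do not reproduce the method of the cited patent.

```python
import hashlib
import hmac
import json


def build_manifest(key: bytes, files: dict) -> bytes:
    """Build a MAC file: one hash value per file plus a MAC over the set of hash values."""
    check_values = {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}
    body = json.dumps(check_values, sort_keys=True).encode("utf-8")
    group_mac = hmac.new(key, body, hashlib.sha256).hexdigest()
    manifest = {"check_values": check_values, "mac": group_mac}
    return json.dumps(manifest, sort_keys=True).encode("utf-8")


def verify_file(key: bytes, manifest: bytes, name: str, data: bytes) -> bool:
    """Check the group MAC, then compare one received file with its listed check-value."""
    parsed = json.loads(manifest)
    check_values = parsed["check_values"]
    body = json.dumps(check_values, sort_keys=True).encode("utf-8")
    expected_mac = hmac.new(key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(parsed["mac"], expected_mac):
        return False
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(check_values.get(name, ""), actual)
```

In this sketch the group MAC protects the list of check-values, and each received file is then compared with its own entry in that list.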
Therefore, there is a need for an efficient method and system for securing and verifying the authenticity and integrity of all types of files so as to remedy the shortcomings discussed above.