With the increasing use of open networked environments such as the Internet, the demand for more secure systems for transferring shared information among networked computers has correspondingly increased. Today, the most serious risk associated with electronic information exchange on open, unsecured networks, particularly the Internet, is that digital data may be modified much more easily than ever before.
Most of today's transactions on the Internet involve the user accessing files on Web servers or mail servers directly from textual documents. On those open, unsecured networks, when a user selects and triggers a hyperlink on a Web page from a Web browser, or when a user clicks on the icon of a file attached to a received e-mail, it is becoming of the utmost importance to authenticate the received data files prior to using them as intended. Such data files may include, but are not limited to, computer programs, text, graphics, pictures, audio, video, or other information that is suitable for use within a computer system.
By way of example of those security concerns, if an e-mail includes an executable file or software program as an attachment, the user may wish to be sure that it has been sent by a trustworthy party prior to exposing his computer system to a program file that might include a “Trojan Horse” or that could infect the user's computer with a virus. Thus, when a user on the Internet receives data from a server or from another user, it may be necessary for the receiving user to verify that the data received has not been corrupted or otherwise altered in some manner. Furthermore, the receiving user may need to verify that the data received was actually sent by the proper sending user rather than by an impostor.
To improve the security of data transmitted over computer networks and to prevent digital forgeries and impersonations, document authentication and signer authentication safeguards are being utilized.
Nowadays, digital signatures are the main cryptographic tools employed to provide document and signer authentication and integrity verification. Digital signatures are basically mechanisms through which users may authenticate the source of a received data file. Digital signatures achieve these results through cryptographic key-based algorithms, whose security rests on the key (or keys), not on the details of the algorithm. In fact, the algorithms may be freely published and analyzed.
There are two general types of key-based authentication algorithms well known in the art: symmetric and public-key. In symmetric algorithms, the encryption key and the decryption key are the same and must be kept secret by both parties, the sender and the receiver. In public-key algorithms, digital signatures are derived through the use of “public keys”. Public-key algorithms, also called asymmetric algorithms, are designed to use two different keys, so that one key, used for signing, is different from the second key, used for verification. Those algorithms are called “public-key” algorithms because the verification key can be made public. In contrast, the signature key needs to be kept secret by its owner, the signer. By the properties of cryptographic digital signatures, there is no way to extract someone's digital signature from one document and attach it to another, nor is it possible to alter a signed message in any way without the change being detected. The slightest change in the signed document will cause the digital signature verification process to fail. Furthermore, the signing key cannot, in any reasonable amount of time, be calculated from the verification key.
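The symmetric case described above can be illustrated with a minimal sketch. It assumes HMAC-SHA256 from the Python standard library as the symmetric authentication algorithm; the text does not name one, and the key, message and function names are hypothetical. The point shown is that the same secret key both generates and checks the authentication tag, so both parties must protect it.

```python
import hashlib
import hmac


def make_tag(shared_key: bytes, message: bytes) -> bytes:
    """Compute an authentication tag with the key shared by sender and receiver."""
    return hmac.new(shared_key, message, hashlib.sha256).digest()


def check_tag(shared_key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag with the same key and compare in constant time."""
    return hmac.compare_digest(make_tag(shared_key, message), tag)


# Hypothetical values, for illustration only.
shared_key = b"example shared secret"
message = b"Pay 100 to Alice"
tag = make_tag(shared_key, message)

print(check_tag(shared_key, message, tag))              # same key, same message
print(check_tag(shared_key, b"Pay 900 to Alice", tag))  # altered message
```

Because the receiver holds the same key as the sender, a tag of this kind cannot prove to a third party which of the two produced it; that is the gap the public-key signatures described above are meant to close.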
Thus, using digital signatures involves two processes: the generation of the digital signature, performed by the signer, and the verification of the signature, performed by the receiver. The signer creates a digital signature for the document by using his private signing key, and transmits both the document and the digital signature to the receiver. Verification is the process of checking the digital signature by reference to the received signed document and the public verification key.
In practical implementations, public-key algorithms are often too inefficient to digitally sign long documents. To save time, digital signature protocols (e.g., RSA, DSA) are often implemented with secure (one-way) hash functions. Basically, instead of signing a complete document, the signer computes a hash-value of the document and signs the computed hash. Many signature algorithms use one-way hash functions as internal building blocks.
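The hash-then-sign idea can be sketched as follows. This is a toy illustration, not a usable signature scheme: it assumes textbook RSA without padding, SHA-256 as the hash function, and a modulus built from two publicly known Mersenne primes, so the "private" exponent is trivially recoverable. All names and parameters are hypothetical. It shows only that the expensive public-key operation is applied to a short hash-value rather than to the whole document.

```python
import hashlib

# Toy parameters (assumption, NOT secure): two publicly known Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public verification exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private signing exponent (Python 3.8+)


def digest_as_int(document: bytes) -> int:
    """Hash the document and reduce the hash-value into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Signer: apply the private exponent to the hash-value, not to the document."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Receiver: recompute the hash-value and compare it with the opened signature."""
    return pow(signature, E, N) == digest_as_int(document)


document = b"x" * 1_000_000        # a long document, hashed in one pass
signature = sign(document)
print(verify(document, signature))
print(verify(document + b"!", signature))
```

However long the document is, the modular exponentiation operates on a single number smaller than the modulus; only the hash computation scales with document size.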
A hash function is a function that maps a variable-length input string (e.g., a document) to a fixed-length output string, usually smaller, called a hash-value. The hash-value serves as a compact representative image of the input string. Computing a one-way hash function usually does not require a key. As such, when the document is received, the hash function may be used to verify that none of the data within the document has been altered since the generation of the hash-value. However, hash functions are typically limited in that the user may not necessarily infer anything about the associated data file, such as who sent it. In order to preserve the non-repudiation and unforgeability properties of digital signatures, when used in conjunction with a hash function, the hash function needs to be collision resistant. That is, it must be computationally infeasible to find two messages for which the hash maps to the same value.
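The properties just described can be observed with a short sketch. It assumes SHA-256 from the Python standard library as the hash function (the text does not name one), and the sample inputs are hypothetical. It shows that inputs of very different lengths yield hash-values of the same length, that no key is involved, and that a small change in the input yields a different hash-value.

```python
import hashlib


def hash_value(data: bytes) -> str:
    """Return the fixed-length hash-value of the input as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_input = hash_value(b"a")
long_input = hash_value(b"a" * 1_000_000)
print(len(short_input), len(long_input))   # same length for both inputs

original = hash_value(b"Transfer 100 EUR")
altered = hash_value(b"Transfer 900 EUR")
print(original == altered)                 # a one-character change is detected
```

Anyone can recompute such a hash-value, which is why on its own it reveals nothing about who produced the document.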
To authenticate a document that includes a plurality of attachments or links to other files, not only the document but also all the files linked to it must be authenticated. To deal with those very frequent cases, typically a single digital signature is generated by applying the digital signature algorithm to an aggregate of the document and all the attached files. When such a signed document and its attached files are received, the verification algorithm must also be applied to the same aggregate of the received document and attached files.
Now, the processes of signing and verifying, and of computing hash-values, place additional overhead on the computational resources of the sending and receiving systems. In particular, when a user receives a document that contains many attachments to large files, verifying the aggregate of the received document and all attached files would impose a tremendous burden on the receiving computer's resources and cause unacceptable delays in such a computer network environment.
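As a rough sketch of the aggregate approach, the function below computes one hash-value over a document and all its attachments; that single value is what a signature algorithm would then sign. The length prefix on each part and the choice of SHA-256 are assumptions made for this illustration, and the function name is hypothetical.

```python
import hashlib
from typing import Sequence


def aggregate_digest(document: bytes, attachments: Sequence[bytes]) -> bytes:
    """Hash the document and every attachment into one value to be signed.

    Each part is prefixed with its length so that different ways of splitting
    the same bytes into parts do not yield the same aggregate (an assumption
    of this sketch, not something the text prescribes).
    """
    h = hashlib.sha256()
    for part in [document, *attachments]:
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.digest()


sent = aggregate_digest(b"cover letter", [b"report", b"photo"])
received = aggregate_digest(b"cover letter", [b"report", b"photo"])
print(sent == received)
```

Note that the receiver has to read every byte of every attachment before anything can be verified, and a mismatch does not indicate which part was altered.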
In the prior art, there are methods for efficiently securing and verifying the authenticity of a plurality of data files, such as data files intended to be transferred over computer networks. Those methods for verifying the authenticity of groups of data files involve providing, along with the group of data files, a separate signature file which includes individual check-values (e.g., hash-values) for all data files as well as a digital signature for the group. The digital signature of the group of files is then verified using a computer system, and the check-values in the signature file are compared with the corresponding values computed from the data files using the computer system. This class of methods that generate a separate signature file for groups of data files is represented by the approach described in U.S. Pat. No. 5,958,051.
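The general shape of a separate signature file can be sketched as follows. This is an illustration of the class of methods only, not the procedure of the cited patent. It assumes SHA-256 check-values and a JSON encoding of the signature file, and it takes the signing and verifying routines as parameters; the demonstration plugs in a keyed HMAC purely as a stand-in for a real public-key signature. All names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Callable, Dict, List


def check_values(files: Dict[str, bytes]) -> Dict[str, str]:
    """Compute one check-value (here a SHA-256 hash-value) per data file."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def canonical(values: Dict[str, str]) -> bytes:
    """Serialize the check-values deterministically so they can be signed."""
    return json.dumps(values, sort_keys=True).encode("utf-8")


def build_signature_file(files: Dict[str, bytes],
                         sign: Callable[[bytes], bytes]) -> dict:
    """Produce the separate signature file: check-values plus one group signature."""
    values = check_values(files)
    return {"check_values": values, "signature": sign(canonical(values)).hex()}


def verify_group(files: Dict[str, bytes], signature_file: dict,
                 verify: Callable[[bytes, bytes], bool]) -> List[str]:
    """Verify the group signature, then return the names of files that fail."""
    values = signature_file["check_values"]
    if not verify(canonical(values), bytes.fromhex(signature_file["signature"])):
        raise ValueError("group signature does not verify")
    return sorted(
        name for name, expected in values.items()
        if name not in files or hashlib.sha256(files[name]).hexdigest() != expected
    )


# Stand-in for a digital signature (assumption): a keyed HMAC, used only so the
# sketch runs with the standard library. A real system would sign with a private
# key and verify with the matching public key.
DEMO_KEY = b"demo key"


def demo_sign(data: bytes) -> bytes:
    return hmac.new(DEMO_KEY, data, hashlib.sha256).digest()


def demo_verify(data: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(demo_sign(data), signature)


files = {"letter.txt": b"hello", "photo.jpg": b"\x00\x01"}
signature_file = build_signature_file(files, demo_sign)
print(verify_group(files, signature_file, demo_verify))
```

In this arrangement the signature operation covers only the short list of check-values, and a failed comparison identifies the individual file that changed.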
All methods that place the checking information in a separate file (i.e., the signature file) share the drawback of separating the checking information from the information being checked. The signature file can therefore easily be isolated and removed, either intentionally, in an attempt to cheat, or accidentally, simply because the intermediate pieces of equipment or the communication protocols in charge of forwarding electronic documents and data files are not designed to handle this extra piece of information. Consequently, when authenticating a document having file attachments or links to other files, the checking information for the document and all attached files should instead be encoded transparently into the body of the document itself (i.e., in a manner that does not affect the document's text format or readability in any way), so that it remains intact across the various manipulations it is exposed to on its way to its destination, while still enabling the end recipient to verify the authenticity and integrity of the received document and the attached or linked files.