1. Field of the Invention
The present invention relates generally to authentication of documents, and relates more particularly to authentication of information in a document that is transformed from an original text document into another representation.
2. Description of Related Art
Documents are often transformed into other representations, such as by printing, scanning or transmission. The term “document” is used here to mean any electronic or hard copy representation that contains or conveys information, with content including text, symbols, graphics, images, formatting and so forth. Hard copy representations are meant to include paper and any other tangible medium in which content representations may be fixed. In the case of document transmission, transformation of the document often occurs as part of the transmission process, such as with transmission by facsimile or email. In such transmissions, the document may be transformed to an image or a text document. An image document generally refers to a representation of information suitable for display or transmission that is usually viewed as an image or picture. A text document generally refers to a representation of information with text characters, such as a document that includes ASCII characters.
Documents are often secured against unauthorized access or tampering. For example, during electronic transmission of a document, encryption is often used to provide an authentication technique where it is believed that only the sender is able to generate the content of a given communication. The sender encrypts the communication, for example, with a key that is part of a private/public key pair, and sends the encrypted information to the receiver over a communication link. The receiver decrypts the communication using the other part of the private/public key pair, and reviews the transmitted content. Sometimes an authentication code or element is transmitted with the communication, such as a checksum or time and date stamp. The private/public key pair contributes to identifying the source, and the code contributes to assuring the receiver that the received communication is authentic and the same as that transmitted by the sender. The same types of security may be applied to documents for storage or other applications not necessarily related to communication. The private/public key pair encryption is often referred to as asymmetric encryption, since the private key is a secret key, while the public key is generally available. Other types of encryption permit only the author and a limited number of trusted parties to access the content of a document. This type of encryption is often referred to as symmetric key encryption or shared secret key encryption.
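The role of an authentication code transmitted with a communication can be illustrated with a minimal sketch. The sketch below is not taken from the related art described above; it assumes a shared secret key and uses a keyed hash (HMAC) from the Python standard library as a stand-in for the authentication code, binding the content to a time and date stamp. The function names, the key and the sample content are hypothetical.

```python
import hashlib
import hmac


def make_auth_code(shared_key: bytes, content: bytes, timestamp: str) -> str:
    """Return a hex authentication code binding the content to a timestamp."""
    message = timestamp.encode("utf-8") + b"\n" + content
    return hmac.new(shared_key, message, hashlib.sha256).hexdigest()


def is_authentic(shared_key: bytes, content: bytes, timestamp: str,
                 received_code: str) -> bool:
    """Recompute the code on the receiving side and compare in constant time."""
    expected = make_auth_code(shared_key, content, timestamp)
    return hmac.compare_digest(expected, received_code)


# Hypothetical values, for illustration only.
key = b"hypothetical-shared-secret"
stamp = "2004-01-15T09:30:00Z"
doc = b"Pay the bearer 100 dollars."

# The sender computes the code and transmits it with the content and stamp.
code = make_auth_code(key, doc, stamp)
```

A receiver holding the same key would call `is_authentic(key, doc, stamp, code)`. Under these assumptions, a change to the content, the stamp or the key causes the check to fail.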
Security for electronic documents is also an important issue that may involve authentication with regard to unauthorized copying. For example, an electronic document may be “watermarked” to provide a secure indication of information associated with the document. Such associated information can include the author, owner, time and date created, particular characteristics related to the document and so forth. The watermark is often not visible as part of the document, but is in the form of an electronic signature typically embedded in the data of the file containing the document. The watermark is static, in that it does not reflect any information associated with transformations of the document, as may occur when a document is printed, scanned, digitized, transmitted such as by facsimile, and so forth. Typically, an image watermark is used to secure intellectual property rights in the image, or provide evidentiary support for claims of authorship, ownership and the like. The watermark may include indicia related to securing or authenticating the document, such as a code or checksum that reflects the state of the document when the watermark was applied. The code or checksum can then be used to verify that the content of the document has not changed, so that the document can be checked for tampering or modification.
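A checksum recorded at the time a watermark is applied can be sketched as follows. This is an illustrative assumption, not a description of any particular watermarking scheme: the watermark is modeled as a small record holding an author name and a SHA-256 digest, and the function names and sample content are hypothetical. The sketch also shows the static nature of such a checksum: a transformation that leaves the visible content the same, here a change of line endings as might occur in transmission, fails the check just as tampering does.

```python
import hashlib


def apply_watermark(content: bytes, author: str) -> dict:
    """Record associated information and a checksum of the current content."""
    return {"author": author, "checksum": hashlib.sha256(content).hexdigest()}


def content_unchanged(content: bytes, watermark: dict) -> bool:
    """Check the content against the checksum stored in the watermark."""
    return hashlib.sha256(content).hexdigest() == watermark["checksum"]


# Hypothetical document content, for illustration only.
original = b"Quarterly report\nRevenue rose 4%.\n"
mark = apply_watermark(original, author="A. Author")

# A modification to the content of the document.
tampered = original.replace(b"4%", b"9%")

# A transformation that changes only the line endings.
retransmitted = original.replace(b"\n", b"\r\n")
```

Here `content_unchanged(original, mark)` succeeds, while the check fails for both `tampered` and `retransmitted`, since the checksum reflects only the exact bytes present when the watermark was applied.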
Authentication may also be an issue in the comparison of two electronic copies of a document. Typically, the electronic documents are compared on a unit-by-unit basis, such as byte by byte or word for word. This type of comparison and authentication typically assumes the two documents are in the same format and are generated by the same program or software. The authentication is conducted based on direct electronic comparisons between the documents. This type of comparison or authentication is specific to a particular format of document and does not relate to image inspection or authentication. In addition, this type of document authentication does not apply to transformed documents, including documents transformed and transmitted over a communication link. Moreover, this type of document authentication by comparison does not operate directly on hard copy documents or document images.
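The unit-by-unit comparison described above, and its dependence on a common format, can be sketched as follows. The function names and the sample text are hypothetical, and the two character encodings merely stand in for two different document formats.

```python
def same_bytes(a: bytes, b: bytes) -> bool:
    """Byte-by-byte comparison of two electronic documents."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def same_words(a: str, b: str) -> bool:
    """Word-for-word comparison, ignoring differences in whitespace."""
    return a.split() == b.split()


# Hypothetical text stored in two different formats (encodings).
text = "The quick brown fox"
as_utf8 = text.encode("utf-8")
as_utf16 = text.encode("utf-16")
```

In this sketch, `same_bytes(as_utf8, as_utf16)` fails even though both hold the same text, while `same_words` succeeds once each copy is decoded with its own format. Neither function has any way to operate on a scanned image or a hard copy.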
Another issue involving authentication of documents relates to transformation of a document from one format to another. If a document exists as an original in one format, and is then converted to another format due to transmission, scanning, printing and so forth, authentication of the transformed document may be difficult. For example, a printed document derived from an electronic text document is difficult to authenticate against the original document without resorting to manual techniques, such as inspection of both document forms by a reviewer. It would be desirable to obtain a technique for automatically comparing a set of documents where one is a transformed version of the other.