The new procedure is valid for documents that originate in either digital or printed format. Their integrity can be verified even after printing or digitalization operations have been performed on the document, since the procedure makes it possible to recognize and to avoid or correct the distortions produced by these operations. The digitalization procedures include those performed with scanners or digital cameras, such as web cams or cameras built into mobile phones.
The main application of the present invention is to prevent fraud and falsification of documents. It also allows official documentation to be issued by telematic means and then printed for use in in-person procedures, since the present invention describes a procedure that extends the protection of current digital signature systems to the printed form: among other things, it allows a document to be printed and later digitalized without breaking the security chain. The nature of the new procedures herein described provides an additional advantage over the intrinsic protection of the digital signature, namely the possibility of locating and indicating in the digitalized document all the alterations that have been made to the original content. Another application of the procedures herein described is to make the business processes of printed document management and treatment more efficient, since they make it possible to verify automatically that several copies of a printed document are the same, or that a printed document is exactly the same as a given digital document.
The aim of providing mechanisms to verify the authenticity and integrity of printed documents is long-standing, and several solutions have been proposed, some of them based on current digital signature systems and others using watermarking techniques.
Current digital signature systems provide a plausible solution to the problem of verifying the authenticity and integrity of digital documents. However, they cannot be applied to printed documents. Digital signature systems basically consist in obtaining a summary of the document, called a hash, which characterizes it, encrypting that hash with the signatory's private key and attaching it to the document, which thereby becomes signed. The authenticity and integrity of the signed document are verified by extracting the attached hash, decrypting it with the signatory's public key, and comparing the decrypted hash with a new hash calculated from the document to be verified, using the same algorithm as in the signing stage. If both hashes match perfectly, the authenticity and integrity of the document have been verified; otherwise, either the document has not been signed by the owner of the public key used in the verification (the document is not authentic) or the content of the document has been modified. The hash algorithms most widely used by digital signature systems are SHA-1 and MD5, and it is because of them that current digital signature systems are not useful once the signed document has been printed: if the original digital document and the digital document to be verified differ in only one bit, their hashes will be completely different and the verification of authenticity and integrity will be negative, even if the content of the document has not been altered. Therefore, as the printing and digitalization processes produce a great variety of distortions, it is necessary to introduce new procedures for obtaining the hash or for characterizing documents, and new verification procedures that are able to resist or overcome the distortions produced by the printing and digitalization processes.
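The one-bit sensitivity described above can be illustrated with a minimal sketch using the SHA-1 implementation of the Python standard library. The sample document bytes and the helper names (sha1_hex, flip_bit) are hypothetical and serve only for illustration; the encryption of the hash with the signatory's key is deliberately left out.

```python
import hashlib

def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a 40-digit hexadecimal string."""
    return hashlib.sha1(data).hexdigest()

def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data in which exactly one bit has been inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)

# Hypothetical document content; any byte string behaves the same way.
original = b"Official certificate issued by telematic means"
# Simulate the smallest possible distortion: a single inverted bit.
distorted = flip_bit(original, 0)

hash_original = sha1_hex(original)
hash_distorted = sha1_hex(distorted)

# A conventional verification compares the two hashes for exact equality,
# so even this one-bit difference makes the verification negative.
verification_ok = hash_original == hash_distorted
print(hash_original)
print(hash_distorted)
print("verified:", verification_ok)
```

Since a printed and digitalized document differs from the original in far more than one bit, an exact-match comparison of this kind rejects it regardless of whether its content was altered.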
Patent documents EP 0676877 A2 and ES 2182670 propose applying character recognition means, called OCR, to the original document to be signed and to the document to be verified, and applying the conventional hash algorithms to the output of the OCR. This solution would protect only the textual content of the document, without protecting elements such as images, marks or tables, which appear frequently because of the insertion of logos, seals, handwritten signatures, etc. Besides, there are two further drawbacks: on the one hand, depending on the degradation suffered by the printed document and on the quality of the OCR means, the conversion can frequently be wrong; on the other hand, this mechanism does not make it possible to locate and indicate the alterations made to the content of the document.
International patent application WO 2006/104374 A1 addresses the problem of protecting content other than text, indicating as possible hash functions certain functions based on the Wavelet transform described in scientific publications.
U.S. Pat. No. 6,834,344 B1 describes a mechanism for marking a digital image using watermarking techniques, so that its authenticity and integrity can be verified once it has been printed and digitalized. Among the procedures described in said patent, it is worth mentioning the one that uses the Discrete Cosine Transform (DCT) for obtaining the image hash. The characterization procedure consists in dividing the image into square blocks of n×n pixels, applying the DCT to each block, quantizing the resulting coefficients, obtaining a hash or summary of the image by collecting only a few of the quantized coefficients, encrypting said hash and inserting the encrypted hash in the image. The procedure for verifying the authenticity and integrity consists in extracting and decrypting the hash inserted in the previous stage, obtaining a new hash from the image to be verified by making the same division into blocks and collecting the same quantized DCT coefficients, and comparing both hashes, so that the authenticity and integrity of the image are verified if the distance between both hashes is small.
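The block-based characterization just described can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the procedure of the cited patent: the block size, the quantization step, the choice of retained coefficients and the distance measure are all hypothetical, and the encryption and insertion of the hash are omitted.

```python
import math

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)
    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def block_hash(image, n=8, step=16.0, keep=((0, 0), (0, 1), (1, 0))):
    """Collect a few quantized DCT coefficients from every n x n block.

    `step` (quantization step) and `keep` (retained coefficient positions)
    are assumed values chosen only for this example.
    """
    rows, cols = len(image), len(image[0])
    digest = []
    for r in range(0, rows - n + 1, n):
        for c in range(0, cols - n + 1, n):
            block = [row[c:c + n] for row in image[r:r + n]]
            coeffs = dct2(block)
            digest.extend(round(coeffs[u][v] / step) for u, v in keep)
    return digest

def hash_distance(h1, h2):
    """Sum of absolute differences between two hashes of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Synthetic 16 x 16 grayscale image (hypothetical test data).
original = [[5 * r + 3 * c + 20 for c in range(16)] for r in range(16)]
# Mild distortion: every pixel one level brighter.
brighter = [[p + 1 for p in row] for row in original]
# Alteration: the top-left 8 x 8 block is blanked out.
tampered = [[0 if r < 8 and c < 8 else p for c, p in enumerate(row)]
            for r, row in enumerate(original)]

h_original = block_hash(original)
d_brighter = hash_distance(h_original, block_hash(brighter))
d_tampered = hash_distance(h_original, block_hash(tampered))
print(d_brighter, d_tampered)
```

In this sketch the uniform brightness change moves the hash far less than the blanked block does, which is the property that allows verification by thresholding the distance instead of requiring an exact match.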
The mechanisms described make it possible to verify authenticity and integrity while withstanding only a small part of the printing and digitalization distortions, such as the change in the color map, and are unfit for most of the distortions introduced. Therefore, in a practical industrial application the procedures described in U.S. Pat. No. 6,834,344 would detect as falsified or not authentic a great number of documents which are in fact authentic. Firstly, all digitalization causes displacements in the content of the digitalized document with respect to the original digital document, which are frequently of considerable magnitude.
Secondly, there are geometrical distortions, which produce changes in the geometry of the document content. Among them, it is worth mentioning inclinations, changes in the dimensions of the digitalized document with respect to the original one due to the scanner sensor, positive and negative curvatures, and expansions and compressions. Because of the changes in the document dimensions, the curvatures and the expansions and compressions, even when the correct location of the document content has been found, the optimal correspondence between the blocks of the original document and the blocks of the printed and digitalized document does not match a homogeneous grid placed on the content of the document to be verified, as described in U.S. Pat. No. 6,834,344. Instead, it is necessary to perform for each block a fine synchronization stage which calculates the optimal coordinates of each region in a neighborhood of the calculated initial position.
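A per-block search in a neighborhood of the initial position can be sketched as follows. This is a minimal illustration, not the synchronization procedure of the invention: it assumes that a reference block is available for comparison and uses the sum of absolute differences as the matching cost, and the function names and the search radius are hypothetical.

```python
def crop(image, top, left, n):
    """Return the n x n region of image whose top-left corner is (top, left)."""
    return [row[left:left + n] for row in image[top:top + n]]

def sad(a, b):
    """Sum of absolute differences between two blocks of equal size."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def fine_sync(scan, reference_block, top0, left0, radius=3):
    """Search around (top0, left0) for the best position of reference_block.

    Returns (cost, top, left) for the lowest-cost candidate inside the
    window, or None if no candidate fits inside the scanned image.
    """
    n = len(reference_block)
    rows, cols = len(scan), len(scan[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            top, left = top0 + dy, left0 + dx
            if top < 0 or left < 0 or top + n > rows or left + n > cols:
                continue  # candidate would fall outside the scanned image
            cost = sad(crop(scan, top, left, n), reference_block)
            if best is None or cost < best[0]:
                best = (cost, top, left)
    return best

# Synthetic 24 x 24 "scan" with a non-repeating texture (hypothetical data).
scan = [[(3 * r * r + 5 * c * c + 7 * r * c) % 251 for c in range(24)]
        for r in range(24)]
# The block really sits at (7, 5), but a homogeneous grid predicts (6, 6).
reference = crop(scan, 7, 5, 8)
result = fine_sync(scan, reference, top0=6, left0=6, radius=3)
print(result)
```

Running the search independently for every block lets each block settle on its own offset, which a single homogeneous grid cannot provide when the page is stretched or curved.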
Another important aspect is the noise introduced by the printing and digitalization processes, which causes great changes in the DCT coefficients of the digitalized document. The present invention can include an optional noise filtering stage using current methodologies in image processing: medium-band filters, low-pass filters, band-pass filters or high-pass filters.
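As an illustration of one of the filter families mentioned, a simple low-pass filter can be sketched as a moving average over a small window. The window size and the edge handling (border pixels are replicated) are assumptions made only for this example, and the function name box_filter is hypothetical.

```python
def box_filter(image, radius=1):
    """Low-pass filter: replace each pixel by the mean of its neighborhood.

    The window is (2 * radius + 1) pixels wide; pixels outside the image
    are taken from the nearest border pixel (edge replication).
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            count = 0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    total += image[rr][cc]
                    count += 1
            out[r][c] = total / count
    return out

# Flat gray region with one isolated noisy pixel (hypothetical data).
noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 190
smoothed = box_filter(noisy, radius=1)
print(smoothed[2][2], smoothed[0][0])
```

The isolated spike is spread over its neighbors and strongly attenuated, while regions that were already flat are left unchanged.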
The present invention makes it possible to verify the integrity of the document while overcoming the aforementioned distortions. To that end, it is necessary to perform a coarse synchronization stage which compensates for the displacements of the content. Moreover, the present invention provides high resolution when detecting alterations inserted in the document, that is to say, it makes it possible to detect subtle changes in the content. Additionally, the procedures described in this invention make it possible to locate and indicate the alterations suffered by the document, and thus to detect behavior patterns in falsifications or to prove that the integrity of the document remains intact, even when said document has suffered accidental degradations such as stains and small tears.
Therefore, the technical problem solved by the present invention is to allow the authentication of a document even after said document has undergone printing and subsequent digitalization. Specifically, it makes it possible to correct the displacements in the content of the digitalized document with respect to the original document and to correct the geometrical distortions caused by the scanner sensors, such as inclinations, positive and negative curvatures, expansions or compressions.
Besides verifying the integrity of the document even when the aforementioned distortions have been produced on it, the present invention also makes it possible to locate and indicate, with high detection resolution, the alterations suffered by the verified document.