1. Field of the Invention
The present invention relates to a recording medium, a digital information verification apparatus, and a digital information verification system that identify a digital document based on feature amount information generated using meta information such as a header or property and content information which is document content to thereby support validation or verification of authenticity and integrity of a digital document and enable third-party certification thereof.
2. Description of the Related Art
Along with the recent advancement of information technology (IT), documents such as administrative documents, accounting books of private companies, and contract documents that have been stored and managed on paper are now gradually being digitized.
More specifically, the use of a scanner allows documents that have been stored on paper to be easily digitized. Further, the practical application of high-resolution image scanners has made digital archiving of a large number of paper documents, which had not previously been permitted, legally acceptable if given security requirements are satisfied (e-document law, which came into effect in April 2005).
Along with the increasing demand for digital archiving of such documents, the need for a technique for safely storing and managing digital documents has increased. It is said that technical requirements such as “falsification detection/prevention”, “identification of a creator”, “access management/control” and “historical management” must be satisfied in order to digitally store a document that has been stored on paper while maintaining the evidence admissibility of the paper document. A conventional document management system is not sufficient to satisfy such requirements. Thus, in recent years, the development and launch of “originality assurance systems” has accelerated.
The most widely used techniques in such an “originality assurance system” are the digital signature and the time stamp. The digital signature is a technique capable of identifying the creator of a document (validating identity) as well as obtaining third-party certification that no modification has been made to the document (validating integrity). The time stamp can certify the time at which a document is newly created, in addition to providing the above function of the digital signature. The digital signature and time stamp are added to the entire document.
A conventional concept of the originality assurance system realized by using the techniques described above is to safely manage a document of the fixed final version as an original document, that is, to store a so-called “paper” document in a locked document stack. In short, the system targets a document for which the location of the original is clear. Under such an environment, the digital signature and time stamp are very effective techniques for assuring identity or integrity.
As prior arts relating to the present invention, the following technique 1) is known. Further, techniques 2) and 3) which have been obtained by further developing the technique 1) are known.
1) Digital Document Originality Assurance Technique
As techniques for assuring the originality of a digital document, those described in Jpn. Pat. Appln. Laid-Open Publication No. 2000-285024 and Jpn. Pat. Appln. Laid-Open Publication No. 2001-117820 are known.
2) Digital Document Sanitizing Technique
A solution to the digital document sanitizing problem is proposed in a paper of the Information Processing Society of Japan/Computer Security Group (CSEC), “Digital document sanitizing problem (2003/7/17) (2003-CSEC-22-009)”. In this technique, a digital document is divided into blocks, and partial signatures (hash values) are generated for the respective blocks. The generated hash values are embedded in the non-disclosure parts of the document body, allowing the parts having the same hash value to be detected as unchanged parts and the parts having different hash values to be detected as changed parts. Further, the paper “A Digital Document Sanitizing Scheme with Disclosure Condition Control” (2004, Symposium on Cryptography and Information Security (SCIS2004)) has proposed a digital document sanitizing technique capable of controlling whether or not additional sanitizing may be applied to a disclosure part.
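The block-wise idea described above can be illustrated with a minimal sketch. It is not the scheme of the cited papers: the function names are hypothetical, SHA-256 is an assumed hash function, and a plain digest over the list of block hashes stands in for the digital signature that a real system would apply.

```python
import hashlib

def block_hash(block: str) -> str:
    """Hash a single block (SHA-256 is an assumption, not taken from the source)."""
    return hashlib.sha256(block.encode("utf-8")).hexdigest()

def seal(blocks):
    """Return a digest over all block hashes (stand-in for a real signature)."""
    hashes = [block_hash(b) for b in blocks]
    return hashlib.sha256("".join(hashes).encode("ascii")).hexdigest()

def sanitize(blocks, hidden):
    """Replace each non-disclosed block with its hash value."""
    return [("hash", block_hash(b)) if i in hidden else ("text", b)
            for i, b in enumerate(blocks)]

def verify(disclosed, sealed_digest):
    """Recompute hashes of disclosed blocks, reuse embedded hashes of hidden ones."""
    hashes = [v if kind == "hash" else block_hash(v) for kind, v in disclosed]
    return hashlib.sha256("".join(hashes).encode("ascii")).hexdigest() == sealed_digest

blocks = ["name: A", "address: B", "telephone: C"]
sealed = seal(blocks)
disclosed = sanitize(blocks, hidden={0})   # hide the first block only
print(verify(disclosed, sealed))
```

In this sketch, hiding a block does not break verification because the hidden block's hash value takes its place, whereas altering a disclosed block changes its recomputed hash and causes verification to fail.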
3) Digital Document Partial Integrity Assurance Technology
Techniques for directly performing partial operations on a digital document, such as addition, correction, and sanitizing, and for distributing the changed part have been proposed in the paper of the 3rd Forum of Information Technology (FIT2004) “Partial integrity assurance scheme in consideration of correction/distribution of digital document” (M-066), Computer Security Group (CSEC), and Symposium on Cryptography and Information Security (SCIS). In this technique, the hash value for each part in a digital document is stored as identification information, and a corrected part and its identification information after correction are stored every time a correction is made to the document. Then, by submitting the current document and its corrected parts together with the identification information of the current and previous versions, the document can be verified at the location where it is submitted. This technique enables tracking of a history indicating when, how, and by whom a change was made to which part of the document, in addition to identification of a changed part and certification of an unchanged part.
However, a technique such as 1) above, which adopts a typical digital signature/time stamp, takes no account of originality assurance for a document, such as an application form or approval document, that is distributed from place to place while partial operations such as addition, correction, and sanitizing are made directly to it. Consequently, by its tamper-proof nature, this technique impedes handling of such a document.
Problems in the conventional digital signature scheme, which is characterized in that a digital signature is added to the entire document, will be described below.
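The version-by-version bookkeeping described above may be sketched as follows. This is an illustration under stated assumptions, not the cited scheme: the record fields "who" and "when", the function names, and the use of SHA-256 are hypothetical, and a real system would additionally sign each record.

```python
import hashlib

def part_ids(parts):
    """Hash each part; the list of hashes is one version's identification information."""
    return [hashlib.sha256(p.encode("utf-8")).hexdigest() for p in parts]

def record_version(history, parts, who, when):
    """Append a history entry ("who" and "when" are hypothetical bookkeeping fields)."""
    history.append({"who": who, "when": when, "ids": part_ids(parts)})

def changed_parts(history, version):
    """Indices of parts whose hash differs between the given version and its predecessor."""
    prev, curr = history[version - 1]["ids"], history[version]["ids"]
    n = max(len(prev), len(curr))
    return [i for i in range(n)
            if i >= len(prev) or i >= len(curr) or prev[i] != curr[i]]

def verify_current(history, parts):
    """Check that the submitted parts match the latest recorded identification information."""
    return part_ids(parts) == history[-1]["ids"]

history = []
record_version(history, ["part 1", "part 2", "part 3"], "creator", "2005-04-01")
record_version(history, ["part 1", "part 2 (corrected)", "part 3"], "reviewer", "2005-04-02")
print(changed_parts(history, 1), history[1]["who"], history[1]["when"])
```

Comparing the identification information of consecutive versions shows which part was changed, and the accompanying record shows by whom and when; parts whose hashes match are shown to be unchanged.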
A description will be made focusing on single/multiple documents and a file or document constituted by single/multiple pages (hereinafter referred to collectively as “binder”). In the conventional digital signature scheme, a digital signature is often added to the entire binder. In this case, although it is possible to detect whether any change has been made to the binder as a whole, it is impossible to detect which document (page) in the binder the change has been made to.
As a countermeasure against the above, a method that adds a digital signature in units of a document (page) in the binder and, at the same time, adds it also to the binder has been proposed. According to this method, it is possible to identify the document (page) to which a change has been made and detect that the remaining documents (pages) are unchanged.
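The difference between the two approaches can be seen in a small sketch. The function names are hypothetical and plain SHA-256 digests stand in for digital signatures, so this only illustrates what each level of digest can and cannot reveal.

```python
import hashlib

def digest(data: bytes) -> str:
    """SHA-256 digest (stand-in for a digital signature in this sketch)."""
    return hashlib.sha256(data).hexdigest()

def seal_binder(pages):
    """Return per-page digests and one digest over the entire binder."""
    page_digests = [digest(p) for p in pages]
    binder_digest = digest(b"".join(pages))
    return page_digests, binder_digest

def check_binder(pages, page_digests, binder_digest):
    """Return (binder unchanged?, indices of pages whose digest no longer matches)."""
    binder_ok = digest(b"".join(pages)) == binder_digest
    changed = [i for i, (p, d) in enumerate(zip(pages, page_digests))
               if digest(p) != d]
    return binder_ok, changed

pages = [b"page 1", b"page 2", b"page 3"]
page_digests, binder_digest = seal_binder(pages)
edited = [b"page 1", b"page 2 (edited)", b"page 3"]
print(check_binder(edited, page_digests, binder_digest))
```

The binder-level digest alone only reports that something changed; the per-page digests are what identify the second page as the changed one and show that the remaining pages are unchanged.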
In this method, however, it is impossible to certify that the unchanged documents have existed together with the changed document from the time point at which the entire binder was newly created. That is, if only a part of the document has been changed, reliability regarding authenticity and integrity is lowered.
Further, in the case where there arises a threat that a given document (page) in the binder is replaced by another one bearing a digital signature or, more concretely, a threat that a given document (page) in the binder is replaced by a given document (page) of another binder, it is not impossible to detect/certify the above fact based on the digital signature that has been added to the entire binder. That is, it is impossible to certify, at the same time, the order in which the documents (pages) in the binder were created and the authenticity/integrity of the respective documents (pages) at the time point of creation.
The above technique of 1) does not describe the above points and only aims to use a digital signature to simply store a digital document in a complete state (a state where no change has been made).
In order to solve the problem related to the technique 1), techniques such as 2) and 3) are now becoming widespread. That is, according to these techniques, if a part of the document is changed, it can be certified that the remaining part is unchanged.
However, even the techniques 2) and 3) have the following disadvantage. Here, as a concrete example, an unstructured document type such as PDF (Portable Document Format) or Microsoft Office Word is considered. Such a format is roughly constituted by “content information” representing the content itself of a document and “meta information” representing the header or properties (version information, total number of pages, order of pages, angle of rotation of each page, notation, magnification, etc.) of the document. The conventional techniques 2) and 3) treat “content information+meta information” as one document and focus on the content of the content information.
For example, assume that content information constituted by “name”, “address”, and “telephone number” exists and that an operation of hiding (sanitizing) “name” is performed at the time of disclosure. In this case, it is certified only that “address” and “telephone number” have not been changed, while operations such as page rotation or addition of notation are out of consideration. That is, meta information such as the rotation angle of documents (pages) in the binder, notation, or the like is entirely out of consideration.
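The limitation described in this example can be illustrated with a short sketch. The document layout (a dictionary with "content" and "meta" entries), the function names, and the use of SHA-256 are assumptions made for illustration only; the point is that identification information computed over the content information alone says nothing about the meta information.

```python
import hashlib

def h(text: str) -> str:
    """SHA-256 of one content field (assumed hash function)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def content_ids(doc):
    """Per-field hashes over the content information only; meta information is ignored."""
    return {field: h(value) for field, value in doc["content"].items()}

def unchanged_fields(doc, ids):
    """Names of content fields whose hash still matches the stored value."""
    return sorted(f for f, v in doc["content"].items() if h(v) == ids.get(f))

original = {
    "content": {"name": "A", "address": "B", "telephone": "C"},
    "meta": {"rotation": 0},
}
ids = content_ids(original)

# "name" is sanitized and, separately, the page is rotated.
disclosed = {
    "content": {"name": "****", "address": "B", "telephone": "C"},
    "meta": {"rotation": 90},
}
print(unchanged_fields(disclosed, ids))   # ['address', 'telephone']
```

The check certifies that "address" and "telephone" are unchanged, but the change of the rotation angle from 0 to 90 passes entirely unnoticed because the meta information never enters the computation.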
The above problems are summarized as follows in view of actual operation and actual use form.
It is assumed that a binder (paper document) is captured by a scanner to obtain a digital document (e.g., PDF format), and the digital document is stored with a digital signature and time stamp added thereto.
In an operating process of generating a digitized document from a paper document under the existing e-document law, a digitization operator (an operator who performs the scan operation) and a digitization manager (a manager who performs final review and approval) are typically specified.
The digitization operator first performs scan operation and verifies sameness between a digitized document and original paper document and, then, adds his or her digital signature to the digitized document. In the case where the number of binders (papers) reaches several tens of thousands of documents (pages), the binders are set in an auto document feeder provided in a scanner for automatic scan processing, and the verification of sameness between the digitized document and paper original document is often omitted in terms of operating effectiveness.
Thus, the digital signature of the digitization operator is added to the digital document without verification of the sameness and, then, approval processing is forwarded to the digitization manager. Whatever the case may be, verification of the sameness between the digitized document and the original paper document needs to be performed at a subsequent point in time. In this case, the digitization manager who gives final approval may perform the verification, or another operator who is in charge of the verification may be designated.
If a given document (page) is inclined in an unintended direction (angle) as a result of the digitization process, viewability of the obtained document may be impaired. However, in the case where the above error related to the direction of the document (page) is corrected or must be corrected, the digitization operating process is started once again from the scan operation. This lowers operating effectiveness.
As a result, correction is inevitably made to the digitized document. However, the digital signature of the digitization operator has already been added to the digitized document. Accordingly, if the rotation error is corrected at this time point, the applied correction is determined to be a kind of alteration owing to the nature of the digital signature technique. Thus, the conventional technique cannot respond to such a situation sufficiently.
If a given document (page) is rotated by any angle, the content information itself is not changed from that of its original document. Therefore, the sameness between the digitized document and the original paper document must be electronically certified. That is, it is necessary to verify/certify that no change has been made to the digitized document from the time when the document was newly created.
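The situation described above, in which a rotation correction is judged to be a change even though the content information is untouched, can be reproduced in a few lines. This sketch only illustrates the problem and makes no claim about how it is to be solved; the page layout, the function names, and the JSON-based SHA-256 digest standing in for a digital signature are all assumptions.

```python
import hashlib
import json

def digest(obj) -> str:
    """SHA-256 over a canonical JSON encoding (stand-in for a signed digest)."""
    encoded = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def compare(before, after):
    """Compare digests of the whole page and of its content information only."""
    return {
        "whole_same": digest(before) == digest(after),
        "content_same": digest(before["content"]) == digest(after["content"]),
    }

# A page scanned at an unintended angle, then corrected by changing only
# the rotation angle held in the meta information.
scanned = {"content": "page body as scanned", "meta": {"rotation": 90}}
corrected = {"content": scanned["content"], "meta": {"rotation": 0}}
print(compare(scanned, corrected))
```

A digest taken over the entire page no longer matches after the correction, while a digest taken over the content information alone still matches, which is the distinction a whole-document signature cannot express.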
Similarly, even in the case where a comment such as a search keyword or notation, a note, or supporting data is added to a document (page) of a digitized document bearing a digital signature/time stamp that has been finally approved by the digitization manager, or where a change of the order of documents (pages), partial insertion/deletion, a change of the content, or the like is made, it is necessary to detect the above operations and validate/verify the authenticity and integrity of the documents (pages) in the binder.