Field of the Invention
This invention relates to an image compression method, and in particular, it relates to a method of compressing document images containing text, as well as its application in document authentication.
Description of Related Art
Original digital documents, which may include text, graphics, pictures, etc., are often printed, and the printed hard copies are distributed, copied, etc., and then often scanned back into digital form. Authenticating a scanned digital document refers to determining whether the scanned document is an authentic copy of the original digital document, i.e., whether the document has been altered while it was in hard copy form. Alteration may occur due to deliberate effort or accidental events. Authentication of a document in a closed-loop process refers to generating a printed document that carries authentication data on the document itself, and authenticating the scanned-back document using the authentication data extracted from the scanned document. Such a printed document is said to be self-authenticating because no information other than what is on the printed document is required to authenticate its content.
Methods have been proposed to generate self-authenticating documents using barcodes, in particular two-dimensional (2d) barcodes. Specifically, such methods include processing the content of the document (text, graphics, pictures, etc.) and converting it into authentication data which is a representation of the document content, encoding the authentication data in a 2d barcode (the authentication barcode), and printing the barcode on the same recording medium as the original document content. This results in a self-authenticating document. To authenticate such a printed document, the document is scanned to obtain a scanned image. The authentication barcode is also scanned and the authentication data contained therein is extracted. The scanned image is then processed and compared to the authentication data to determine whether any part of the printed document has been altered, i.e., whether the document is authentic. Some authentication technologies are able to determine what has been altered and/or where the alteration occurred, while others merely determine whether any alteration has occurred.
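For illustration only, the closed-loop comparison described above may be sketched as follows. The sketch assumes a bi-level image held as a list of pixel rows and uses per-block black-pixel counts as the authentication data; both choices, and the names make_auth_data and find_altered_blocks, are hypothetical and are not taken from any particular prior method. Barcode encoding and decoding are omitted, and the scanned image is assumed to be already aligned with the original.

```python
from typing import List

Image = List[List[int]]  # bi-level image: 1 = black pixel, 0 = white pixel


def make_auth_data(image: Image, block: int = 2) -> List[int]:
    """Summarize the image as black-pixel counts per block, in row-major order."""
    height, width = len(image), len(image[0])
    counts = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            counts.append(sum(
                image[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ))
    return counts


def find_altered_blocks(scanned: Image, auth_data: List[int],
                        block: int = 2, tolerance: int = 0) -> List[int]:
    """Return indices of blocks whose count differs from the stored count
    by more than `tolerance` (an assumed allowance for scanning noise)."""
    scanned_counts = make_auth_data(scanned, block)
    return [i for i, (stored, seen) in enumerate(zip(auth_data, scanned_counts))
            if abs(stored - seen) > tolerance]


original = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
auth_data = make_auth_data(original)       # would be encoded in the barcode

tampered = [row[:] for row in original]
tampered[3][3] = 1                         # one pixel added in the last block
altered = find_altered_blocks(tampered, auth_data)
```

An empty result corresponds to an authentic document; a non-empty result also indicates where the alteration occurred, at block granularity. A simple count cannot detect changes that leave the number of black pixels in a block unchanged, so a practical scheme would need a richer representation of the content.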
JBIG2 is an international standard for compression of bi-level images, in particular document images containing text. It utilizes a pattern matching and substitution method, by which the image is segmented into multiple symbols and a symbol dictionary is developed; each symbol in the document image is matched to a symbol in the dictionary, and encoded by an index to the dictionary entry and the location and size of the symbol in the image.
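The pattern matching and substitution idea can be illustrated by the following simplified sketch. It is not an implementation of the JBIG2 standard: it assumes the symbols have already been segmented, it matches symbols by counting differing pixels between equal-sized bitmaps, and the names symbols_match and encode_symbols and the max_mismatch parameter are hypothetical.

```python
from typing import List, Tuple

Bitmap = Tuple[Tuple[int, ...], ...]   # bi-level symbol bitmap, 1 = black
Placement = Tuple[int, int, int, int, int]  # (dictionary index, x, y, width, height)


def symbols_match(a: Bitmap, b: Bitmap, max_mismatch: int) -> bool:
    """Two bitmaps match if they have the same size and at most
    `max_mismatch` differing pixels."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return False
    differing = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return differing <= max_mismatch


def encode_symbols(symbols: List[Tuple[Bitmap, int, int]],
                   max_mismatch: int = 0) -> Tuple[List[Bitmap], List[Placement]]:
    """Build a symbol dictionary and replace each symbol instance with an
    index into that dictionary plus its location and size."""
    dictionary: List[Bitmap] = []
    placements: List[Placement] = []
    for bitmap, x, y in symbols:
        for index, entry in enumerate(dictionary):
            if symbols_match(bitmap, entry, max_mismatch):
                break                      # reuse the existing entry
        else:
            dictionary.append(bitmap)      # first occurrence: add a new entry
            index = len(dictionary) - 1
        placements.append((index, x, y, len(bitmap[0]), len(bitmap)))
    return dictionary, placements


glyph_a: Bitmap = ((1, 0), (0, 1))
glyph_b: Bitmap = ((1, 1), (1, 1))
dictionary, placements = encode_symbols(
    [(glyph_a, 0, 0), (glyph_b, 3, 0), (glyph_a, 6, 0)]
)
```

In this example the repeated symbol is stored once in the dictionary and referenced twice, which is the source of the compression gain for text, where the same character shapes recur many times.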