Printing and copying paper documents play a central role in the dissemination of information in the office environment. Managing and maintaining the organization of paper documents and their relationship to their digital counterparts is becoming increasingly difficult due to the explosion in the number of documents and the number of people simultaneously working on them.
A number of methods for managing this complexity are based on maintaining a database of relationships between digital versions of a document and their paper representations. When such a database exists, a copying device that has identified a document may query the database for the digital version of the document and offer a number of different options based on the original description of the document. Such options may include reprinting the document from its original, or printing an updated version of the document if one has been registered with the database.
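The lookup described above can be sketched as a small in-memory registry. This is only an illustration under stated assumptions: the names `DocumentRecord`, `DocumentRegistry`, and `copy_options`, the option labels, and the use of a plain dictionary in place of a real database are all hypothetical and do not come from the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DocumentRecord:
    """Hypothetical record linking a paper document to its digital versions."""
    doc_id: str
    versions: List[str] = field(default_factory=list)  # oldest version first


class DocumentRegistry:
    """Stand-in for the database of paper/digital relationships (assumption)."""

    def __init__(self) -> None:
        self._records: Dict[str, DocumentRecord] = {}

    def register(self, doc_id: str, content: str) -> None:
        # Registering the same identifier again records an updated version.
        record = self._records.setdefault(doc_id, DocumentRecord(doc_id))
        record.versions.append(content)

    def lookup(self, doc_id: str) -> Optional[DocumentRecord]:
        return self._records.get(doc_id)

    def latest(self, doc_id: str) -> Optional[str]:
        record = self.lookup(doc_id)
        return record.versions[-1] if record else None


def copy_options(registry: DocumentRegistry, doc_id: str) -> List[str]:
    """Return the options a copying device could offer for an identified document."""
    record = registry.lookup(doc_id)
    if record is None:
        return []  # unknown document: no database-backed options
    options = ["reprint_original"]
    if len(record.versions) > 1:
        # An updated version has been registered with the database.
        options.append("print_updated")
    return options
```

In this sketch the identifier is assumed to have been obtained already; how the device identifies the document is the subject of the methods discussed next.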
One method of maintaining a database of the relationships between digital versions of a document and their paper representations is based on printing a machine readable mark on the document, such as a bar code, that identifies the document that has been printed. This method has the disadvantage that it requires special marks on the document. These marks can be visually distracting. In addition, the printing of such marks may require special inks or papers, thereby increasing the cost of the system.
Another method of maintaining a database of the relationships between digital versions of a document and their paper representations is based on image indexing. In this method, a distinctive property of the document is stored in the database. The property can be recovered from a scan or image of the document and can distinguish the document from other documents. The Fourier magnitude of a thumbnail of a document is a known example of such a property. One disadvantage of this method is that it cannot discriminate between documents that share similar image content. Another disadvantage is that similar images can be confused if extraneous marks have been added to the document, whether by annotation or by wear and tear of the paper on which the document is printed.
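As a rough illustration of the Fourier-magnitude property, the following sketch computes the magnitude of a direct 2-D discrete Fourier transform of a tiny grayscale thumbnail and compares two signatures by Euclidean distance. The source does not say how the thumbnail is produced or how signatures are compared, so the function names, the distance measure, and the use of a naive transform (practical only for very small thumbnails) are assumptions made for this example.

```python
import cmath
import math
from typing import List

Image = List[List[float]]  # rows of grayscale pixel values


def fourier_magnitude(thumb: Image) -> Image:
    """Magnitude of the 2-D discrete Fourier transform of a small thumbnail."""
    rows, cols = len(thumb), len(thumb[0])
    magnitudes = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2.0 * math.pi * (u * y / rows + v * x / cols)
                    total += thumb[y][x] * cmath.exp(1j * angle)
            out_row.append(abs(total))  # discard phase, keep magnitude
        magnitudes.append(out_row)
    return magnitudes


def signature_distance(a: Image, b: Image) -> float:
    """Euclidean distance between two magnitude signatures (assumed measure)."""
    return math.sqrt(
        sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    )


def circular_shift(thumb: Image, dy: int, dx: int) -> Image:
    """Shift the thumbnail with wrap-around, mimicking a misaligned scan."""
    rows, cols = len(thumb), len(thumb[0])
    return [
        [thumb[(y - dy) % rows][(x - dx) % cols] for x in range(cols)]
        for y in range(rows)
    ]


thumb = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
shifted = circular_shift(thumb, 1, 1)
same_page = signature_distance(fourier_magnitude(thumb), fourier_magnitude(shifted))
```

Because the magnitude discards phase, a circularly shifted copy of the thumbnail yields the same signature up to rounding, which makes the property tolerant of small misalignments in the scan. The same insensitivity is one reason documents with similar image content can map to nearly identical signatures.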
A further method of maintaining a database of the relationships between digital versions of a document and their paper representations is based on extracting a unique property of the medium on which the print is being made. An example of such a unique property is the image of the fibre structure of a section of the surface of the paper, or any other printing medium on which the document has been printed. A disadvantage of this method is that it requires a fixed portion of the document to be left largely unprinted, thereby restricting the acceptable geometry of the source document. Such a restriction is displeasing to the user and reduces the utility of the method.