The following relates to processing of documents, and more particularly to processing hardcopy documents in an environment that includes both manual and automated operations. It finds particular application in business fields such as finance, banking, insurance, taxation, health care and pharmaceuticals, retailing, law firm docketing, centralized scanning services, etc. for document management, fraud prevention, inventory tracking, cryptography, ID cards, parts marking, product tagging, and the like. However, it is to be appreciated that the following is amenable to other like applications.
Those engaged in institutional document processing will manually mark hardcopy documents with stamps during processing steps, indicating the nature or condition of the documents, date of receipt, future processing instructions, and other such information. More particularly, large-scale document processing centers will often mark paper documents with stamps containing instructions for a future processing step (e.g., “MAKE COPIES”, “SEND TO SECTION A4”, “DESTROY EXTRA COPIES”, “START OF DOCUMENT”, “END OF DOCUMENT”, etc.). These markings are read by people as part of the ongoing workflow. Because the marks are in human readable form, they cannot be easily recognized by a computer system, and hence are regarded as an obstacle to migration to more automated document processing.
For example, as shown in FIG. 1, a hardcopy document 10, which is to be processed in a document processing facility, may include computer generated text and/or images 12, as well as handwritten notes 14. A conventional stamp image 16 may be applied to document 10, providing some form of information to a human operator (e.g., in this example, the human operator is informed that the document has a “RAISED SEAL”). This example illustrates a situation where document 10 includes a mixture of computer generated text and/or images 12, handwritten notes 14, and stamp image 16, each with different characteristics, resulting in a complicated document.
One attempt to automate processing of hardcopy documents employs a computer system which attempts to identify and “read” the human-readable text. However, while there have been attempts to improve a computer's ability to perform this reading (e.g., optical character recognition), such systems are very expensive, take a significant amount of time to implement, may require the complete reorganization of a document processing center, and have issues relating to reliability, particularly with complicated documents, such as document 10.
Another approach to assist in automating the processing of hardcopy documents is to apply a bar code, which may be printed on the document, applied as a sticker, or otherwise attached to the document, and which may carry information as to a further processing step. However, a problem with bar codes is that they cannot be read by human operators. Further, bar codes applied as stickers can fall off a document, so that the code can no longer be read, and may also gum up a scanner being used to scan the document.
Still a further attempt at automating the flow of hardcopy documents is through the use of paper user interfaces (UIs). In a paper user interface system, a user accesses the system (or device) by use of a cover sheet, i.e., a piece of paper with machine readable code and, possibly, handwritten instructions. Typically, the hardcopy media is scanned, the machine readable code is decoded, and any resulting instructions are executed by the system.
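The scan, decode, and execute sequence described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the `ScannedPage` type, the `KEY=VALUE;KEY=VALUE` payload format, and the `EMAIL` and `STORE` handlers are assumptions, not the format or behavior of any actual paper UI product. The scanning and decoding hardware is not modeled; each page simply carries an already decoded payload string.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ScannedPage:
    text: str                           # human-readable content of the page
    machine_code: Optional[str] = None  # decoded payload; None on ordinary pages


def parse_payload(payload: str) -> Dict[str, str]:
    """Split a hypothetical 'KEY=VALUE;KEY=VALUE' payload into a dict."""
    fields = (item.split("=", 1) for item in payload.split(";") if "=" in item)
    return {key.strip().upper(): value.strip() for key, value in fields}


def email_job(pages: List[ScannedPage], params: Dict[str, str]) -> str:
    # Stand-in for a real delivery step; it only reports what would happen.
    return f"emailed {len(pages)} page(s) to {params['TO']}"


def store_job(pages: List[ScannedPage], params: Dict[str, str]) -> str:
    # Stand-in for a real storage step.
    return f"stored {len(pages)} page(s) in {params['FOLDER']}"


# Hypothetical mapping from decoded action names to handlers.
HANDLERS: Dict[str, Callable[[List[ScannedPage], Dict[str, str]], str]] = {
    "EMAIL": email_job,
    "STORE": store_job,
}


def process_job(pages: List[ScannedPage]) -> str:
    """Treat the first page as the cover sheet and apply it to the rest."""
    if not pages or pages[0].machine_code is None:
        raise ValueError("job has no machine-readable cover sheet")
    params = parse_payload(pages[0].machine_code)
    handler = HANDLERS[params["ACTION"]]
    return handler(pages[1:], params)


job = [
    ScannedPage("", machine_code="ACTION=EMAIL;TO=clerk@example.com"),
    ScannedPage("page one"),
    ScannedPage("page two"),
]
print(process_job(job))
```

In this sketch the single cover sheet governs every page that follows it, so the instruction is physically separate from the pages it applies to and cannot vary from page to page within a job.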
For example, U.S. Pat. No. 5,682,540 to Klotz, Jr. et al. discloses the use of paper forms with machine readable and human readable information as document surrogates or tokens for electronic files. An example of a paper UI system is the Xerox FlowPort™ system, which employs paper forms called PaperWare® forms that enable users to scan, store, email, Internet fax and remotely print electronic documents. This approach can be tedious and relatively inefficient, as it requires a special-purpose cover sheet for each job, which in turn requires the special-purpose paper to always be in stock. Further, in some document processing situations, different instructions may need to be applied to different pages of a document (e.g., page 5 of a document may need to go to person A and page 15 of the same document may need to be copied 5 times, etc.), in which case it would be necessary to provide a cover sheet for each affected page of the document. Still further, unlike a physical stamp, which is human readable, the paper user interface concept does not provide the instructions on the same page of the document to which they apply. Due to at least these drawbacks, the use of cover sheets can lead to processing errors caused by combining document pages with the wrong cover sheets.
Another drawback of the foregoing concepts, including bar codes and paper user interface systems, is that they are very limited in the amount of information which may be conveyed when they are provided as one-dimensional codes. There are, however, other technologies, known as two-dimensional codes, which deploy encoding schemes where significantly more data may be incorporated in substantially the same physical area. Included among these two-dimensional coding concepts are glyph codes, such as DataGlyph codes developed by Xerox Corporation.
For example, U.S. Pat. No. 5,168,147 (Bloomberg), incorporated herein by reference, discloses binary image processing techniques for decoding bitmap image space representations of self-clocking glyph shape codes of various types (e.g., codes presented as original or degraded images, with one or a plurality of bits encoded in each glyph, while preserving the discriminability of glyphs that encode different bit values) and for tracking the number and locations of the ambiguities (sometimes referred to herein as “errors”) that are encountered during the decoding of such codes.
Another glyph concept is disclosed in European Patent 469,864 B1 (Bloomberg et al.), incorporated herein by reference, which discloses self-clocking glyph shape codes for encoding digital data in the shapes of glyphs that are suitable for printing on hardcopy recording media. Advantageously, the glyphs are selected so that they tend not to degrade into each other when they are degraded and/or distorted as a result, for example, of being photocopied, transmitted via facsimile, and/or scanned into an electronic document processing system.
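The idea of encoding digital data in glyph shapes can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions and is not the encoding disclosed in the cited patents: each glyph carries exactly one bit, a forward slash stands for 1 and a backslash for 0 (the polarity is chosen arbitrarily here), and the helper names `encode_glyphs` and `decode_glyphs` are hypothetical. Image-level concerns such as degradation, distortion, and ambiguity tracking are not modeled.

```python
from typing import List

# Assumed mapping: one glyph per bit, with an arbitrarily chosen polarity.
GLYPH_FOR_BIT = {0: "\\", 1: "/"}
BIT_FOR_GLYPH = {glyph: bit for bit, glyph in GLYPH_FOR_BIT.items()}


def encode_glyphs(data: bytes, row_width: int = 16) -> List[str]:
    """Turn bytes into rows of slash glyphs, most significant bit first."""
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    glyphs = "".join(GLYPH_FOR_BIT[bit] for bit in bits)
    return [glyphs[i:i + row_width] for i in range(0, len(glyphs), row_width)]


def decode_glyphs(rows: List[str]) -> bytes:
    """Recover bytes from glyph rows; trailing bits short of a byte are dropped."""
    bits = [BIT_FOR_GLYPH[glyph] for row in rows for glyph in row]
    out = bytearray()
    for start in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for bit in bits[start:start + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


rows = encode_glyphs(b"MAKE COPIES")
for row in rows:
    print(row)
print(decode_glyphs(rows))
```

Because every bit position holds a glyph regardless of its value, the glyph positions themselves mark where each bit sits, which is the sense in which such a code is self-clocking.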
Still further, U.S. Pat. No. 6,873,430 discloses a knowledge management system and method thereof using Xerox DataGlyph stickers, and U.S. Patent Application 20040205626 discloses user interface identification and service tags for a document processing system, both of which are hereby incorporated herein in their entireties.
However, none of the above concepts deals specifically with processing hardcopy documents which require both manual and automated processing and, more particularly, with the unique issues raised in large document processing centers, where a number of operations are undertaken manually, while others are to be accomplished automatically.
Accordingly, there is a continuing need in the art for improved techniques for document processing which can effectively manage documents in a domain that processes hardcopy documents using both manual and automated operations.