1. Technical Field
The present invention relates generally to document processing, and more particularly, to the joining of front-end and back-end document processing.
2. Related Art
Despite the evolution of electronic communications, the requirement of formalized documents as a communications medium remains in many industries. The content and layout of documents vary according to industry. For example, documents may include: correspondence, checks, orders, invoices, receipts, filled-out forms (e.g., insurance applications and completed tests), securities, etc. Processing of documents, however, has progressed such that many documents have a digital life in addition to a physical printed existence. In industries where a large number of documents are necessary, document processing management becomes very important. Document processing management can normally be broken into three stages: front-end generation of the document, usage of the document, and back-end processing of the used document. The content of each stage may vary according to industry.
During the front-end generation of documents, the document generation data exists as a variety of text (e.g., ASCII), graphics, and images, which are often extracted from multiple databases. The data can be organized in a variety of ways. In some cases, proprietary formats and systems may be used that are not publicly accessible. Where documents are printed, many printers accept text formats such as PostScript and create the print data on-the-fly with no storage of data. Alternatively, some printers create the print data and temporarily store it in one or more buffers. This data, however, is never used beyond the front-end generating stage. In other cases, some systems use a post-printer camera or quality check system that records the printed documents by capturing another image of them after printing. This data, likewise, is never used beyond the front-end generating stage.
Archival requirements for the printed documents may vary, for example, by industry. One illustrative industry in which document processing and archiving have a significant role is the banking and finance industry. In this industry, important data such as customer statements or check images are usually archived so that a record of what was generated exists. Archived documents in some form are often made available to customer support operations, so that customer support representatives can review what was sent to the customer, received from the customer, or returned to the customer (e.g., a cancelled check). Archiving of these documents may include saving the text data, the print-ready pages, or a combination of the two (e.g., storing some print-ready pages with selected text data is common in repositories such as IBM's ContentManager, OnDemand). In contrast, the pixel data per page, i.e., the actual image of which pixels were used on the page, may not be saved even temporarily.
In order to facilitate processing and archival storage during back-end processing of the printed documents, i.e., after their intended use, many organizations scan the used documents that they receive to create images of them. For example, in the insurance industry, some companies scan all received correspondence. The letters, application forms, reports, etc., are then handled as images for processing. The information printed on these documents is often converted to text data by optical character recognition (OCR) programs to make text searching and data mining feasible and to assist in indexing. When OCR is not used, labor-intensive and time-consuming manual keying-in of the data may be required. In any event, significant time and effort are often expended on indexing, reconciling, error checking, and fraud detection as part of back-end processing of used documents.
One problem with conventional approaches to document processing management is that front-end generating data is not used with back-end processing data. This may be the case even when the front-end document generating data exists in the same organization as the back-end processing. More often, however, the problem exists because the front-end and back-end processes do not exist in the same organization. For example, in the banking and finance industry, checks can be issued by a large number of institutions and cashed by an equally large and independent number of institutions. For the clearing of checks, banking institutions often overnight express CD-ROMs of the check images to their large commercial customers. Some institutions manually compare the checks to their text data. In this case, unless the cashing bank happened to have written the check, it is highly unlikely to have access to the front-end processing data for detecting errors. No current service prints checks and leverages the original data to ensure the accuracy of the checks cashed by comparing each cashed check to the check that was printed. As another example, insurance companies that receive and scan used documents oftentimes have documents generated by an outside third party, such that the original information used to print the documents is not accessible. In the past, there has been no way to link up the front-end generation of the documents with the back-end scanned versions at the receiver when these operations happen in different companies.
Another example industry in which separation of front-end generation and back-end processing creates problems is the testing industry. In this industry, test booklets are often printed in sections and assembled such that each test in a group has uniquely ordered questions. After use, the test booklets are split apart into sections again, scanned, and individually sent to scorers. This process is time-consuming and tedious. In addition, paper test booklets are archived in warehouses for various amounts of time in case scoring is questioned. Finding a particular used test booklet in the warehouse is also time-consuming and labor-intensive. Currently, there is no way to link up the front-end generation of tests with the back-end scoring and archiving processes.
In view of the foregoing, there is a need in the art for joining front-end and back-end processing of documents.