The present disclosure relates to the fields of image and document processing. It finds particular application in connection with processing different kinds of documents by extracting data from a document to configure a workflow for other like or similarly created documents.
Form processing generally involves capturing information from the data fields of a form and converting it into an electronic format, often at a back office system. Data is entered or captured from the various data fields of the forms, and the forms themselves are digitized and saved as images. Typically, a hard copy of the document is scanned in as an image using a scanner. This image is then recognized based on a pre-defined configuration, and the data is captured from particular zones and stored in an electronic format. Afterwards, the data is typically transmitted further down the processing chain to a business workflow that uses the data for making business decisions or for other purposes according to the particular business model, such as a loan processing center determining a loan approval.
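By way of illustration only, the zone-based capture described above can be sketched as follows. The sketch assumes the scanned image is available as a row-major grid of pixel values and that the pre-defined configuration is a list of named rectangular zones. The names `Zone`, `crop`, and `capture_fields` are hypothetical, and the `recognize` argument stands in for whatever recognition engine is actually used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Image = List[List[int]]  # assumed representation: grayscale pixels, row-major


@dataclass(frozen=True)
class Zone:
    """A named rectangular region of the form (hypothetical configuration entry)."""
    name: str
    top: int
    left: int
    height: int
    width: int


def crop(image: Image, zone: Zone) -> Image:
    """Return the pixels that fall inside the zone."""
    rows = image[zone.top:zone.top + zone.height]
    return [row[zone.left:zone.left + zone.width] for row in rows]


def capture_fields(
    image: Image,
    zones: List[Zone],
    recognize: Callable[[Image], object],
) -> Dict[str, object]:
    """Apply the supplied recognizer to each configured zone."""
    return {zone.name: recognize(crop(image, zone)) for zone in zones}
```

Keeping the zone list separate from the recognizer reflects the text's distinction between the pre-defined configuration and the recognition step; a real system would also need to register the scanned image to the form layout before cropping.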
There are several common issues involved in forms processing and in processing documents overall. When performed manually, the task is tedious, data keyed in by the user may contain typos, and the lengthy process consumes many hours of labor. If the forms are processed using software-driven applications, some of these issues can be resolved or minimized. Automatic form input systems, for example, use different types of recognition software. Optical character recognition (OCR) is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text, which is used, for example, to convert books and documents into electronic files, to computerize a record-keeping system in an office, or to publish the text on a website. Another example of recognition software is intelligent character recognition (ICR). ICR provides a handwriting recognition engine that allows fonts and different styles of handwriting to be learned by a computer during processing to improve accuracy and recognition levels. This extends the usefulness of scanning devices for document processing from printed character recognition (a function of OCR) to handwritten matter recognition. With ICR, handwritten data can be automatically populated into a back office system, which avoids laborious manual keying and can be more accurate than traditional human data entry. Intelligent word recognition (IWR) is another example, which recognizes and extracts hand-printed information as well as cursive handwriting.
Automated forms processing technology can employ these systems and others, such as optical mark recognition (OMR), which scans a document to detect the presence or absence of a mark in predetermined positions, such as with questionnaires or survey forms. However, manual configuration for processing documents and forms remains a large, time-intensive task. In particular, when form documents need to be pre-configured and then populated with the user's data for processing, a large amount of time can be spent defining the many different form layouts needed for the many different types of user data. A sample form is often needed as well, so that the layout can be extracted before the data on a form can be processed accurately.
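By way of illustration only, detecting the presence or absence of a mark at predetermined positions can be sketched as follows. The sketch assumes a binarized image in which 1 denotes a dark pixel, square mark boxes of a fixed size, and a simple fill-ratio threshold. The function names, the box size, and the threshold value are assumptions made for the example and are not taken from any particular OMR product.

```python
from typing import Dict, List, Tuple

BinaryImage = List[List[int]]  # assumed representation: 1 = dark pixel, 0 = light


def fill_ratio(image: BinaryImage, top: int, left: int, height: int, width: int) -> float:
    """Fraction of dark pixels inside the given box."""
    dark = 0
    for row in image[top:top + height]:
        dark += sum(row[left:left + width])
    return dark / (height * width)


def read_marks(
    image: BinaryImage,
    positions: Dict[str, Tuple[int, int]],
    size: int = 3,
    threshold: float = 0.5,
) -> Dict[str, bool]:
    """Report, for each named (top, left) position, whether a mark is present."""
    return {
        name: fill_ratio(image, top, left, size, size) >= threshold
        for name, (top, left) in positions.items()
    }
```

The `positions` mapping plays the role of the pre-configured form layout: each new form type needs its own mapping, which is the manual configuration burden the paragraph above describes.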
One approach that has been developed to alleviate some of the manual configuration needed, and to improve data extraction accuracy, is to have the user fill out an electronic version of the form on a computer while a two-dimensional (2D) barcode containing the user-entered data associated with each field is dynamically generated on the form. The form is still printed on paper (and typically signed) and is then scanned for processing. Rather than running OCR/ICR or like recognition software on the image, the associated forms processing system reads the 2D barcode that was dynamically added to the form before printing to extract the user data entered for each field.
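By way of illustration only, the round trip between user-entered field data and a barcode payload can be sketched as follows. The sketch covers only the serialization of the field data into a compact text payload and its recovery after decoding; rendering the payload as a 2D barcode symbol and reading that symbol from the scanned image are outside its scope. The payload format (JSON, compressed with zlib, then Base64-encoded) and the function names are assumptions chosen for the example, not a format described in the source.

```python
import base64
import json
import zlib
from typing import Dict


def encode_payload(fields: Dict[str, str]) -> str:
    """Serialize field data into a compact ASCII payload (assumed format)."""
    raw = json.dumps(fields, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decode_payload(payload: str) -> Dict[str, str]:
    """Recover the field data from a payload produced by encode_payload."""
    raw = zlib.decompress(base64.b64decode(payload))
    return json.loads(raw.decode("utf-8"))
```

For example, `decode_payload(encode_payload({"name": "Jane Doe"}))` returns the original dictionary, so the processing side recovers each field's value without running character recognition on the page.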
Other practical issues, however, remain unresolved in the field of form processing. In particular, specific software is often needed on an end user's computer to provide the dynamic barcode generation, and that software may also carry licensing fees. Not all users have access to computer systems and devices for this purpose. In addition, only limited information can be entered into a barcode, which often ends up being a pointer to a database. If the database is unavailable to the form generation application and/or the form scanning/processing application (e.g., because of a firewall or a broken network link), then the pointer cannot be resolved. Currently, the form information is extractable only if it is printed, and some information may not be handled easily (e.g., signature and data information that a user adds to the form). Consequently, creating form documents that are filled in by a user and afterwards processing the documents allows little or no separation between these two stages of form processing.
Accordingly, it is desirable to provide additional methods and systems that allow a greater separation between the creation and encoding of a document and a processing stage, which may be performed at a back office. Such methods and systems would automatically configure a pre-defined workflow for the form, its contents and information, so that similar forms can be easily processed further and moved down a process chain to a different client-designated workflow.