Systems and methods for metadata extraction, generation, mapping and execution have been described in the computer arts with varying degrees of success.
For instance, U.S. Patent Publication US 2013/0294694 (Zhang) teaches a method for zone-based metadata extraction that allows users to select zones in a text document in order to extract metadata using optical character recognition (OCR) on said zones and to store the metadata in a database.
U.S. Patent Publication US 2008/0162603 (Garg) teaches a method whereby an image document is turned into a text document. According to Garg, the user creates a template specifying the portions of an image document to be processed by OCR. The system then receives a document, extracts textual metadata from said document and stores said metadata in storage.
Similarly, U.S. Pat. No. 8,693,790 (Jiang) teaches a form template definition method. The form template definition method comprises a cell extraction step of analyzing an image, thereby extracting one or more cells from the image. The method further includes a cell classification step of classifying the extracted cells and a cell attribute definition step of defining attributes of the extracted cells class by class. The inventive step of Jiang relates to automatically copying the common attributes of a first cell to other cells.
However, none of these references provides an architecture for pre-processing or post-processing a given document in order to perform actions based on the document type. The present disclosure overcomes the limitations found in the relevant art.