The present invention relates to document recognition systems. More specifically, embodiments of the present invention relate to a document recognition system that creates a document signature using static and variable strings.
In some instances, document recognition involves reading or scanning a hard-copy document (e.g., a paper document) to generate an image of the document, followed by a computer-based process of analyzing the document image in order to identify the document in some manner. Often, recognition involves analyzing the document to determine whether it belongs to a previously known type, kind, or class of documents. Document recognition is sometimes implemented as part of a workflow process such as invoice processing.
Efficient document recognition of invoices (or other documents) can reduce the labor costs of a business as well as improve the speed and efficiency of business document processing. A number of methods are currently available for processing business documents. Pattern recognition is one such method and can include identification of line segments in a document. Optical character recognition (OCR) is related to pattern recognition and can also be used. Regardless of the specific technologies or methodologies employed, current document recognition systems often require large libraries of lookup tables or predefined business documents in order to perform document recognition effectively.