1. Field
The present disclosure is generally related to methods and systems for associating text with scanned documents. More specifically, the present disclosure relates to methods and systems for identifying text that has been manually marked in a scanned source and associating that text with the resulting documents.
2. Background
Often, it is desirable to differentiate between regions of a document which have been manually marked, for example, with a highlighter pen, and regions of the document which have not been marked. The term “manually marked” is intended to mean herein that first marks in a document have been differentiated from remaining marks of the document by a region which has a substantially different gray scale from that of the background or marks of the original document. Such marks can be made, for example, by way of writing instruments such as pens or markers (e.g., highlighters) which produce bright or fluorescent but relatively transparent colors. Alternatively, such marks may be made electronically, such as in a word processing document using a highlighting or marking option.
A variety of methods have been proposed for the detection of manually marked or highlighted regions in a document. For example, the use of a color scanner has been proposed to detect regions of a document which have been highlighted in a color different from that of the remainder of the document. Other processing methods rely on detecting the tonal portions of an image, which may include shadow, mid-tone, and bright portions. A mid-tone portion may be screened with a low frequency screen to convert the image into a binary form, for example.
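As a rough illustration of how a marked region might be separated from the rest of a page by its gray scale, the following minimal sketch treats any pixel whose gray value falls between dark text and white background as mid-tone, converts the page into a binary mask, and reports the bounding box of the mid-tone area. It uses a plain threshold rather than the low frequency screening mentioned above, and the function names `midtone_mask` and `bounding_box` and the threshold values 100 and 220 are assumptions chosen for illustration, not values taken from any proposed method.

```python
def midtone_mask(gray, low=100, high=220):
    """Return a binary mask: 1 where a pixel is mid-tone, 0 elsewhere.

    `gray` is a list of rows of 8-bit gray values (0 = black, 255 = white).
    The `low` and `high` thresholds are illustrative assumptions.
    """
    return [[1 if low <= px <= high else 0 for px in row] for row in gray]


def bounding_box(mask):
    """Return (top, left, bottom, right) of all set pixels, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for row in mask for c, value in enumerate(row) if value]
    return (min(rows), min(cols), max(rows), max(cols))


# Toy page: white background (255), dark text (20), highlighter band (180).
page = [
    [255, 255, 255, 255, 255],
    [255, 180,  20, 180, 255],
    [255, 180, 180, 180, 255],
    [255,  20, 255, 255, 255],
]

mask = midtone_mask(page)
box = bounding_box(mask)
```

In this toy page the dark text pixel inside the band is not itself mid-tone, but it lies within the bounding box of the mid-tone pixels, which is how marked text could be told apart from the unmarked dark pixel in the last row.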
Electronic storage of documents has facilitated the handling of large volumes of documents, such as those handled by law firms, hospitals, universities, government institutions, and the like. Typically, the documents are entered into massive storage systems by use of a scanner system that converts text into electronic data. Once the documents are scanned, each document must be manually named or re-named (i.e., requiring user intervention by accessing a file or electronic data) with a unique name or identification number (e.g., docket number, insurance provider and claim number, financial application number, etc.) so that the scanned documents are easily identified when there is a need to retrieve the documents from the computer storage system. However, the need to manually rename each scanned document may be cumbersome and may impose an undue burden on a user when dealing with heavy scanning application workflows. Additionally, when multiple users are independently scanning documents, the users may not follow a uniform method of naming documents. As a result, it may be difficult to recognize, sort, or locate a document in a computer storage system.