Forms processing is used to recognize and extract text from defined zones in a form. This is particularly difficult where the text on a page consists of hand-printed or cursively written data. In some cases, the extracted text is sufficiently clear that, using a dictionary, it can be recognized as erroneous and corrected automatically. For example, the written word “audible” may be extracted and recognized as “audi6le” (i.e., the lowercase “b” is recognized as the number “6”); because English language words do not have numbers within them, the error can be detected and the intended word located in a dictionary. In many cases, however, that is not possible. For example, text containing a reference to “60 ml” could be recognized as ambiguous or anomalous because it is not clear whether the extracted text data is properly “boml” (or a misspelling of some variant) or an intended (but partly undecipherable) number, e.g., “60?1”. Such text data must either be manually verified by an operator as part of the scanning process, or be sent, with all such ambiguities and/or anomalies, to a content management repository, with a follow-on process used to flag such issues for manual review and verification. Such efforts are labor intensive and, consequently, costly.
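By way of illustration only, the dictionary-based correction described above might be sketched as follows. The word list, the digit-to-letter confusion table, and the function name `correct_token` are all hypothetical and are not taken from any particular forms processing product; the sketch simply tries letter substitutions for confusable digits and accepts a correction only when exactly one dictionary word results.

```python
from itertools import product

# Hypothetical word list; a real system would load a full dictionary.
DICTIONARY = {"audible", "form", "text"}

# Assumed digit-to-letter confusions (illustrative, not exhaustive).
CONFUSIONS = {"6": "b", "0": "o", "1": "l", "5": "s"}


def correct_token(token):
    """Return (text, status), where status is 'ok', 'corrected', or 'review'."""
    lowered = token.lower()
    # Dictionary words and purely numeric tokens are accepted unchanged.
    if lowered in DICTIONARY or lowered.isdigit():
        return lowered, "ok"
    # Each confusable character may be kept as-is or swapped for its letter.
    options = [
        (ch, CONFUSIONS[ch]) if ch in CONFUSIONS else (ch,)
        for ch in lowered
    ]
    candidates = {"".join(combo) for combo in product(*options)}
    matches = sorted(candidates & DICTIONARY)
    if len(matches) == 1:
        # Exactly one dictionary word explains the token: correct it.
        return matches[0], "corrected"
    # No match, or several: the token stays ambiguous and needs an operator.
    return token, "review"
```

Under these assumptions, `correct_token("audi6le")` resolves to the dictionary word “audible”, whereas `correct_token("60ml")` and `correct_token("60?1")` match no dictionary entry and are left flagged for manual review.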
Thus, there is an ongoing technological problem in the processing of forms containing handwriting.