Today, plain-paper Optical Mark Recognition (OMR) technology uses pattern recognition to automatically find response bubbles and to determine whether or not the response bubbles are filled. There are two types of commercially available plain-paper technologies in use today. One technology prints registration marks on the forms. The other technology does not require the use of any preprinted registration marks but instead depends upon the location of the response bubbles for registration. Using pattern-recognition technology, both technologies automatically register the form, find response bubbles, and determine whether or not the response bubbles are filled. These technologies allow mark-sense forms to be designed using standard, commercially available word processing or graphics packages and to be printed by any quality printer on most papers. Any pencil or pen may be used to mark the forms, and the forms can be read using any device that can capture and produce an image, such as an image scanner or digital camera.
In order to recognize OMR forms with current plain-paper technology, such as Remark Office OMR®, available from Gravic, Inc., Malvern, Pa., response bubble attributes, such as the location, size, density and number of contiguous pels comprising the outline of the response bubbles, are known beforehand. This is typically accomplished by creating a template for a response form using a baseline image of an unfilled response form which defines the locations of the response bubbles, the size of the response bubbles within each region, the average fill value or density and other attributes of the response bubbles within each region. This information defines the characteristics of the response bubbles. When processing completed forms, the located response bubbles on the completed forms are compared to their corresponding response bubble definitions in the form template to determine whether or not the response bubbles are filled. This comparison will typically involve a number of factors including the size of the response bubble, the location of the response bubble, and the density of the response bubble. A response bubble fill metric incorporating these factors is then calculated and used to determine if the response bubble is filled. If the response bubble fill metric value of the response bubble being recognized is higher than the predetermined fill threshold, then the response bubble being recognized is considered to be filled.
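The template comparison described above can be illustrated with a minimal sketch. It is not the Remark Office OMR implementation: the names `BubbleTemplate`, `bubble_density`, `fill_metric` and `is_filled`, the bounding-box representation of a bubble, the density-only metric, and the 0.4 threshold are all assumptions chosen for illustration. The image is assumed to be a black-and-white raster stored as rows of 1 (dark pel) and 0 (light pel).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BubbleTemplate:
    """Hypothetical template entry for one response bubble."""
    x: int                   # left edge of the bubble's bounding box, in pels
    y: int                   # top edge of the bubble's bounding box, in pels
    width: int
    height: int
    baseline_density: float  # fraction of dark pels when the bubble is unfilled


def bubble_density(image, t):
    """Fraction of dark pels (value 1) inside the bubble's bounding box."""
    rows = image[t.y:t.y + t.height]
    dark = sum(sum(row[t.x:t.x + t.width]) for row in rows)
    return dark / (t.width * t.height)


def fill_metric(image, t):
    """Assumed metric: share of the bubble's originally light area now dark."""
    headroom = 1.0 - t.baseline_density
    if headroom <= 0:
        return 0.0
    gain = bubble_density(image, t) - t.baseline_density
    return max(0.0, gain / headroom)


def is_filled(image, t, fill_threshold=0.4):
    """A bubble counts as filled when its metric exceeds the fixed threshold."""
    return fill_metric(image, t) > fill_threshold
```

Because `baseline_density` is fixed when the template is created, the result is only meaningful if the processed image was produced under conditions similar to those of the baseline image.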
However, this approach for determining whether or not a response bubble is filled relies heavily on the baseline attributes of response bubbles, as well as other form elements, such as barcodes, graphic files and text, being consistent between the form template and the scanned-in form. Variances in the response bubble and element attributes between the form image used to define the form template and the image of the filled-in response form being processed could cause inaccurate results. Such variances most often occur when response forms are printed or scanned inconsistently, potentially causing the size, the skew or the brightness of the form to differ from the original. If the response bubble attributes of the response bubbles from the original form used to create the template differ significantly from the response bubble attributes of the response bubbles from the forms being recognized, then the comparison between an unfilled response bubble and the response bubbles being recognized is no longer valid and can produce erroneous results.
Differences in form element attributes can be introduced in many different ways. Some possible ways to introduce differences in form element attributes include: using different printers, using different scanners, photocopying the forms, faxing forms, low toner levels, different scanner settings, different software settings and different paper (color, weight, material, or brand).
For example, if a form image being processed is significantly lighter than the form image used to create the template, either because different printers were used or because the forms were scanned at different brightness settings, then the response bubbles will have a much lower density attribute than the baseline density attribute from the form template. FIG. 1 illustrates this situation by showing calculated response bubble attribute values, in this case the average density for unfilled response bubbles, and the resulting difference metric value, calculated as the percent difference between the average density of unfilled response bubbles in each image and the average density of unfilled response bubbles in the baseline image. To calculate these values, the same response form was scanned using a variety of scanner brightness settings, with the image scanned using a 0 brightness setting considered to be the baseline image. On a lighter image, partially filled response bubbles may consequently not be reported as being filled since their calculated density causes the response bubble fill metric value to fall below the fill threshold. Conversely, if a form image being processed is significantly darker than the form image used to create the form template, then unfilled response bubbles or very lightly filled response bubbles may be reported as being filled since their calculated density may cause the response bubble fill metric value to exceed the fill threshold. Likewise, if a form image being recognized is condensed or stretched as compared to the original or has other spatial abnormalities, then the corresponding response bubble attributes may differ from the determined baseline response bubble attributes, causing the response bubble fill metric to rely on potentially invalid comparisons of the attributes of the response bubble. In each case, the problem lies with the rigidity of the precalculated response bubble attributes.
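The difference metric and the two failure modes described above can be sketched as follows. All numeric values are invented for illustration and are not the values shown in FIG. 1. The function names are hypothetical, and modeling a brightness change as a simple multiplicative scale on measured density is an assumption made only to keep the example short.

```python
def percent_difference(observed, baseline):
    """Percent difference of an observed density relative to the baseline density."""
    return 100.0 * (observed - baseline) / baseline


def scanned_density(true_density, brightness_scale):
    """Assumed model: a lighter scan (scale < 1) lowers the measured density,
    a darker scan (scale > 1) raises it."""
    return true_density * brightness_scale


def is_filled(density, fill_threshold):
    """Fixed-threshold decision using a density precalculated from the template."""
    return density > fill_threshold


# Hypothetical values, not taken from FIG. 1.
BASELINE_UNFILLED = 0.20   # average density of unfilled bubbles in the baseline image
PARTIALLY_FILLED = 0.40    # density of a partially filled bubble at baseline brightness
FILL_THRESHOLD = 0.35      # fixed threshold derived from the baseline image

for label, scale in (("baseline", 1.0), ("lighter", 0.75), ("darker", 1.9)):
    unfilled = scanned_density(BASELINE_UNFILLED, scale)
    partial = scanned_density(PARTIALLY_FILLED, scale)
    print(
        label,
        f"difference metric: {percent_difference(unfilled, BASELINE_UNFILLED):+.0f}%",
        f"unfilled reported filled: {is_filled(unfilled, FILL_THRESHOLD)}",
        f"partial reported filled: {is_filled(partial, FILL_THRESHOLD)}",
    )
```

Under these assumed numbers, the lighter scan drops the partially filled bubble below the fixed threshold, while the darker scan pushes the unfilled bubble above it, which is the rigidity problem the paragraph describes.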
One of the ways currently available to address differences in response bubble attributes and other issues, as illustrated in FIG. 2, is to modify the image or the process used to produce the image in order to create an image that provides better results when recognized. First, the form is scanned, and the scanning software creates a black-and-white image of the filled response form. Then the recognition software determines the form element attributes of the response bubbles, and uses this information to process the data of the response form image. If, after reviewing the data, the user determines that the recognition quality is inadequate, the process can be repeated by either adjusting the scanning settings and re-scanning the form or adjusting the image thresholds and producing another black-and-white image from the original scan. The image and threshold settings that can be adjusted include brightness and contrast.
This can be an iterative process. Each new iteration may produce a different image. The process ends once adequate recognition results are achieved. Optical Mark Recognition is typically performed on black-and-white images. Producing a new black-and-white image from an existing image typically requires that the existing image be either grayscale or color. When converting a grayscale or color image into a black-and-white image, there are a number of settings that can be adjusted that will impact the resultant black-and-white image. Another way to produce a new image would be to re-scan the original form for each iteration. Modifying the image, either by re-scanning or by converting a grayscale or color image to black and white, is a time-consuming or CPU-intensive process and can significantly impact the throughput performance of the system. What is needed is a method that can dynamically adjust the baseline values of the response bubble attributes so that the same form image can be processed again for better results without having the added overhead of re-scanning or reconverting an image.
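For reference, the reconversion loop that this section identifies as costly might look like the sketch below. It is an assumption-laden illustration, not the process of FIG. 2 as implemented by any product: the function names, the linear brightness and contrast adjustment, the threshold of 128, the list of brightness steps, and the `is_adequate` callback (which stands in for the user's review of the recognition results) are all hypothetical. Grayscale pels are assumed to range from 0 (black) to 255 (white), so a negative brightness value darkens the image.

```python
def to_black_and_white(gray, brightness=0, contrast=1.0, threshold=128):
    """Convert rows of 0..255 grayscale pels into rows of 1 (dark) / 0 (light)."""
    result = []
    for row in gray:
        new_row = []
        for value in row:
            # Assumed adjustment: scale around mid-gray, then shift by brightness.
            adjusted = (value - 128) * contrast + 128 + brightness
            adjusted = min(255.0, max(0.0, adjusted))
            new_row.append(1 if adjusted < threshold else 0)
        result.append(new_row)
    return result


def reconvert_until_adequate(gray, is_adequate, brightness_steps=(0, -20, -40, 20, 40)):
    """Re-threshold the whole image with each setting until results are adequate.

    Returns (black_and_white_image, brightness_used, attempts), or
    (None, None, attempts) if no setting produced adequate results.
    """
    for attempt, brightness in enumerate(brightness_steps, start=1):
        bw = to_black_and_white(gray, brightness=brightness)
        if is_adequate(bw):
            return bw, brightness, attempt
    return None, None, len(brightness_steps)
```

Every attempt visits every pel of the image again, which is why repeating the conversion (or, worse, the scan) for each form reduces throughput.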