The present invention relates to the automated scoring of so-called “bubble” forms or the like, whereby each of a multiplicity of end users, such as students, voters, and questionnaire respondents, enters marks on a standard form associated with the individual to indicate specific choices from among those offered on the form. The marks are typically entered by darkening (filling in) preprinted circles or ellipses with a pencil. Inevitably, some of the respondents apply very light pressure on the pencil and therefore produce a very light mark. The marks made by other respondents may either extend out of the designated target or lie partially within and partially outside of the target. In other cases, a stray mark may inadvertently be made near a target. These deviations give rise to the need for interpretation of the intent of the user. This is in addition to deviations that arise from imperfect printing of the form, and from misalignment or bending of the form as it passes through the scoring system.
Under the present state of the art, bubble forms are processed with highly specialized scanners that sense the marked state of the bubble targets using a fixed array of LED sensors. These scanners do not work by producing an electronic image reproduction of the contents of the forms; they merely sense the darkness levels of certain predefined fixed locations on the forms. On a form page of dimension 8½×11 inches, the typical bubble density is approximately six bubbles per inch in each of the page dimensions. This is similar to the LED sensor density, and thus one LED sensor is associated with each column of bubble locations. In order to achieve and maintain their accuracy rates, these scanners require extremely high quality paper with very precise printing and cutting. Such forms are very expensive to produce and are usually available only from a single source, namely, the same company that produces the LED scanner.
U.S. Pat. No. 6,970,267 discloses a system and method that uses electronic image capturing technology in conjunction with specialized image processing techniques for performing the scoring. That invention works with an image capture scanner that produces an electronic reproduction of each form. The electronic images are processed to produce scoring results that achieve an accuracy rate equal to or better than those achieved by the LED scanners. Various makes and models of scanners may be used, but the most accurate results are achieved when using scanners that produce gray-scale image output. It is therefore not necessary to use expensive paper and printing processes as is required by the LED scanners. Thus, paper and printing costs can be greatly reduced without sacrificing accuracy.
The master printed form is scanned and processed according to a forms definition program to produce a virtual form file comprising a virtual layout, on a virtual coordinate system, of the significant regions of printed material on the form, such as bubble targets. Production forms that have been marked by subjects (e.g., students, voters, survey respondents, etc.) are then scanned to produce a marked form file of gray scale darkness values for each marked form. The marked form file and the virtual form file are compared and processed to determine the location and spatial relationships of the marks on the marked form, in relation to the virtual coordinate system of the virtual form. The raw scan of each marked form is also processed to determine whether darkened areas on the marked form image should be interpreted as intentional responses from the subject, at the virtual coordinates where targets are located on the virtual form.
The master preprinted form preferably includes a plurality, typically four, of preprinted reference marks at, e.g., the corners, as do the forms to be marked by the subjects. Because the reference marks are relatively accurately positioned on both the master preprinted form and the preprinted form given to each subject, the coordinates of the reference marks on both the preprinted forms and in the virtual coordinate system are established with a high degree of accuracy. To the extent the coordinates of the reference marks in the marked form file differ from the coordinates of the reference marks of the virtual form (or master template) in the virtual coordinate system, adjustment can be made for the deviations arising from skew, shift, stretch (scale), and slant, such that the coordinates associated with each target on the scanned marked form file can be appropriately offset or adjusted relative to the coordinates of the master template form in the virtual coordinate system. In this manner, marks made by the subject on the form, as represented by gray scale darkness values in the marked form file, can be better associated with a target location and employed in various interpreting rules to confirm whether a score for a particular target should have one or the other of a binary value, e.g., “intentionally marked” or “intentionally left blank”. The score for a target or field could instead be indeterminate due to excessive uncertainty or to an invalid relation with other targets or fields. The term scored value should be understood in the most general sense, as indicative of an evaluation made of an active, passive, positive or negative, indeterminate or invalid response to, e.g., a bubble.
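By way of illustration only, the adjustment for skew, shift, stretch, and slant can be sketched as a least-squares affine fit between the reference mark coordinates of the virtual form and those found in the marked form file. The affine model, the function names fit_affine and apply_affine, and the helper solve3 are assumptions made for this sketch; the text above does not prescribe a particular transform or fitting method.

```python
def solve3(m, v):
    """Solve the 3x3 system m * p = v by Gauss-Jordan elimination."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        # Partial pivoting: bring the largest remaining entry into place.
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("reference marks are collinear; no affine fit")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(virtual_pts, scanned_pts):
    """Least-squares affine map from virtual to scanned coordinates.

    Returns [[a, b, c], [d, e, f]] such that
    x' = a*x + b*y + c and y' = d*x + e*y + f.
    """
    rows = [(x, y, 1.0) for x, y in virtual_pts]
    # Normal equations (A^T A) p = A^T b, solved once per output axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    coeffs = []
    for k in range(2):
        atb = [sum(r[i] * p[k] for r, p in zip(rows, scanned_pts))
               for i in range(3)]
        coeffs.append(solve3(ata, atb))
    return coeffs


def apply_affine(coeffs, pt):
    """Map one virtual-form point into scanned-image coordinates."""
    (a, b, c), (d, e, f) = coeffs
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)
```

With four corner reference marks located on a marked form, fit_affine(virtual_corners, scanned_corners) yields coefficients that apply_affine can then use to predict where each target of the virtual form should appear in the scanned image. An affine map has six parameters, so it can absorb rotation, translation, independent scaling on each axis, and shear, which is why it is a plausible stand-in for the four deviations named above.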
Furthermore, the relatively high resolution of the scanned image, e.g., hundreds of positions (pixels) per inch, permits gray scale darkness sampling in a region or zone surrounding each target location (as offset or adjusted) to ascertain the most likely center of a candidate mark and to ascribe a level of darkness to the candidate mark. That darkness level, in combination with the imputed distance from the center to the adjusted target location, provides the inputs to a logic operation for concluding whether or not a candidate mark is of sufficient darkness and close enough to the expected location to warrant an interpretation that the mark is an intentional response.
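The sampling and decision logic just described might be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the incorporated patent: the darkness-weighted centroid as the "most likely center", the mean darkness over the zone as the "level of darkness", and the function name evaluate_target with its parameters half_zone, min_darkness, and max_offset are all hypothetical choices, and the threshold values are arbitrary.

```python
import math


def evaluate_target(image, target, half_zone=5,
                    min_darkness=0.25, max_offset=4.0):
    """Judge one target from gray-scale samples in a square zone.

    image  : list of rows of gray values, 0 (black) to 255 (white)
    target : (col, row) of the adjusted target location, in pixels
    The thresholds are illustrative assumptions, not values from the text.
    """
    tx, ty = target
    weight = x_sum = y_sum = 0.0
    pixels = 0
    row_lo, row_hi = max(0, ty - half_zone), min(len(image), ty + half_zone + 1)
    for row in range(row_lo, row_hi):
        col_hi = min(len(image[row]), tx + half_zone + 1)
        for col in range(max(0, tx - half_zone), col_hi):
            d = (255 - image[row][col]) / 255.0  # 0.0 = white, 1.0 = black
            weight += d
            x_sum += d * col
            y_sum += d * row
            pixels += 1
    if pixels == 0 or weight == 0.0:
        # Nothing dark in the zone: no candidate mark to locate.
        return {"marked": False, "darkness": 0.0,
                "center": None, "offset": None}
    # Darkness-weighted centroid stands in for the center of the mark.
    center = (x_sum / weight, y_sum / weight)
    offset = math.hypot(center[0] - tx, center[1] - ty)
    darkness = weight / pixels  # mean darkness over the sampled zone
    marked = darkness >= min_darkness and offset <= max_offset
    return {"marked": marked, "darkness": darkness,
            "center": center, "offset": offset}
```

A mark is accepted here only when both conditions hold, so a dark stray mark near the edge of the zone is rejected on distance, and a faint smudge centered on the target is rejected on darkness. A production system would likely also return an indeterminate outcome for borderline values rather than a strict yes or no.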
Many of the interpretive variables are under the control of the user, i.e., the entity having the responsibility for scoring the marked forms. These interpretive rules are specified for the virtual or master template form file obtained from the master preprinted form, before the marked forms are scanned and scored. The variables associated with defining the virtual form or master template, and with interpreting candidate marks, are preferably implemented by such user, with the aid of a graphical user interface whereby various tools can be employed on the virtual layout of the preprinted form on the virtual coordinate system.
The overall system and method described in U.S. Pat. No. 6,970,267 are hereby incorporated by reference as disclosing the preferred context of the present invention. This prior system and method are well suited for scoring marks on printed forms of a type in which the reference marks are made with dark, non-dropout ink, and are of sufficient quantity and placement to assure good matching between the implied coordinate system of the marked form to be scored and the virtual coordinate system of the master form. Although forms specifically designed for use with that system, as well as many other commercially available forms, have suitable reference marks, some users have a huge inventory of, or long-term purchase commitment for, forms that are suitable for LED scanners but not for digital scanning. The reference marks and/or timing marks along the edges of these forms could be printed in colors or dropout ink that might not be readily and quickly interpreted by a digital system that relies only on scanned or computed gray scale values for the non-dropout reference marks and the pencil marks or the like to be scored.
The overall method of the incorporated patent has innovative aspects that are not dependent on whether the targets are dropped out of a scan of the marked form, such as but not limited to the determination of a virtual coordinate system, the accurate location of reference marks and targets, and the logic for interpreting marked areas associated with targets. However, this prior method and system can be more universally effective, if the marked forms can be read or captured with a non-dropout scanner.
The machine-readable form typically has a field bounded by known coordinates on the form. The form may have broad stripes of alternating shading or colors, so that some of the fields, or portions of a given field, have a printed background color. The field contains a group of spaced apart printed outline targets of known size and nominal location within the field, each target outline surrounding a printed symbol. At least some of the printed targets can overlie the background color. The form to be scored is designed to elicit at least one mark per field, although the subject may leave some fields unmarked, or mistakenly enter two marks. In any event, a form to be scored is expected to have at least one field where the subject has entered one mark on or near a target in the field.
The problem addressed by the present invention is that digital capture of all the print and marks on a form to be scored provides the most potentially usable information for confirming the locations of all the targets, yet the printed matter interferes with the identification and validation of the marks entered by the subject. This problem is exacerbated when a particular scoring system, purchased and operated by a scoring contractor or service, must be capable of handling a wide variety of forms, printed in a variety of inks, with a variety of reference (alignment) or timing marks, and with a variety of sizes, shapes, and associated characters or symbols.