For evaluating the performance of production forms data capture systems, it has been customary to have human data entry personnel, referred to as “keyers”, sample original captured data fields according to prescribed protocols for determining the correct answers (i.e., “truth”) of production data. For example, the “truth” of the production data can be operatively determined to a desired statistical accuracy by having “keyers” verify (i.e., “double key”) each other's answers. The time and effort required for evaluating the “truth” of large quantities of production data to a desired statistical accuracy can be prohibitively expensive, resulting in compromises between the amount of production data evaluated and the accuracy with which the production data is evaluated.
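By way of illustration only, the “double key” verification described above might be sketched as follows. The function names, the field identifiers, and the choice to ignore leading and trailing whitespace are assumptions made for this sketch and are not taken from any particular keying protocol.

```python
from typing import Dict, List, Optional, Tuple


def double_key_truth(first_key: str, second_key: str) -> Optional[str]:
    """Return the agreed value when two keyers match, else None.

    Assumption: surrounding whitespace is not significant.
    """
    a = first_key.strip()
    b = second_key.strip()
    return a if a == b else None


def resolve_fields(
    entries: Dict[str, Tuple[str, str]]
) -> Tuple[Dict[str, str], List[str]]:
    """Split double-keyed fields into agreed truth and unresolved fields.

    `entries` maps a hypothetical field identifier to the two keyed values.
    Unresolved fields would need further review (for example, a third keyer).
    """
    truth: Dict[str, str] = {}
    unresolved: List[str] = []
    for field_id, (k1, k2) in entries.items():
        value = double_key_truth(k1, k2)
        if value is None:
            unresolved.append(field_id)
        else:
            truth[field_id] = value
    return truth, unresolved
```

In such a sketch, every sampled field requires at least two keying passes, which is the source of the cost noted above.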
As a goal set among certain embodiments of the invention, software automation and good statistical design are used to reduce the human effort by a factor of as much as 40 while obtaining high quality “truth” for evaluating production data to a desired statistical accuracy. Once the “truth” of the production data is known, the production data can be scored using a variety of correctness criteria appropriate for the application, including categorical groupings of “hard match” (i.e., exact) comparisons and “soft match” (i.e., approximate) comparisons of related meanings.
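By way of illustration only, scoring production data against known “truth” under “hard match” and “soft match” criteria might be sketched as follows. The function names, the case-insensitive normalization, and the `groups` table that maps related values to a common category are assumptions made for this sketch; an actual application would supply its own correctness criteria.

```python
from typing import Dict, Iterable, Tuple


def hard_match(captured: str, truth: str) -> bool:
    """Exact comparison of the captured value against the truth."""
    return captured == truth


def soft_match(captured: str, truth: str, groups: Dict[str, str]) -> bool:
    """Approximate comparison: values match if they fall in the same category.

    Assumption: values are compared case-insensitively, and `groups` maps a
    normalized value to a category label; unlisted values stand for themselves.
    """
    def canon(value: str) -> str:
        key = value.strip().casefold()
        return groups.get(key, key)

    return canon(captured) == canon(truth)


def score(
    pairs: Iterable[Tuple[str, str]], groups: Dict[str, str]
) -> Dict[str, float]:
    """Return the hard-match and soft-match rates over (captured, truth) pairs."""
    pairs = list(pairs)
    total = len(pairs)
    if total == 0:
        return {"hard": 0.0, "soft": 0.0}
    hard = sum(hard_match(c, t) for c, t in pairs)
    soft = sum(soft_match(c, t, groups) for c, t in pairs)
    return {"hard": hard / total, "soft": soft / total}
```

Under these assumptions, a captured value of “St” would fail a hard match against a truth of “street” but would pass a soft match if both are grouped under the same category.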