1. Field of Invention
The present invention relates in general to the field of education and more specifically to systems and methods for conducting test assessment.
2. Description of the Background Art
Mechanisms for creating and evaluating testing materials have remained largely unchanged for many decades. In conducting standardized testing, for example, test preparation experts of a test producer create a test as a series of questions or test “items” and corresponding selectable test item responses. The items and an item response grid are then typeset and bound in a test booklet and, close in time to actually conducting the testing, the booklets are delivered to schools or other test sites.
The test booklets are then distributed by officiators at each test site to test proctors. Within each classroom or other test site location, a proctor distributes the test booklets, instructs the test subjects (e.g., students) regarding testing protocol, and initiates, times and proctors the testing. It is further presumed that the students read the items, select responses to the test items and fill in the corresponding test response grid. The completed tests are then collected in each classroom, combined with those of other classrooms at the test site, combined with completed tests from other test sites, and then delivered to an assessment service provider, often the supplier of the test booklets. (Scratch paper is also often distributed for student calculations or other response determination; the scratch paper is collected, accounted for and discarded.)
Test assessment experts then individually grade each grid of each student. Because a grid of just a few detachable pages is used, the grading may be conducted by hand or by machine. The grids may, for example, be detached and scanned, and a grid reader may identify, for each row (response item), whether the correct grid column element is blackened. If so, then credit is given for a correct answer. If not, then no credit is given, or credit may be deducted for an incorrect grid column element that is blackened.
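By way of illustration only, the conventional grid grading just described may be sketched as follows. The function name `score_grid`, the representation of each row as a set of blackened column labels, the one-point credit and the `penalty` parameter are assumptions made for this sketch and do not describe any particular grid reader. The sketch also assumes, following the description literally, that a row earns credit whenever the correct column is blackened.

```python
from typing import Sequence, Set


def score_grid(
    marks: Sequence[Set[str]],
    key: Sequence[str],
    penalty: float = 0.0,
) -> float:
    """Score one response grid against an answer key.

    marks   -- for each row (response item), the set of blackened columns
    key     -- for each row, the label of the correct column
    penalty -- hypothetical deduction for a row in which only incorrect
               columns are blackened (0.0 means no credit is deducted)
    """
    if len(marks) != len(key):
        raise ValueError("grid and answer key must have the same number of rows")

    score = 0.0
    for blackened, correct in zip(marks, key):
        if correct in blackened:
            # Correct column is blackened: credit is given.
            score += 1.0
        elif blackened:
            # An incorrect column is blackened: credit may be deducted.
            score -= penalty
        # A blank row earns no credit and incurs no deduction.
    return score
```

For example, `score_grid([{"A"}, {"B"}, set(), {"D"}], ["A", "C", "B", "D"], penalty=0.25)` credits the first and fourth rows, deducts for the second and ignores the blank third row. A score of this kind is all that such grading yields, which is the limitation discussed below.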
Unfortunately, the above mechanism is extremely time-consuming and laborious. For example, each test item must be manually created in a manner that is likely to be understood, is sufficient for suitably testing a particular aspect of student knowledge, and can be readily ordered and answered to that end. To make matters more difficult, more than one set of items may be needed to better assure fair and uncompromised testing. The collection and grading process may also take months to complete.
The above mechanism is also limited in scope and may prove to be an inaccurate assessor of student knowledge. For example, since testing other than selected answer (grid based) testing must be assessed entirely by hand, testing is typically limited to selected answer type testing. The accuracy of such testing may further be compromised by inherent ethnic or national bias, by responses that may be difficult to identify, interpret or distinguish, or by items that otherwise fail to assess an intended knowledge (or teaching) aspect. Hand grading is also subject to human perception inaccuracies or inconsistencies. Conventional assessment is also limited to determining a score and fails to consider identifying actual student (or instructor or institutional) knowledge, understanding or skill factors that may be used for bestowing partial credit or providing further learning, further evaluation or other potential benefits.
The present inventor has also identified particular problems relating to constrained constructed response item creation and assessment, such as with graphic, markup, multimedia or other responses, as well as other aspects of assessment and/or learning support. For example, the potential for subject error due to item misinterpretation, miscalculation, infirmity influences, mis-execution and/or human error makes it extremely difficult to manually assess such responses in an accurate or consistent manner within a standard or otherwise acceptable margin of error. Current mechanisms also render infeasible the prospect of assessing learning information other than raw score, among still further difficulties.
Accordingly, there is a need for automated constrained constructed response item assessment systems and methods that enable one or more of the above and/or other problems of conventional mechanisms to be avoided.