Many standardized tests require a test taker to respond to a constructed response question. Unlike a multiple choice question, a constructed response question may contain no response alternatives and may instead require the test taker to self-generate a response, as with an essay question. For example, high school students may take Advanced Placement (AP) examinations that, if passed, may permit the student to receive college credit. As another example, law school graduates may take one or more state bar examinations to become licensed attorneys in the corresponding states. Both the AP examinations and the bar examinations may include constructed response questions, such as essay questions. Constructed response questions may also require the test taker to provide a spoken response, such as during a speech examination, or a pictorial response, such as during an architectural examination.
Responses to these constructed response questions are typically graded by one or more human graders or evaluators. It is important that the grading of these responses be efficient and consistent. The effort required to grade the responses can be enormous, especially when each response is graded by multiple evaluators. Many testing programs using constructed response questions require that each response be graded by two different evaluators and that the two scores be compared to ensure that any difference between them falls within a predefined range. A response whose scores differ by more than the predefined range may be graded by a third evaluator to resolve the discrepancy. Alternatively, the two original evaluators may work together to resolve the discrepancy, or the scores may be averaged.
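The two-evaluator comparison described above can be illustrated with a minimal Python sketch. The function names, the `allowed_gap` parameter, and the numeric score scale are assumptions made for illustration only. The sketch shows just the range check and the averaging alternative; it does not model grading by a third evaluator or joint resolution by the original evaluators.

```python
def compare_scores(first: float, second: float, allowed_gap: float) -> str:
    """Classify a pair of evaluator scores for one response.

    allowed_gap is a hypothetical stand-in for the predefined range:
    the largest difference between the two scores that is tolerated.
    """
    if abs(first - second) <= allowed_gap:
        return "agree"
    return "discrepant"


def resolve_by_average(first: float, second: float) -> float:
    """One of the resolution options mentioned above: average the scores."""
    return (first + second) / 2


if __name__ == "__main__":
    # Scores of 4 and 5 with an allowed gap of 1 are within range.
    print(compare_scores(4, 5, allowed_gap=1))
    # Scores of 2 and 5 exceed the allowed gap and need resolution.
    print(compare_scores(2, 5, allowed_gap=1))
    print(resolve_by_average(2, 5))
```

Under these assumptions, a response flagged as "discrepant" would be routed to whichever resolution procedure the testing program uses.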
Computer-based adaptive testing methods select and deliver questions to test takers based on an ongoing dynamic estimate of a test taker's performance level derived from that test taker's previous responses. For example, a test taker may receive a next question based on the test taker's response to a previous question. If the test taker answers the previous question correctly, the computer may deliver a harder question to the test taker. Conversely, if the test taker answers the previous question incorrectly, the computer may deliver an easier question to the test taker. As a result, the computer can determine the proficiency of the test taker using fewer multiple choice questions than a standard multiple choice examination would require.
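The harder-after-correct, easier-after-incorrect rule described above can be sketched as a simple staircase in Python. This is an illustrative simplification rather than the method of any particular testing program: the five difficulty levels, the step size of one level, the starting level, and the function names are all assumptions, and operational adaptive tests typically rely on a statistical estimate of proficiency rather than a fixed step.

```python
def next_difficulty(current: int, was_correct: bool,
                    lowest: int = 1, highest: int = 5) -> int:
    """Move one level harder after a correct answer, one level easier
    after an incorrect answer, staying within [lowest, highest]."""
    step = 1 if was_correct else -1
    return max(lowest, min(highest, current + step))


def run_session(answers: list[bool], start: int = 3) -> list[int]:
    """Return the sequence of difficulty levels delivered, beginning
    with the starting level, for a given sequence of answer outcomes."""
    level = start
    path = [level]
    for was_correct in answers:
        level = next_difficulty(level, was_correct)
        path.append(level)
    return path


if __name__ == "__main__":
    # Two correct answers followed by one incorrect answer.
    print(run_session([True, True, False]))  # [3, 4, 5, 4]
```

In this sketch the level at which the sequence settles serves as a rough indication of proficiency, which is the intuition behind needing fewer questions than a fixed-form examination.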
It would be desirable to make the process of grading responses to constructed response questions more efficient without sacrificing the consistency of the scores. Using adaptive scoring to grade such responses may achieve this result.