I. Field of the Invention
The present invention relates generally to systems and methods for scoring constructed responses generated by one or more students in response to one or more prompts and, more particularly, to systems and methods that reduce the amount of hand-scoring needed to score short-answer constructed responses.
II. Discussion of the Background Art
Schools in the United States and other parts of the world have been administering standardized tests for many years. In practice, standardized tests often include some combination of multiple choice questions and questions requiring a written response, such as an essay or a constructed response. The term “constructed response,” as used herein, refers to a short text string containing a limited amount of highly specific impersonal information. The number of distinct correct responses is very limited, but there are many ways to construct a correct response. An essay differs from a constructed response in that it is a literary composition on a particular theme or subject, written in prose and generally analytic, speculative, or interpretative in nature. An essay typically consists of, and is influenced by, a student's own personal thoughts, feelings, ideas, preferences, and knowledge, and may be any one of an infinite number of highly variable “correct” responses.
Multiple choice questions are a convenient way to assess achievement or ability, in part because an answer is chosen from a finite set of pre-constructed responses and can be scored quickly and accurately using automated techniques. However, because students are presented with pre-constructed responses, it is possible for a student to guess the right answer without having the requisite level of achievement or ability. Constructed response questions require the student to compose an answer, so the correct answer cannot be guessed from a set of options. Constructed responses are usually graded by hand because of the difficulty in accounting for all the various ways in which a response may be constructed.
Hand scoring constructed responses is time-consuming and expensive. Graders use rubrics (rules or guidelines) and anchor papers (examples of papers for each possible score) to determine the grade to be given to a response. The process can take several minutes for each response. In addition, it is well known that agreement between scorers can vary depending on the test item, the rubric, and the scoring session. For this reason, some states pay to have two or more scorers read each paper to improve reliability, though this does not eliminate the possibility of assigning an incorrect score. Automated grading systems have been proposed to reduce the time and expense associated with scoring constructed responses and to ensure scoring consistency. To date, only systems that score essays (as opposed to short-answer constructed response items) have provided an acceptable degree of accuracy in comparison with hand scoring.