The invention relates generally to systems and methods for assessment of constructed responses to test questions. More particularly, the invention relates to systems and methods for providing a highly scalable and customizable consolidated framework for the intake, processing, annotation, benchmarking and scoring of media-rich, candidate-produced constructed responses to assessment prompts or other stimuli.
Computer systems have been developed for the assessment of open-ended test responses such as essay responses. These open-ended responses are often referred to as constructed responses (CRs). CRs are not limited to handwritten or typewritten text but may also include graphics, videotape performances, audio responses, and/or other forms of responses in accordance with the type of testing being conducted. Systems are known for use in assisting human graders in scoring such CRs generated during the administration of examinations such as the SAT®, the LSAT®, the GMAT®, the National Board for Professional Teaching Standards® (NBPTS), the Test of English as a Foreign Language (TOEFL®), and the like. For example, the closest known such prior art system to the present invention is the system described in U.S. Pat. No. 5,991,595, assigned to the same assignee as the present invention. The contents of that patent are hereby incorporated herein by reference.
U.S. Pat. No. 5,991,595 describes an online scoring network (OSN) for scoring constructed responses and also provides methods for training, monitoring, and evaluating human raters' scoring of such constructed responses. The OSN system described therein is characterized in part by the use of workfolders that are used to transmit a number of CRs at one time to a reader or rater for evaluation and for receiving scores from the reader for the number of CRs at one time. A processing unit organizes a number of associated CRs into an electronic workfolder for distribution to raters located at a number of local or remote rater stations. The raters assess the CRs in the workfolder in any order and return the workfolder upon completion. Each rater may be assigned to various test sections based on the rater's qualification status, and the workfolders with the appropriate categories of CRs for that rater are distributed to that rater based on the rater's qualification status.
Conventional prior art systems typically store and utilize data associated with a candidate or the candidate's CR, such as the response itself, the prompt, topic, or question to which the candidate or test-taker responded, the training materials used for that topic, the scoring procedures for the response, the score data, and other information, based on a characterization of the state or status of that data. For certain kinds of assessments or assessment-related activities, particularly those involving complex content domain characterizations and media-rich candidate CRs, which require more flexible management and distribution of material, there exists a need to employ a different conceptualization of how this disparate information is stored, combined and utilized.
Prior art systems have also been designed to support an explicit categorization of constructed responses by their intended use (e.g., calibration, monitoring, training, production scoring, etc.). As a consequence, prior art systems have been designed such that the constructed responses so categorized must be physically moved from one database to another, or from one table structure to another, as their disposition or use changes (see, e.g., FIG. 2 of U.S. Pat. No. 5,991,595 and the accompanying textual description). While this confers some advantage in a workfolder-based system by allowing workfolders to contain constructed responses from one database at a time, in non-workfolder-based systems it can prevent, make difficult, or delay the smooth transition of scoring elements from one status/state to another. Prior art systems allow one to categorize a constructed response by its use, but this categorization does not capture the process flow or work flow associated with the constructed response and its relationship to other elements of the scoring system. A system is desired that integrates the constructed responses with these other elements of the scoring system and as such eliminates the need to categorize constructed responses in this manner, eliminates the need to physically separate data structures associated with differently categorized constructed responses, and eliminates the system overhead required to do so, without loss of the capability to distinguish the disposition of one constructed response from another.
In most prior art scoring systems, the test-taker-contributed material (the CR) is implicitly treated as the fundamental unit of work, the "thing-to-be-scored." In the system of U.S. Pat. No. 5,991,595, the CR is still the fundamental unit of work, even though those units are bundled into workfolders (collections of CRs) for distribution purposes. In conventional prior art systems there is no distinction made (nor mechanism to enable such a distinction) between the kind of CR something is, and the way that particular unit of work should be treated. Further, there is no distinction made (nor mechanism to enable such a distinction) between the CR as test-taker-contributed content and the CR as the carrier of state or status information: the status of a particular piece of test-taker-contributed material is inferred by the system from other information. A system is desired that can distinguish between the CR as test-taker-contributed content and the CR as the carrier of state or status information.
The present invention is designed to address these needs in the art.
The present invention meets the aforementioned and other needs in the art by providing a web-based Java Servlet Application/Applet system designed to support the evaluation of complex performance assessments of various types. The unified system dramatically reduces the number of touch points and handoffs between systems compared to prior art scoring systems and dramatically increases the administrator's ability to track candidates and their responses from test center appointment through benchmarking and scoring.
As stated above, prior art systems treat the disparate data elements associated with the scoring activity as separate and separable functional components, usually linked through traditional flat relational database structures. Because these linkages are codified in this manner, a significant level of flexibility is sacrificed, both in terms of the ease with which data elements can be combined and recombined based on changing business needs, and the ease with which new kinds of relationships can be established. The Consolidated Online Assessment System (COLA System) of the present invention overcomes these limitations through the creation, manipulation, and distribution of an object-oriented paradigm that represents the scoring and related activities as a unified and integrated family of loosely coupled objects. The most notable of these objects are: the Case (referred to herein as the "COLA Case"), which represents a state machine that replaces the "CR" (the test-taker-contributed content) as the unit of work; the Scoring Model and associated properties, which encapsulate the business rules associated with what actions are appropriate or required for a unit of work; the Responses to the unit of work, which represent the test-taker-contributed content (e.g., essays or other text-based responses, audio responses, digitized video responses, scanned images, diagrams, lesson plans, etc.) and tie that content to its creator; and the Distinct Scorable Unit (DSU), which represents a tree-based mechanism that connects and provides inheritability for the other primary system objects.
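The tree-based inheritability attributed to the DSU can be illustrated with a minimal Java sketch. This is an assumption-laden illustration rather than the actual COLA implementation: the class name `Dsu`, the method `resolve`, and the property keys `scoresRequired` and `scale` are hypothetical and do not appear in the source. The sketch assumes that a property not set on a DSU is inherited from its nearest ancestor that does set it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: class, method, and property names are illustrative only.
final class Dsu {
    private final String name;
    private final Dsu parent;                       // null for the root of the tree
    private final Map<String, String> properties = new HashMap<>();

    Dsu(String name, Dsu parent) {
        this.name = name;
        this.parent = parent;
    }

    String name() { return name; }

    void set(String key, String value) { properties.put(key, value); }

    // Look up a property locally first, then walk up the tree toward the root.
    String resolve(String key) {
        for (Dsu node = this; node != null; node = node.parent) {
            String value = node.properties.get(key);
            if (value != null) return value;
        }
        return null;                                // not defined anywhere on the path
    }
}

class DsuDemo {
    public static void main(String[] args) {
        Dsu assessment = new Dsu("assessment", null);
        Dsu exercise = new Dsu("exercise-1", assessment);
        assessment.set("scoresRequired", "2");
        assessment.set("scale", "1-4");
        exercise.set("scoresRequired", "1");        // override for this subtree only

        System.out.println(exercise.resolve("scoresRequired")); // prints 1
        System.out.println(exercise.resolve("scale"));          // prints 1-4
    }
}
```

Under these assumptions, a rule set once near the root applies to every DSU beneath it unless a lower node overrides it, which is one plausible reading of "inheritability" in a tree of scorable units.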
The present invention is designed to distinguish between the thing-to-be-scored as a unit of work and the content of the thing, between the unit of work and the rules for determining the disposition of that piece of work, and between the status or state of a piece of work and the content (or scores) associated with that work. Distinguishing these elements in the manner of the present invention means that the particular content of a particular piece of work no longer matters to the rest of the system. For example, the design of the present invention makes it unnecessary to specify that "this" content received "this" score. Instead, what matters is that a particular Case is in a "SCORED" state, that it represents "this" content, associated with "this" DSU, which in turn indicates that it was scored using "these" rules. By reconceptualizing the basic unit of work and by creating and connecting to this work the other entities described above, the end result is a system and methods that are extraordinarily flexible and scalable in their support for many and varied content or knowledge domains, many and varied models for scoring, evaluating, or manipulating units of work, and many and varied kinds of test-taker-contributed material.
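The separation described above, in which the unit of work carries state and merely refers to content and rules, might be sketched in Java as follows. All type and member names (`CaseState`, `Response`, `ColaCase`, `WorkQueue`, `needsScoring`) are hypothetical and chosen for illustration; the source names only the concepts and the "SCORED" state.

```java
// Hypothetical sketch: all type and field names are illustrative assumptions.
enum CaseState { UNSCORED, SCORED }

// Test-taker-contributed content, tied to its creator.
final class Response {
    final String candidateId;
    final String content;

    Response(String candidateId, String content) {
        this.candidateId = candidateId;
        this.content = content;
    }
}

// The unit of work: it carries state and points at content and rules, but is neither.
final class ColaCase {
    final Response response;     // "this" content
    final String dsuId;          // "this" DSU, which identifies the rules used
    private CaseState state = CaseState.UNSCORED;

    ColaCase(Response response, String dsuId) {
        this.response = response;
        this.dsuId = dsuId;
    }

    CaseState state() { return state; }

    void markScored() { state = CaseState.SCORED; }
}

final class WorkQueue {
    // Routing consults only the state; the content is never inspected.
    static boolean needsScoring(ColaCase c) {
        return c.state() == CaseState.UNSCORED;
    }
}

class CaseDemo {
    public static void main(String[] args) {
        ColaCase c = new ColaCase(new Response("cand-7", "essay text"), "exercise-1");
        System.out.println(WorkQueue.needsScoring(c)); // prints true
        c.markScored();
        System.out.println(WorkQueue.needsScoring(c)); // prints false
    }
}
```

In this sketch the routing logic would work unchanged whether the response is an essay, an audio file, or a scanned image, since it depends only on the Case state.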
Those skilled in the art will appreciate that the COLA System of the invention does not suffer from the same limitations of the prior art that were addressed by the OSN System of U.S. Pat. No. 5,991,595, that is, wasted rater time and the potential business need to revise scores. The COLA System backend is highly efficient, and the COLA System front-to-back-to-front communication protocol is lightweight, which overcomes stated limitations of non-workfolder-based prior art systems. The business need for revising scores is addressed in the present invention through a COLA Case state change and the application of scoring model properties appropriate to that state.
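One way to picture score revision as a state change is the small Java state machine below. It is a sketch under stated assumptions, not the patented design: the state names other than SCORED, the transition table, and the names `RevisableCase`, `recordScore`, and `requestRevision` are all hypothetical. In a fuller model the allowed transitions would presumably come from the scoring model properties for each state rather than from a fixed table.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: states, transitions, and names are assumptions for illustration.
final class RevisableCase {
    enum State { UNSCORED, SCORED, REVISION_PENDING }

    // Allowed transitions; a real scoring model would supply these per state.
    private static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.UNSCORED, EnumSet.of(State.SCORED));
        ALLOWED.put(State.SCORED, EnumSet.of(State.REVISION_PENDING));
        ALLOWED.put(State.REVISION_PENDING, EnumSet.of(State.SCORED));
    }

    private State state = State.UNSCORED;
    private Integer score;   // null until the case is first scored

    State state() { return state; }

    Integer score() { return score; }

    private void moveTo(State next) {
        if (!ALLOWED.get(state).contains(next)) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
    }

    // Records a score; valid only from UNSCORED or REVISION_PENDING.
    void recordScore(int value) {
        moveTo(State.SCORED);
        score = value;
    }

    // Reopens a scored case so that its score can be replaced.
    void requestRevision() {
        moveTo(State.REVISION_PENDING);
    }
}

class RevisionDemo {
    public static void main(String[] args) {
        RevisableCase c = new RevisableCase();
        c.recordScore(3);
        c.requestRevision();      // the revision is expressed as a state change
        c.recordScore(4);
        System.out.println(c.score()); // prints 4
        System.out.println(c.state()); // prints SCORED
    }
}
```

Here a score cannot be overwritten while the case is in the SCORED state; the case must first move to the hypothetical REVISION_PENDING state, so every revision leaves an explicit state transition behind it.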
The COLA System design further provides, among other features, integrated messaging; the capability for online assessor timesheets; improved management of handwritten candidate responses; automated identification and distribution of cases requiring more than one score; online benchmark case, training case, and recalibration case selection; web-based reporting on a variety of information important to the scoring process, including the pace of scoring and the status of every eligible candidate; and vastly improved system administration support. The development of new interfaces to connect the COLA System with a main repository for candidate responses, as well as new interfaces for data transfer between the test administrator organization and other organizations, also increases the overall reliability and utility of the COLA System.
Those skilled in the art will appreciate that the COLA framework is not limited to essay scoring, although that is the currently preferred embodiment. The framework of the COLA System provides a more general means to provide evaluative functions for users. The core functions in the COLA System can be redeployed, e.g., to provide for formative assessment, mentoring, or employee/teacher/student performance evaluation.