A standardized test consists of questions or tasks that are given to students for completion under similar testing conditions, thus enabling comparisons among the students' performance. The term “standardized test” is used here expansively to denote assessments of various sorts.
Standardized tests are employed in a wide variety of ways in our society. For example, standardized test results play an important role in certain employers' hiring and promotion decisions, certain government agencies' determinations of whether to license professionals, and certain educational institutions' admissions decisions.
In addition, standardized tests are increasingly used within K-12 education as a means of assessing students' progress in various disciplines such as math, reading, social studies, and language arts. At least 48 states now assess students' reading and mathematics skills at the elementary, middle, and high school levels. Increasingly, the results of end-of-year tests are seen as an important way to measure educational progress at the state and local level, and the consequences of these tests are growing; for example, in some places, students may be held back from advancing to the next grade based on their standardized test results.
A particular standardized test (a “Test”) is designed to measure the performance of a test-taker in a given field or domain (a “Subject”). Subjects could include (a) an academic discipline (e.g., college mathematics); (b) a professional field (e.g., tax accounting); or (c) a practical endeavor (e.g., driving). Individual test-takers are known as “Students”, and aggregations of Students are known as “Groups”. Groups can exist at different levels of hierarchy, such as the class or school level, or can be based on non-hierarchical relationships such as shared ethnicity, etc.
A Test is designed to measure Students' abilities to carry out certain tasks in that Subject and/or Students' knowledge about that Subject (“Skills”, sometimes known in the literature as “Attributes” or “Rules”). A Test is designed for test-takers within a given ability range or at a certain point within a course of study, meaning that the Test has a certain “Level” associated with it.
All of the information that can be stated about a Student or Group, based on its performance on a Test, is known as the “Test Results”. Test Results is a broad concept, which can encompass both numerical and evaluative statements, either about Students' overall performance or about their performance in a specific Skill. Some examples of Test Results are as follows: a list of Students' total scores on a monthly diagnostic test; a chart tracking the average performance of girls in a school from one year to the next on the yearly math exam; or statements of “Needs Improvement”, “Good Work”, or “Review Vectors” associated with Students' performance in the various Skills assessed on a science exam.
Users are the individuals who use the Test Results for a given purpose. Some typical Users include Educators, Parents, and Students. For example, Educators (as defined below) may use Test Results to guide instruction of Students, or to evaluate the overall progress of a class or school. Likewise, Parents may use Test Results by following up with their child's teacher to make sure the child receives additional instruction in a Skill.
Here, Educator is an extremely general term and can refer to any individual associated with the training or instructing of Students. For example, in the K-12 context, an Educator can include without limitation teachers, tutors, reading specialists, remediation specialists, or administrators of various kinds (such as a school principal, superintendent, or state education official). In other contexts, an Educator could be a job trainer, flight trainer, or professor, for example.
An organization that wants to process and display the results of a Test to a certain group of Users is known as the “Client”. For example, a Client may be a local school district that wishes to process and display the students' results on a statewide standardized exam. In other cases, the Client could be the test publishing organization, which wants a way to effectively process and display the Test Results. Here, more generally, the term Client is used to refer to the organization or organizations that may provide inputs into the system, such as lists of Students and lists of Students' responses to tasks on the Test.
It is important to note that standardized tests have different reference methods. For example, some standardized tests are “Norm-referenced”, meaning that an individual student's performance is implicitly compared against the performance of other students. Other standardized tests are “Criterion-referenced”, meaning that students' performance is implicitly compared against performance standards in that Subject as established by pedagogical experts in that Subject. Criterion-referencing is common in a number of contexts, including licensure and certification exams within the professions, K-12 accountability measures, college entrance exams, and elsewhere. According to Gandal in The State of States: A Progress Report (1999), increasing numbers of states are moving away from norm-referenced tests that compare students to national averages and toward criterion-referenced exams that measure students' ability to master standards-based material.
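The distinction between the two reference methods can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the function names, the reference scores, the cut scores, and the level labels (including "Below Basic") are all hypothetical and do not come from any particular Test.

```python
from bisect import bisect_left


def percentile_rank(score, reference_scores):
    """Norm-referenced view: percentage of reference scores strictly below `score`."""
    ordered = sorted(reference_scores)
    return 100.0 * bisect_left(ordered, score) / len(ordered)


def performance_level(score, cut_scores, lowest_label="Below Basic"):
    """Criterion-referenced view: highest level whose minimum score is met.

    `cut_scores` is a list of (label, minimum_score) pairs in ascending order.
    """
    level = lowest_label
    for label, minimum in cut_scores:
        if score >= minimum:
            level = label
    return level


# Hypothetical data: scores of a comparison group and expert-set cut scores.
reference = [12, 15, 18, 20, 22, 25, 27, 30]
cuts = [("Basic", 15), ("Proficient", 22), ("Advanced", 28)]

norm_result = percentile_rank(22, reference)    # compares against other students
criterion_result = performance_level(22, cuts)  # compares against fixed standards
```

The same raw score of 22 yields two different kinds of statement: a position relative to other Students in the first case, and a level relative to fixed standards in the second.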
Standardized tests may be given only at the end of a course of study (such as at the end of a grade in school), or they may be given at various times throughout the year to measure students' progress. Their purposes may be evaluative or diagnostic or some combination of both.
Also, standardized tests may be structured in various ways. For example, a standardized test consists of one or more questions (known here as “Items”). Items may be of one or more types; for example, two common types of Items are multiple choice Items (which require a student to choose the best response among various possible answers) and constructed response Items (which require a student to compose the student's own answer). Other types of Items could include tasks of other natures, in other forms, delivered by other media.
Finally, standardized tests may be administered and scored in diverse ways. For example, they can be administered in various media, such as on paper or through a networked computer. Scoring of Items can be performed manually, electronically, or in some combination thereof. Items are scored with respect to a “Scoring Guide” for that Test, which may include an answer key for scoring multiple choice questions and/or rubric guides for scoring essays and other types of open-ended questions.
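As a rough sketch of how a Scoring Guide might be applied, the following assumes that multiple choice Items are scored electronically against an answer key (one point each, an assumption) and that constructed response Items arrive with rubric points already assigned by a human scorer. The function name, Item identifiers, and point values are hypothetical.

```python
def score_test(responses, answer_key, rubric_points=None):
    """Return a dict mapping each Item to the points earned.

    responses:     Item id -> the Student's chosen answer (multiple choice only)
    answer_key:    Item id -> correct answer (multiple choice only)
    rubric_points: Item id -> points a human scorer assigned using a rubric
    """
    item_scores = {}
    # Multiple choice Items: one point for a match, zero otherwise
    # (an omitted response counts as incorrect).
    for item_id, correct in answer_key.items():
        item_scores[item_id] = 1 if responses.get(item_id) == correct else 0
    # Constructed response Items: take the manually assigned rubric points.
    for item_id, points in (rubric_points or {}).items():
        item_scores[item_id] = points
    return item_scores


# Hypothetical Student responses and Scoring Guide.
answer_key = {"Q1": "B", "Q2": "A", "Q3": "C"}
responses = {"Q1": "B", "Q2": "D"}   # Q3 left blank
rubric_points = {"Q4": 3}            # essay scored by hand

scores = score_test(responses, answer_key, rubric_points)
total = sum(scores.values())
```

Here the electronic and manual scoring paths are combined into one set of Item scores, matching the "some combination thereof" case described above.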
Innovations in standardized testing continue to reshape the field, particularly within the field of psychometrics, the science of interpreting test results by means of statistical and cognitive models. Now some types of standardized tests involve a testing process in which different Students are given different Items and complex scoring methodologies are used to generate aggregate (and in some cases, Skill-specific) Scores that are comparable across Students.
Regardless of their subject, level, reference method, design, structure, administration, or scoring method, all standardized tests share a common feature: Test Results are received by individuals who want to utilize the results in certain ways. In some cases, the Users are only interested in aggregate information that specifies how well students performed overall on a test. For example, a school administrator may want to review the school's mean student performance on a test in a given subject from year to year, as one method of evaluating the school's progress in that subject over time.
Often, however, Users want significant diagnostic information that goes beyond students' overall test performance. For example, an Educator may want to know how well a given student performed on a particular Skill examined on a given Test. The Educator also may be interested in how well certain groups of students performed on particular Skills. Furthermore, the Educator may want to understand what instructional strategy is most appropriate for individuals and groups, based on results from that test. Other recipients of Test Results (such as Students or Parents) often desire similar information.
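The kind of Skill-level diagnostic information described here can be sketched as follows. This is a deliberately simple illustration, not the method of any existing system: it assumes each Item is associated with exactly one Skill and reports percent of points earned per Skill. The function names, Skill names, and data are hypothetical.

```python
from collections import defaultdict


def skill_percent_correct(item_scores, item_skills, max_points):
    """Percent of available points a Student earned in each Skill."""
    earned = defaultdict(int)
    possible = defaultdict(int)
    for item_id, skill in item_skills.items():
        earned[skill] += item_scores.get(item_id, 0)
        possible[skill] += max_points[item_id]
    return {skill: 100.0 * earned[skill] / possible[skill] for skill in possible}


def group_skill_average(per_student):
    """Mean Skill percentage across the Students in a Group."""
    collected = defaultdict(list)
    for skills in per_student.values():
        for skill, pct in skills.items():
            collected[skill].append(pct)
    return {skill: sum(v) / len(v) for skill, v in collected.items()}


# Hypothetical Skill-Item associations and Item scores for two Students.
item_skills = {"Q1": "Fractions", "Q2": "Fractions", "Q3": "Geometry"}
max_points = {"Q1": 1, "Q2": 1, "Q3": 1}
student_scores = {
    "Student A": {"Q1": 1, "Q2": 0, "Q3": 1},
    "Student B": {"Q1": 1, "Q2": 1, "Q3": 0},
}

per_student = {
    name: skill_percent_correct(scores, item_skills, max_points)
    for name, scores in student_scores.items()
}
group_result = group_skill_average(per_student)
```

An Educator could read `per_student` to see how each Student performed in a given Skill and `group_result` to see how the class performed in that Skill.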
In order for Test Results to be useful to Users, Test Result information should be processed and displayed in a manner that permits Users to understand the results, navigate between different displays of the results, and take action based on the results. Current methods of processing and displaying Test Result information have various flaws. For example, the Skill categories in which the results are displayed are not useful, and the methods used to generate Skill-Item associations are crude. Likewise, the conclusions that are reached about individual Students and Groups, based on the Test Results, are often difficult for Users to understand and are based on suboptimal methods of generating statistical conclusions. Finally, the display of the Test Results itself leaves much to be desired, as current methods (such as U.S. Pat. Nos. 5,934,909 and 6,270,351) fail to enable Users to see Test Results and related instructional materials in a way that facilitates action.
These problems with the existing methods of processing and displaying Test Results are endemic across all forms of standardized testing, including such diverse fields as corporate training and higher education. Indeed, in the K-12 context, various experts have sharply critiqued current systems and methods for processing and displaying Test Result information. For example, the National Educational Goals Panel (1998) has concluded that printed reports given to Parents about the Test Results in K-12 “are not very clear”. As a result, the nonprofit organization Public Agenda has concluded in its Reality Check (1998) that Parents “appear to lack a solid grasp of their schools' academic goals,” as well as the “information essential to properly evaluate how well their children and schools are doing.”
Similarly, even though many Educators in grades K-12 are told to use data to inform their instructional practice, they are not positioned to do so because current systems and methods do not render the information meaningful or comprehensible. As researchers at the UCLA Center for Research on Evaluation, Standards, and Student Testing have concluded, “The practice of applying large-scale data to classroom practice is virtually nonexistent” (2001).
The system and method described in this invention address deficiencies in the current methods for the processing and displaying of Test Results, with application to all forms of standardized testing.