1. Field of the Invention
The present invention relates to computer adaptive tests. More specifically, the present invention relates to a computer adaptive test that defers application of its ability algorithm for a certain number of questions or items to thereby reduce latency between questions while maintaining statistically accurate test results.
2. Background Information
Traditional methodologies for testing involve providing test-takers with a fixed set of common questions. The test-takers are graded on the test, and ranked relative to each other, based on the accuracy of each individual's responses to the fixed set of common questions. A fixed test thus presents the same level of difficulty for each test-taker, regardless of the test-taker's individual level of ability. A drawback of such fixed tests is that they tend to provide superior precision for test-takers of medium ability, but less precision for test-takers with extremely high or low ability.
Adaptive tests are based on the principle that more precise test scores can be obtained if the questions are tailored to the ability level of the individual test-taker. This approach stems from the belief that test results are not meaningful if test questions are too difficult or too easy for the particular test-taker. In contrast, more can be understood of a test-taker's true ability level if the questions are more consistent with that ability level.
A computer adaptive test (“CAT”) is a computer implementation of an adaptive testing methodology. Rather than a fixed set of questions that can be posed to a test-taker, a CAT has a pool of available questions at different skill levels from which to iteratively select a question. Typically, the system does not know the particular ability level of the test-taker, and thus selects an initial question (sometimes referred to in the art as an “item”) from a pool of intermediate ability level questions.
The CAT will then grade the test-taker's answer to the question in substantially real time. If the test-taker performs well (either with an accurate absolute answer or with due consideration for partial credit) on the intermediate level question, then the CAT system will consider the test-taker's ability to be superior to its previous estimate and select a new question that is consistent with the perceived higher ability level. Conversely, if the test-taker performs poorly on the intermediate question, the CAT system will consider the test-taker's ability to be inferior to its previous estimate and select a new question that is consistent with the perceived lower ability level. This process continues iteratively until the test is concluded according to some pre-defined criteria.
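By way of illustration only, the iterative process described above can be sketched in Python. The sketch assumes a hypothetical fixed-step update rule (the step is halved after each answer) and a nearest-difficulty selection rule. Neither rule is drawn from any particular CAT, and the names Item, select_item, update_ability, and run_adaptive_test are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    difficulty: float  # hypothetical difficulty scale; 0.0 is "intermediate"


def select_item(pool, ability, used):
    """Pick the unused item whose difficulty is closest to the current estimate."""
    candidates = [it for it in pool if it.item_id not in used]
    return min(candidates, key=lambda it: abs(it.difficulty - ability))


def update_ability(ability, correct, step):
    """Assumed update rule: raise the estimate on a correct answer, lower it otherwise."""
    return ability + step if correct else ability - step


def run_adaptive_test(pool, answer_fn, max_items=5, start_ability=0.0, step=1.0):
    """Administer items one at a time until max_items (the assumed stopping criterion)."""
    ability = start_ability
    used = set()
    log = []
    for _ in range(min(max_items, len(pool))):
        item = select_item(pool, ability, used)
        used.add(item.item_id)
        correct = answer_fn(item)          # grade the answer
        ability = update_ability(ability, correct, step)
        step /= 2                          # smaller adjustments as evidence accumulates
        log.append((item.item_id, correct))
    return ability, log
```

In this sketch, a test-taker who answers every item correctly is moved from the intermediate item to progressively harder items, and one who answers every item incorrectly is moved to progressively easier items.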
A drawback of CAT is the manner in which the tests must be administered by the system. For security purposes, the questions cannot be stored locally at the computer terminal at which the test-taker takes the test (“testing terminal”). Rather, the questions are stored on testing servers at some secure remote location and forwarded to the testing terminal as needed over a network such as the Internet. Similarly, the algorithm that updates the test-taker's ability level and selects appropriate questions resides at the secure remote location.
This distance between the remote location and the testing terminal generates a delay based on the following steps that must occur after a test-taker answers a question before the next question can be presented to the test-taker:
    The testing terminal transmits the answer to the current question to the remote location;
    The system at the remote location evaluates the answer for accuracy;
    Based on the answer, the system updates the test-taker's ability level pursuant to an algorithm;
    A new question is selected based on the updated ability level;
    The remote location sends the new question to the testing terminal; and
    The testing terminal displays the new question.
Based on system traffic and network capabilities, these steps can result in a delay of several seconds between answers and subsequent questions that can distract a test-taker during a period when the test-taker needs to maintain concentration. The delay can be even longer if the questions include any substantial graphics, audio, and/or animation that require additional time to transmit and execute.
This resultant system latency is of sufficient concern that various techniques have been created to address it. One such attempt to address this drawback has been the use of decision trees to download potential future questions. Specifically, once a current question is provided during a test for the test-taker to answer, there are a finite number of possible outcomes or scores responsive to that current question. For each such possible outcome, the CAT can determine in advance what the next question would be. By way of example, if the question has only two outcomes—a correct or an incorrect answer—the CAT would determine in advance two potential next questions, one for each possible outcome. The remote location sends both possible questions to the testing terminal. Once the test-taker answers the current question, the testing terminal (either alone or in cooperation with the remote location) can determine which of the two “next” questions is proper. The testing terminal will post the selected question on the display, while the other question is effectively discarded.
Thus, for example, when an intermediate question is pending with a correct answer and an incorrect answer, the CAT already has selected and downloaded an “easier” question as the next question if the test-taker gets the answer wrong, and a “harder” question as the next question if the test-taker gets the answer right. Only one of the two will be selected based upon the test-taker's answer to the current question. This “look ahead” methodology can extend several questions down in the decision tree, thus allowing the pre-loading of several sequences of questions. The benefit of such a system is that since the “next” question is already resident on the testing terminal, the next question can be displayed without any significant latency difficulties (although there may still be delay as the testing terminal cooperates with the remote location to determine which of the possible questions should be used).
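A single level of this “look ahead” can be sketched in Python as follows, for illustration only. The sketch assumes dichotomous items, a hypothetical fixed-step projection of the ability estimate, and a nearest-difficulty selection rule. The function names and the dictionary layout of an item are assumptions and are not taken from any existing system.

```python
def plan_next_items(pool, ability, step, used):
    """Remote location: pre-select one candidate next item per possible outcome."""
    plan = {}
    candidates = [it for it in pool if it["id"] not in used]
    for correct in (False, True):
        # Project the estimate the test-taker would have after this outcome.
        projected = ability + step if correct else ability - step
        plan[correct] = min(candidates, key=lambda it: abs(it["difficulty"] - projected))
    return plan  # both items are downloaded to the testing terminal


def next_item_on_terminal(plan, correct):
    """Testing terminal: display the item matching the outcome, discard the other."""
    chosen = plan[correct]
    discarded = plan[not correct]
    return chosen, discarded
```

Because both candidates are already resident on the testing terminal, the choice between them requires no further download; the item returned as discarded is the one that was transmitted but never displayed.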
A drawback of the above approach is that the CAT ends up devoting resources and bandwidth to download questions that never end up being used. This wasted bandwidth and resource consumption can become considerable as the CAT downloads questions from further down the decision tree; two items of look ahead (for dichotomous-only items) would require six potential items to be selected and downloaded (one for each of the two possible scores of the current item, and one for each of the two possible scores of each of the two potential next items). The problem multiplies based on the number of test-takers who are simultaneously taking the test on the same network (e.g., all of the students at a school taking a particular standardized test). The decision-tree technique quickly degrades in efficacy as it exacerbates rather than abates the problems of network latency, given that many times more items will be downloaded than will be used.
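The growth in downloaded items can be checked with a short calculation. The following Python sketch assumes that every item has the same number of possible scores and that each level of the decision tree is downloaded in full. The function names are illustrative assumptions.

```python
def prefetched_items(look_ahead, outcomes_per_item=2):
    """Items downloaded for a given look-ahead depth: m + m^2 + ... + m^d."""
    return sum(outcomes_per_item ** level for level in range(1, look_ahead + 1))


def wasted_items(look_ahead, outcomes_per_item=2):
    """Downloaded items that are never displayed (only one path is followed)."""
    return prefetched_items(look_ahead, outcomes_per_item) - look_ahead


print(prefetched_items(2))  # 6, matching the two-item look ahead described above
print(prefetched_items(3))  # 14
print(wasted_items(2))      # 4
```

Under these assumptions, two items of look ahead download six items of which only two are displayed, and a hypothetical group of 30 simultaneous test-takers would download 180 items to display 60.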
The above methodology also presents security concerns. The correct answer must be transmitted across the network to the testing terminal to finalize the selection of the next question. In addition, questions that are not being used at a particular testing terminal (but which might be used at another) are exposed unnecessarily.
Another attempt to overcome these latency concerns is to bring “clones” of the testing servers to individual testing centers, such as an individual school or school district. These cloned servers contain the testing content and protocols and are physically placed at or near the premises of the target test-taker population, generally within the same internal network as the target population. The cloned server may also use the “look ahead” technique discussed above. The physical proximity greatly decreases network latency from server to testing terminal, improving response time for the test-taker and reducing the potential for disruptions in concentration (subject to the capabilities of the local area network separating the testing terminals and the cloned test server). However, the cloned server becomes a security risk, and the costs for transporting, installing, and maintaining the cloned server are considerable.