The present invention relates generally to a system and method for internet-based cognitive performance testing.
Systems and methods for computer based testing (CBT) are known to the art. For example, U.S. Pat. No. 5,827,070 to Kershaw et al. discloses CBT means for administration of standardized tests, e.g. SATs, LSATs, GMATs, etc. The system and method of Kershaw et al. does not depend on the speed and accuracy of the individual examinee's keystroke responses to the test stimuli. Lewis et al. (U.S. Pat. No. 5,059,127) disclose a computerized mastery testing system providing for the computerized implementation of sequential testing. This disclosure also does not relate to the speed and accuracy of the individual examinee's keystroke responses to the test stimuli. Swanson et al. (U.S. Pat. No. 5,657,256) disclose a method and apparatus for administration of computerized adaptive tests. Swanson is similarly unconcerned with examinee response time.
When people measure their response time, recall or other cognitive skills using computer-based test systems, they typically press number or letter keys (or keys representing other symbols like circles and squares) in response to visual or auditory or other sensory signals presented to them. The average time they take to press the correct keys is their response time.
This type of measurement is subject to a number of errors that make response time results relatively imprecise. The effects of recent foods and beverages, medicines, amount of sleep and other factors that affect alertness or drowsiness all influence response speed and accuracy, so that measurements on any single day may not represent actual average performance level.
A key source of measurement error is change in motivation to respond quickly. One day a person may try quite hard to reduce their response time. The next day they may relax and perform more slowly simply because they care less about their "score" that day. Typically the error rate increases (incorrect responses are made more frequently) when people try harder to react quickly. Investigators commonly measure error rate to determine the "response speed/accuracy tradeoff" for each person or group of people.
While the response speed/accuracy tradeoff is usually discussed in connection with relatively simple responses, a similar tradeoff can occur during memory measurements when speed is only a secondary consideration. Response speed is intrinsically linked with recall accuracy because transient memory traces fade if the response (e.g. typing a list of words) is not completed rapidly.
Response speed may also vary from second to second and minute to minute as a result of boredom with the test, short-term fatigue from repeated motion, eye strain from staring at the computer screen, and stimulus patterns that confuse the user and cause response errors. Different types of transitions such as shifts between responses involving one hand and the other, or one finger and the corresponding finger on the other hand, can also affect response speed and accuracy for individual responses.
All of these factors together make precise performance measurement all but impossible. Even under controlled laboratory conditions, the correlation between test scores on one occasion and scores by the same individuals at a later time averages only 0.63 (Salthouse and Babcock, 1991; Lowe and Rabbitt, 1998; Versavel et al., 1997; Wetherell, 1996). In other words, performance results can vary by plus or minus 20% from one day or week to the next.
The correlation between test results on separate days, called "test-retest reliability," is perhaps the most widely used indicator of measurement reliability. The average value of 0.63 has not changed appreciably during the last two decades, indicating that attempts to improve measurement reliability have generally met with little success.
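As an illustration of the correlation described above, the following is a minimal sketch of a Pearson test-retest computation. The function name `pearson_r` and the sample response times are hypothetical and are not taken from any cited study; the sketch only shows the standard formula applied to paired scores from two sessions.

```python
from math import sqrt

def pearson_r(first, second):
    """Pearson correlation between paired scores from two test sessions."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length score lists with at least 2 entries")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squared deviations.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary within each session")
    return cov / sqrt(var_a * var_b)

# Hypothetical mean response times (ms) for five examinees on two occasions.
session_1 = [412.0, 530.0, 467.0, 389.0, 601.0]
session_2 = [420.0, 515.0, 480.0, 395.0, 590.0]
reliability = pearson_r(session_1, session_2)
```

A value near 1 indicates that examinees keep nearly the same relative standing from one session to the next; a value near 0.63 indicates substantial session-to-session variation.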
Perhaps the best way to describe the need for measurement precision, and the need for this invention, is to discuss the circumstances of an individual who participated in a recent study to determine whether blueberries can reduce multiple sclerosis symptoms (Pappas et al., 2001).
SF is one of hundreds of thousands of people in the U.S. who have chronic, neurodegenerative diseases for which there is no cure. He cannot drive and cannot find work because his coordination and memory are affected. He must sell the home he, his wife and children live in because they need the money for his medical and dental expenses. His relatively expensive medicines give no apparent benefit. The medicines do, however, dry his mouth and cause his teeth to crack, causing him to lose three teeth during the last several months. Concerned about his dental bills, SF asked his dentist to remove his remaining teeth so he would not have to pay to have them repaired when he would lose them anyway. (His dentist refused.) His physician advised him to take a recommended performance test battery just once a year because he cannot afford the cost of more frequent evaluations. He must therefore wait for very long periods of time before obtaining objective evidence that his medications are or are not helping him, time he can ill afford since his disease is growing steadily worse. And of course after such long waiting periods, any performance benefits provided by his medications may be cancelled by the steady decline from his chronic illness.
SF can expect to decline at a rate which reduces his performance scores by roughly 4% to 10% each year. If his medicines are effective, his annual decline may be decreased by half a percentage point or perhaps several percentage points; however, he most probably cannot measure this benefit because once-a-year testing is not accurate enough to measure changes smaller than 5%. Once-a-year testing will always be incapable of measuring changes of 5% or less simply because he may perform 5% or 10% better or worse than his average on the day when measurement is performed.
So the test results for which SF must wait so long and pay so much are largely worthless to him and his physician, since they will not be precise enough to indicate whether his medicines helped him.
SF clearly needs, and many thousands of other people in similar circumstances need, a test system that is accurate to within 1% or 2% so that effective treatments can be identified. He also needs a measurement system that is far less expensive than that recommended by his physician, so that he can obtain results many times each year. And he needs a test that can be taken at home, so that he is spared the effort and/or the cost of transportation to a test center.
For these and many other reasons, there is a clear need for increased measurement precision.
One strategy used by scientists seeking greater precision is to reduce response time variability by discarding high and low responses within each test or test series. For example, the slowest half and the fastest quarter of response times may be discarded from each 30 seconds of testing, and the average of the remaining data obtained.
This type of data trimming certainly reduces variability, but it also reduces the amount of usable data and therefore reduces measurement precision, which is related to the amount of data. (As a general rule, precision is directly proportional to the square root of the number of data points, if approximately random variation is the cause of imprecision.)
Discarding high response times also prevents or sharply reduces the accuracy with which benefits or harm from different health strategies can be measured if performance changes occur primarily within the response times that are discarded. This occurred recently during the Danbury MS Blueberry Study (Pappas et al., 2001). Very slow response times were markedly reduced after blueberry consumption for many study participants; however, this was not evident from trimmed data sets, from which all slow responses had been removed. Only when raw data was examined did the principal investigator see this benefit.
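The cost of trimming can be sketched numerically. The following is a minimal illustration, assuming the example trimming fractions given above (fastest quarter and slowest half discarded) and the square-root rule of thumb; the function names and the sample data are hypothetical.

```python
from math import sqrt

def trim_responses(times, drop_fastest=0.25, drop_slowest=0.5):
    """Discard the fastest and slowest fractions of a set of response times."""
    ordered = sorted(times)
    n = len(ordered)
    first_kept = int(n * drop_fastest)       # index of first retained value
    last_kept = n - int(n * drop_slowest)    # one past the last retained value
    return ordered[first_kept:last_kept]

def relative_precision(kept, total):
    """Precision relative to the untrimmed set under the sqrt(n) rule of thumb."""
    return sqrt(kept / total)

# 40 hypothetical response times (ms) from one 30-second test.
raw = [float(t) for t in range(300, 340)]
trimmed = trim_responses(raw)
factor = relative_precision(len(trimmed), len(raw))
```

With these fractions only one quarter of the responses is retained, so under the stated rule of thumb the precision attributable to sample size alone is roughly halved. The sketch does not model the accompanying reduction in variability.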
Scientists have also attempted to reduce measurement error by reducing practice effects that occur when examinees take the same or similar tests repeatedly. Gradual improvement due to practice is different for each individual and even for each type of response for each individual. Such gradual improvement can mask benefits of medication or other health strategies, or can mask harm due to exposure to pollutants, fatigue, etc.
To reduce practice effects, investigators have asked examinees to take tests many dozens of times, so that the learning period can be passed and further improvement due to practice will not occur. This practice-until-no-more-improvement-occurs strategy was not generally successful since improvement typically occurs over hundreds or even thousands of responses. This strategy is of course impractical for people like SF when the expense and effort of travel and testing are high. There is a clear need for test methods that reduce or eliminate practice effects.
Measurement precision and test-retest reliability have for the most part been ignored by inventors interested in reaction time and memory measurement. Only two previously patented performance measurement methods related to "reaction time" have explicitly addressed the issue of test-retest reliability and measurement precision. None have evidently attempted to determine the precision with which response time measurements are made.
Wurtman (1984) obtained a test-retest reliability of 0.65-0.74 when evaluating an amino acid mixture for improving vigor and mood in normal human patients; however, the method used to obtain this test-retest reliability was not the subject of his patent.
Using an electroencephalogram-based, computer-aided training method and 4 examinees, Gevins et al. (1998) obtained an average "test set classification" of 95% (range 92%-99%) calculated by a trained pattern-recognition network. Their "test-retest reliability" computation algorithm apparently had little to do with the (Pearson) correlation coefficient commonly used to determine test-retest reliability values. Their use of the phrase "test-retest reliability" illustrates the difficulty that can arise when a term used to define measurement precision is given different meanings by different investigators.
Rimland (1988; U.S. Pat. No. 4,755,140) describes a hand-held reaction time test but does not determine either test-retest reliability or the precision with which reaction time is measured. His device employs no signal sequence restrictions or other apparent methods for improving precision.
Reynolds et al. (1999; U.S. Pat. No. 5,991,581) developed an interactive computer program for measuring mental ability that automatically adjusts task complexity and selects letters or symbols with equal probability. No discussion of performance measurement precision or test-retest reliability is provided, and there is no determination of the precision with which response time measurements are made.
Buschke (1988; U.S. Pat. No. 4,770,636) describes a memory monitor that produces challenge signals 7 or 10 digits in length. He mentions no signal sequence restrictions that might improve measurement precision. His choice of 7 or 10 digit sequences quite likely results in frustration for individuals who cannot handle such long numbers and reduced precision for individuals who can handle 10 digits readily. His use of punctuation after three-digit segments within these longer sequences appears to be a step in the right direction since it will promote consistent "chunking" of signals within and between data sets.
Buschke's 1993 "cognitive speedometer" (1993; U.S. Pat. No. 5,230,629) involves relatively sophisticated control measurements but also does not determine measurement precision or employ signal-sequence restrictions. He does attempt to control the speed-accuracy ratio by keeping errors below an upper limit but does not ask examinees to proceed quickly enough to make at least a minimum number of errors. This allows considerable response speed variability since examinees may relax or proceed with greater vigor from time to time, without ever exceeding or even approaching their permitted level of errors.
Perelli (1984; U.S. Pat. No. 4,464,121) developed a portable device for measuring fatigue effects but did not determine test-retest reliability or measurement precision. He does however increase precision by blocking challenge signal repetition. No two signals in a row can be identical. His motivation for this restriction was not to improve measurement precision but to clearly indicate each new trial. Nevertheless his restriction is important since it removes trials where the signal is the same as that just presented. It therefore prevents examinees from responding more quickly to such signals than to others, which reduces variability among response times and increases measurement precision. He also does not encourage examinees to proceed quickly enough to make a minimum acceptable number of errors and therefore allows more response speed variability than optimal.
Keller's response speed and accuracy measurement device (1992; U.S. Pat. No. 5,079,726) also does not allow the same digit twice in a row within each 5-digit signal, and several other restrictions are also imposed. Five-digit signals cannot begin with the number 1. Adjacent sequential digits are forbidden. And no digit may be used twice within the same 5-digit signal. He does not however place any restrictions on the frequency of digits or transitions between digits over a series of signals. Thus he permits one digit, say the number 2, to appear a disproportionate amount of the time during a series of measurements. If an examinee is especially fast or slow when pressing 2, his or her average response times will be reduced or elevated in comparison to other measurement sessions, response time variability will be increased and measurement precision will be decreased. He makes no effort to limit error rates to maximum or minimum levels and does not determine the precision with which response times are measured.
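The within-signal restrictions just listed, and the absence of any restriction across a series, can be illustrated with a short sketch. This is not Keller's implementation. It assumes digits 0 through 9 and interprets "adjacent sequential digits" as neighboring digits that differ by exactly 1 in either direction; all function names are hypothetical.

```python
import random
from collections import Counter

def is_valid_signal(digits):
    """Check one 5-digit signal against the within-signal restrictions."""
    if len(digits) != 5 or digits[0] == 1:
        return False                      # wrong length, or begins with 1
    if len(set(digits)) != 5:
        return False                      # a digit is used twice
    # Assumed reading: neighbors may not differ by exactly 1 (e.g. 3 then 4).
    return all(abs(a - b) != 1 for a, b in zip(digits, digits[1:]))

def generate_signal(rng):
    """Draw random candidates until one satisfies the restrictions."""
    while True:
        candidate = rng.sample(range(10), 5)
        if is_valid_signal(candidate):
            return candidate

def digit_frequencies(signals):
    """Count how often each digit appears over a whole series of signals."""
    return Counter(d for signal in signals for d in signal)

rng = random.Random(0)
series = [generate_signal(rng) for _ in range(20)]
counts = digit_frequencies(series)
```

Every signal in `series` satisfies the within-signal rules, yet nothing constrains `counts`: one digit may appear far more often than another over the series, which is the source of variability described above.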
There exists a need to eliminate computer delay as a source of error.
Virtually all computers have hidden "background" processes that occur from time to time and compete for resources required for accurate time measurement. The problem is particularly severe in the most powerful, modern computers, which have large numbers of background processes. Every several minutes, one or another task is undertaken that delays response time measurement by approximately 5% or more, enough to increase measurement variability beyond the accuracy needed for precise assessment of medical benefits or performance effects from other potentially dangerous or life-saving activities, events or conditions. If several competing programs are active when measurement is made, as much as 100% of the computer's central processing unit ("CPU") time may be occupied, possibly for several seconds or longer.
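One simple way such delays might be detected is sketched below. This is an illustrative assumption, not the calibration method of the invention: the clock is read repeatedly, and any unusually long gap between consecutive reads is treated as evidence that another process held the CPU. The function name, probe count and tolerance are hypothetical.

```python
import time

def detect_interference(clock=time.perf_counter, probes=5, tolerance_s=0.005):
    """Return True if any gap between consecutive clock reads exceeds tolerance_s.

    `clock` is injectable so the check can be exercised with a simulated clock.
    """
    previous = clock()
    for _ in range(probes):
        current = clock()
        if current - previous > tolerance_s:
            return True      # the measurement loop was delayed
        previous = current
    return False

# Simulated clock readings (seconds): one read is delayed by about half a second.
delayed = iter([0.0, 0.001, 0.002, 0.5, 0.501, 0.502])
steady = iter([0.0, 0.001, 0.002, 0.003, 0.004, 0.005])
saw_delay = detect_interference(clock=lambda: next(delayed))
saw_none = detect_interference(clock=lambda: next(steady))
```

A suitable tolerance would depend on the computer and operating system; the value used here is a placeholder.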
FIG. 1 shows a screen shot of CPU usage in the absence of user-initiated activity obtained from a 200 MHz Windows NT 4 Gateway computer. Periodic, transient demands on CPU capacity are evident, including one relatively unusual spike up to 100% of CPU capacity that lasted several seconds before receding.
During the recent Danbury MS Blueberry Study (Pappas et al., 2001), when interference from background activities was measured before each keystroke during choice reaction time testing, occasional interference was recorded for all study participants, and most had potentially significant interference clusters from time to time (FIG. 2).
Performance results obtained during this past year during the Danbury MS Blueberry Study indicate that measurement error was limited to 1% or 2% (test-retest reliability was 0.991) and that practice effects were negligible when testing (and therefore practice) was limited to 2 minutes each week (FIGS. 3 and 4). Analysis of response times obtained after interference was detected indicates that apparent response times increased by roughly 7%, depending on the severity of the interference. This 7% error is large enough to be a serious concern, but not so large that it cannot be reduced to insignificance by frequent (twice per second) precision checks and rejection of questionable data.
The precision improvement methods described in this patent application and employed during the Danbury MS Blueberry Study controlled measurement variability to a greater extent than expected and allowed data sets for individual participants to be split into separate performance measures for each finger used during response time testing. FIG. 4 contains a typical single-finger data set for one of the study participants. The steady, parallel changes observed for each finger indicate that measurement precision was quite sufficient for this type of single-finger monitoring.
A thorough search of prior art has indicated that average measurement precision among 77 different published performance tests was surprisingly low. Test-retest reliability was only 0.63. Results obtained this past year using the methods described herein yielded a test-retest reliability of 0.991. Accordingly, there exists a need for a method for increased measurement precision.
The present invention provides a computer based system for testing the cognitive performance of at least one examinee comprising: at least one source network entity (SNE) having machine readable instructions, at least one test development system, local memory, and a plurality of executable files stored in said memory; a data distribution system (DDS) logically connected to said source network entity; and at least one destination network entity (DNE), having local memory, logically connected to said data distribution system.
The present invention provides a system for internet-based testing comprising a plurality of subsystems including: a test development system; a data distribution system; a workstation; a workstation calibration system; an examinee monitoring system; and an examinee motivation system.
According to an aspect of the present invention, a test development system is provided. The test development system comprises a digital computer provided with appropriate software such as an operating system and means for generating digital representations of challenge signals to be presented to an examinee. Signals may be numbers, letters, words, other symbols, sounds or combinations of these and/or other response triggers. The signals may be presented singly or in any combination of the plurality of possible signals. The test development system further comprises appropriate software, databases and digital storage means. The test development system provides a definition file defining specific information said test development system requires and a format in which said specific information is to be provided, at least one examinee information file, and at least one examinee response file.
According to an aspect of the invention, the test development system is logically connected, in computer fashion, to data transmission means. Such a connection may be, for example, a modem or cable modem connection to the internet. In such case the data transmission means comprise the internet.
According to an aspect of the present invention, a data distribution system is provided. The data distribution system
According to an aspect of the present invention a computer based method for testing the cognitive performance of at least one examinee is provided. The method comprises the steps of:
(a) providing a computer based testing system comprising: at least one source network entity (SNE) having machine readable instructions, at least one test development system, local memory, and a plurality of executable files stored in said memory; at least one data distribution system (DDS) logically connected to said source network entity; at least one destination network entity (DNE) logically connected to said data distribution system, wherein said DNE has local memory;
(b) generating a computer signal train comprising said at least one set of instructions, said at least one test development system and said plurality of executable files and transmitting said computer signal train to said data distribution system;
(c) embodying said computer signal train in a carrier wave using said data distribution system;
(d) distributing said carrier wave embodying said computer signal train to said destination network entity;
(e) displaying general and motivational instructions to said examinee;
(f) obtaining information relating to examinee health history and caching said information in DNE memory;
(g) calibrating said destination network entity, wherein said calibration is performed iteratively prior to each response;
(h) displaying at least one softshifted challenge signal;
(i) measuring at least a first cognitive performance of said examinee, wherein said measurement is bounded by pre-determined error limits;
(j) providing performance feedback to said examinee;
(k) providing motivational feedback to said examinee; and
(l) providing summary information to said examinee.
According to an aspect of the invention, a computer-based performance measurement system is provided that provides more precise results than previously available, for at least some measures of performance.
According to an aspect of the invention means are provided for obtaining more precise performance data than previously possible, so people, and/or their physicians, can determine how to improve their health, so that scientists can conduct more precise performance research, and so that other people interested in their performance can obtain more reliable, more convenient and more affordable performance measurements.
An aspect of this invention is the linked storage of information about 1) performance, 2) computer measurement accuracy, 3) health and 4) health-related activities and events, including foods, beverages and medications consumed, exercise, sleep, social events and any activity or event that may possibly affect health or performance. Storage may be in one or more data files but must be accomplished to enable information in each of these four categories to be linked together so that logical conclusions can be reached. A key aspect of the information stored in each category is the date and time of each measurement, aspect of health, activity or event.
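A minimal sketch of such time-stamped, linkable records follows. The `Record` structure, its field names, the linking window and the sample entries are all hypothetical; the sketch only shows how a shared date and time allows entries in the four categories to be related to one another.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class Record:
    category: str    # "performance", "calibration", "health" or "activity"
    stamp: datetime  # date and time of the measurement, activity or event
    detail: str

def linked_records(records, anchor, window):
    """Return the other records whose time stamps fall within `window` of `anchor`."""
    return [r for r in records
            if r is not anchor and abs(r.stamp - anchor.stamp) <= window]

# Hypothetical entries for one examinee.
t0 = datetime(2001, 5, 1, 9, 0)
records = [
    Record("performance", t0, "choice reaction time session"),
    Record("calibration", t0 + timedelta(seconds=1), "no interference detected"),
    Record("activity", t0 - timedelta(hours=1), "breakfast"),
    Record("health", t0 - timedelta(days=3), "mild fatigue reported"),
]
nearby = linked_records(records, records[0], timedelta(hours=2))
```

Here the performance session is linked to the calibration result and the meal recorded within two hours of it, while the health entry from three days earlier falls outside the chosen window.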
Time stamps allow performance results to be rejected or corrected if measurement precision calibration results obtained immediately before or afterward raise doubts about measurement accuracy at that time. Computer measurement error typically occurs when other background processes ("interference") prevent timely execution of the measurement software commands. Such interference usually occurs for relatively short periods of time, so performance data can be rejected if it was obtained at approximately the same time interference was detected. Rejecting just some data while keeping results obtained when calibration results are satisfactory allows more data to be used and therefore increases measurement precision, even for computers subject to relatively high levels of transient background interference.
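The selective rejection described above can be sketched as follows. The function name, the data layout and the half-second window are assumptions made for illustration only; responses whose time stamps fall close to a detected interference event are set aside and the remainder are kept.

```python
def reject_near_interference(responses, interference_times, window_s=0.5):
    """Split responses into (kept, rejected) by proximity to interference.

    responses: list of (timestamp_s, response_time_ms) pairs.
    interference_times: timestamps (s) at which calibration detected interference.
    """
    kept, rejected = [], []
    for stamp, response_ms in responses:
        if any(abs(stamp - t) <= window_s for t in interference_times):
            rejected.append((stamp, response_ms))
        else:
            kept.append((stamp, response_ms))
    return kept, rejected

# Hypothetical session: interference was detected at 11.2 s.
responses = [(10.0, 480.0), (11.0, 530.0), (12.0, 475.0)]
kept, rejected = reject_near_interference(responses, [11.2])
```

Only the response recorded at 11.0 s is rejected in this example; the other two responses remain available for analysis.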
Time stamps also allow performance, health and health activity information to be related.
According to an aspect of the invention, changes in examinee response time and short-term memory are measured. Changes in examinee response time and short-term memory may have important medical-diagnostic value, indicating for example local areas of hypoxia (low oxygen) or other transient or progressive health problems, and may provide a relatively precise measure of the effectiveness of different doses and combinations of medications and health supplements for the individual examinee.
A further aspect of the invention provides measures of cognitive performance having precision sufficient to measure changes in the performance of individual examinees, rather than just changes among groups of examinees.
According to an aspect of the invention, means are provided for relating ingestion of dietary components or supplements, medications or other drugs, or alcohol to changes in cognitive performance.
According to an aspect of the invention, means are provided for increasing the number of performance measurements obtained per unit time per examinee and means are provided for increasing the precision of those measurements. Means are therefore also provided to decrease proportionately the cost of long-term experiments and to enable research protocols that otherwise would be too expensive to be funded.
According to an aspect of the invention, a response time measurement system is provided that instructs users to remain above a minimum error rate and/or specifies a relatively narrow range of recommended error rates.
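By way of illustration, feedback of this kind might be generated as sketched below. The error-rate limits of 2% and 8%, the function name and the feedback strings are hypothetical placeholders, not values specified by the invention.

```python
def error_rate_feedback(n_errors, n_responses, min_rate=0.02, max_rate=0.08):
    """Advise the examinee so the error rate stays inside a recommended band."""
    if n_responses <= 0:
        raise ValueError("at least one response is required")
    rate = n_errors / n_responses
    if rate < min_rate:
        return "speed up"     # too few errors: examinee is responding too cautiously
    if rate > max_rate:
        return "slow down"    # too many errors: examinee is responding too hastily
    return "on target"

advice = error_rate_feedback(5, 100)
```

Holding the error rate within a narrow band in this way is intended to hold the speed/accuracy tradeoff roughly constant from one session to the next.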
According to an aspect of the present invention, methods for reducing measurement error are applied to virtually any computer-based performance measurement system, whether the challenge signals comprise numbers, letters, words, other symbols, sounds or combinations of these and/or other response triggers or whether single responses or a series of different responses are required.
According to an aspect of the present invention, methods are provided applicable to a variety of response time measurements (such as simple and choice response time, digit-symbol substitution tests and memory scanning tests) and also to memory measurements (such as number recall, word recall and word pair recall).
According to an aspect of the present invention, use of the Internet for repeated and more precise performance measurement may provide scientists with both an opportunity and a previously missing spark for development of global standards for performance tests that will speed many different areas of health research.
Still other objects and advantages of the present invention will become readily apparent to those skilled in the art from the following detailed description, wherein preferred embodiments of the invention are shown and described, simply by way of illustration of the best mode contemplated of carrying out the invention. As will be realized, the invention is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, without departing from the invention. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.