Medical certifying organizations have traditionally relied upon paper-and-pencil cognitive examinations as a method for assessing a candidate's medical knowledge. Traditional formats such as multiple choice questions have well-defined operating characteristics and reliability for examining cognitive knowledge capabilities. See, for example, Stocking ML, An alternative method for scoring adaptive tests, Research Report RR-94-98, 1994, incorporated herein by reference.
However, these tools generally measure only cognitive knowledge and provide only a primitive ability to assess a candidate's problem-solving abilities. See, for example, Stillman P L, Swanson D B, Ensuring the clinical competence of medical school graduates through standardized patients, Arch Int Med 1978, Vol. 147, pages 1049-52, incorporated herein by reference.
Several organizations have previously experimented with computer delivery of clinical content and evaluation. In the late 1960s and 1970s, the Ohio State University developed a self-directed Independent Study Program which utilized a “Tutorial Evaluation System” for conveying curriculum content. See, for example, Weinberg A D, CAI at the Ohio State University College of Medicine, Comput Biol Med 1973, Vol. 3, pages 299-305; Merola A J, Pengov R E, Stokes B T, Computer-supported independent study in the basic medical sciences, in: DeLand E C (ed), Information Technology in Health Science Education, Plenum Press, New York, 1973, incorporated herein by reference.
Concurrently, Dr. Octo Barnett's laboratory at the Massachusetts General Hospital began development of clinical simulations. See, for example, Barnett G O, The use of a computer-based system to teach clinical problem-solving, Computers in Biomedical Research, Academic Press, New York 1974, Vol. 4, pages 301-19; Barnett G O, Hoffer E P, Famiglieti K T, Computers in medical education: present and future, Proceedings of the Seventh Annual Symposium on Computer Applications in Medical Care, IEEE Press, Washington, DC 1983, pages 11-13, incorporated herein by reference. The clinical simulations used the MUMPS language.
At approximately the same time, investigators at the University of Illinois developed a simulation model known as Computer-Associated Simulation of the Clinical Encounter, or “CASE”. See, for example, Harless W G, Farr N A, Zier M A, et al., MERIT—an application of CASE, in: DeLand E C (ed), Information Technology in Health Science Education, Plenum Press, New York 1978, pages 565-69, incorporated herein by reference. This system was at one time considered by the American Board of Internal Medicine (ABIM) as at least one component of a recertification process. See, for example, Friedman R B, A computer program for simulating the patient-physician encounter, J Med Educ 1973, Vol. 48, pages 92-7, incorporated herein by reference. Research supported by the ABIM demonstrated that a computerized examination system appeared feasible in professional evaluation/certification settings. See, for example, Reshetar R A, et al., An Adaptive Testing Simulation for a Certifying Examination, presented at the Annual Meeting of the American Educational Research Association, San Francisco, Calif., April 1992, incorporated herein by reference.
Stevens and colleagues have also demonstrated the feasibility of using computer-based systems for testing problem-solving ability in undergraduate medical school curriculum applications. See, for example, Stevens R H, et al., Evaluating Preclinical Medical Students by Using Computer-Based Problem-Solving Examinations, Academic Medicine 1989, Vol. 64, pages 685-87, incorporated herein by reference. Sittig and colleagues have also examined the utility of computer-based instruction in teaching naive users basic computer techniques such as “drag and drop” and other computer operations. See, for example, Sittig D F, Jiang Z, Manfre S, et al., Evaluating a computer-based experiential learning simulation: a case study using criterion-referenced testing, Comput Nurs 1995, Vol. 13, pages 17-24, incorporated herein by reference.
We have determined that the above-described medical assessment processes suffer from two weaknesses: 1) test development requires re-generation of an examination with new material on a recurring (usually annual) basis; and 2) although multiple choice questions demonstrate reliable performance in measuring cognitive knowledge, the use of this format for assessing clinical problem solving has not been supported by the literature.
Another system was developed at the University of Wisconsin. This project served as the nidus for the Computer-Based Examination (CBX) developed by the National Board of Medical Examiners (NBME). See, for example, Friedman R B, A computer program for simulating the patient-physician encounter, J Med Educ 1973, Vol. 48, pages 92-7; Clyman S G, Orr N A, Status Report on the NBME's Computer-Based Testing, Academic Medicine 1990, Vol. 65, pages 235-41, incorporated herein by reference. NBME's CBX development project has been in evolution for over a decade, and has demonstrated validity in examining professional degree candidates. See, for example, Solomon D J, Osuch J R, Anderson K, et al., A pilot study of the relationship between experts' ratings and scores generated by the NBME's computer-based examination system, Academic Medicine 1992, Vol. 67, pages 130-32, incorporated herein by reference.
However, we have determined that the CBX model suffers from the problem that the clinical simulations are “hard-wired” in computer source code that must be re-coded for each new examination. Once a simulation has been used widely, its contents are no longer secure, necessitating continuous cycles of new simulation development.
The expert system literature describes the evolution in evaluation and training systems. Early artificial intelligence/expert system work concentrated on “rules of thumb” or heuristics to represent problem-solving strategies identified by domain experts. See, for example, David J M, Krivine J P, Simmons R., Second generation expert systems: a step forward in knowledge engineering, in: David J M, Krivine J P, Simmons R. Second Generation Expert Systems, Springer Verlag, New York, N.Y. 1993, pages 3-23, incorporated herein by reference. We have determined that these rule-based systems were necessarily constrained to narrow domains, and that the knowledge contained in the rules was difficult to validate. Id.
In addition, early expert systems suffered from rapidly declining performance when exposed to circumstances outside narrowly defined domains. See, for example, Davis R., Expert systems: where are we and where do we go from here, AI Magazine, 1983, Vol. 3, pages 3-22; Simmons R., Generate, Test and Debug: A paradigm for combining associational and causal reasoning, in: David J M, Krivine J P, Simmons R., Second Generation Expert Systems, Springer Verlag, New York, N.Y. 1993, pages 79-92, incorporated herein by reference. We have determined that this phenomenon occurred at least in part due to interactions among the many rules needed to define a domain. Recent work indicates that the robustness of such systems is enhanced by providing knowledge of different types. See, for example, Simmons R., Davis R., The roles of knowledge and representation in problem solving, in: David J M, Krivine J P, Simmons R., Second Generation Expert Systems, Springer Verlag, New York, N.Y. 1993, pages 27-45, incorporated herein by reference.
We have further determined that experts generally rely not only upon one dimension of knowledge when defining a rule, but also upon expansive knowledge of how systems work (i.e., physiology and pathophysiology in the medical domain) in performing real-world problem-solving. See, for example, Davis R., Expert systems: where are we and where do we go from here, AI Magazine, 1983, Vol. 3, pages 3-22, incorporated herein by reference. This realization has led to a rethinking of the structure of knowledge-based systems to reflect the tasks such a system should accomplish, the methods the system should use to accomplish the tasks, and the knowledge required to support these methods. See, for example, David J M, Krivine J P, Simmons R., Second generation expert systems: a step forward in knowledge engineering, in: David J M, Krivine J P, Simmons R., Second Generation Expert Systems, Springer Verlag, New York, N.Y. 1993, pages 3-23, incorporated herein by reference.
We have also determined that knowledge acquisition for such systems entails development of a model for the domain and instantiation of the model (i.e., encoding and entering the needed information into the system's data structure) with information acquired from knowledge donors. See, for example, David J M, Krivine J P, Simmons R., Second generation expert systems: a step forward in knowledge engineering, in: David J M, Krivine J P, Simmons R., Second Generation Expert Systems, Springer Verlag, New York, N.Y. 1993, pages 3-23; Breuker J, Wielinga B., Models of expertise in knowledge acquisition, in: Guida and Tasso (eds), Topics in Expert System Design: Methodologies and Tools, North Holland Publishing, 1989, incorporated herein by reference.
To obviate the above-described weaknesses, we have determined that it is desirable to provide a computer-based testing project which will: 1) instantiate medical knowledge as object-oriented data structures, known as a knowledge base of family medicine; 2) utilize the medical knowledge structures to create realistic clinical scenarios (simulated patients); and 3) assess the candidate's clinical problem-solving ability in terms of effective intervention in the clinical progress of the simulated patient, through the selection of various actions made available by the testing system.
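By way of illustration only, the three objectives above may be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the disclosed implementation: the class names (Intervention, Condition, SimulatedPatient), the single integer severity value, the fixed per-step progression, and the scoring rule are all hypothetical simplifications introduced here, and the entries "condition_a", "effective_treatment" and "irrelevant_test" are placeholders rather than medical content.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Intervention:
    """Hypothetical action a candidate may select during the examination."""
    name: str
    severity_change: int  # negative values improve the simulated patient


@dataclass
class Condition:
    """Hypothetical knowledge-base object describing one health state."""
    name: str
    initial_severity: int
    progression_per_step: int  # worsening per time step if left untreated
    interventions: Dict[str, Intervention] = field(default_factory=dict)


@dataclass
class SimulatedPatient:
    """Clinical scenario generated from a knowledge-base entry."""
    condition: Condition
    severity: int

    @classmethod
    def from_condition(cls, condition: Condition) -> "SimulatedPatient":
        return cls(condition=condition, severity=condition.initial_severity)

    def step(self, action_name: Optional[str] = None) -> int:
        """Advance one time step: natural progression plus any action effect."""
        self.severity += self.condition.progression_per_step
        action = self.condition.interventions.get(action_name)
        if action is not None:
            self.severity += action.severity_change
        self.severity = max(self.severity, 0)  # severity cannot drop below zero
        return self.severity


def score_candidate(condition: Condition,
                    actions: Iterable[Optional[str]]) -> int:
    """Assumed scoring rule: net reduction in severity over the scenario."""
    patient = SimulatedPatient.from_condition(condition)
    for action_name in actions:
        patient.step(action_name)
    return condition.initial_severity - patient.severity


# Objective 1: instantiate knowledge as objects (placeholder content only).
condition_a = Condition(
    name="condition_a",
    initial_severity=5,
    progression_per_step=1,
    interventions={
        "effective_treatment": Intervention("effective_treatment", -3),
        "irrelevant_test": Intervention("irrelevant_test", 0),
    },
)

# Objectives 2 and 3: generate a scenario and score two candidates' actions.
good_score = score_candidate(
    condition_a, ["effective_treatment", "effective_treatment"])
poor_score = score_candidate(condition_a, ["irrelevant_test", None])
```

Because the scenario in this sketch is generated from data held in the knowledge-base objects rather than written into the program logic, a new examination would, under these assumptions, require only new or modified Condition entries and no re-coding of the simulation itself.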