This invention relates to a method and system for computer-based testing and computer-based analysis and feedback, and in particular to methods and systems which analyze, grade and diagnostically comment upon students' free responses to computer science problems.
The background for this invention is in two fields: standardized testing, and analysis and feedback systems.
In the standardized testing field, large-scale standardized testing has long depended upon the administration by paper-and-pencil of multiple-choice test items, which can be scored efficiently and objectively by modern scanning technology. Such tests have several disadvantages, including that they cannot be dynamically adjusted to the skill levels of the examinees, that they must be scored in a central location where the scoring apparatus resides, and that they depend on problems that may not adequately represent the tasks which examinees may encounter in academic and work settings.
Advances in computer technology have recently been applied to the testing field. For example, adaptive administration techniques provide a test which is dynamically tailored to the skills of the examinee and instantaneously scored in order to allow the examinee to know how he or she performed as soon as the test is concluded. Additionally, systems have been developed for automatically grading examinees' responses to "free-response" items. A free-response examination is one in which the examinee must provide an uncued response to an open-ended question, for which there are no answers from which to select (as there are in the multiple-choice format).
None of the prior art systems have been successfully applied to testing in computer science. Thus, in order to assess an individual's ability to write computer programs, school systems, colleges and businesses continue to rely on paper-and-pencil tests which are graded individually by humans. These paper-and-pencil tests are necessarily of limited value because they require the examinee to demonstrate programming skills without a computer, and they are expensive to score because each examinee's program must be graded by a human computer science expert.
Accordingly, computer-based systems are needed that will require the examinee to develop a computer program in response to a variety of programming problems. The system should be able to present various types of free-response test items, in response to which the examinee must provide entire programs or portions thereof. The system must be able to automatically evaluate and score the student's responses.
In the field of analysis and feedback, it is desired to be able to provide automated assistance to computer science students in the form of an intelligent computer-based analysis and feedback system. Such systems should be capable of reading programs which have been entirely or partially input by the student, identifying various errors therein and providing critiques back to the student.
The errors which a desired analysis and feedback system should be able to detect include syntax, semantic, and problem-specific errors. An example of a syntax error is a missing semicolon at the end of a program line; syntax errors such as these are detected by standard compilers available in the prior art. An example of a semantic error is declaring a variable to be an integer and then using it as a character in the program.
Problem-specific errors, which are the most difficult to diagnose, involve errors in programming logic. For example, a student may shift data through an array in the wrong direction. This kind of problem-specific error is not detected by computer-based analysis and feedback systems currently available.
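The wrong-direction shift mentioned above can be illustrated with a minimal sketch. The function names and the sample data are hypothetical and are not taken from any system described here; Python is used only for illustration. Both functions are syntactically and semantically valid, so a compiler or interpreter would accept either one, yet only the first produces the intended result.

```python
def shift_right_correct(values):
    """Shift every element one slot to the right, dropping the last one."""
    result = list(values)
    # Walk from the highest index down so each source slot is read
    # before it is overwritten.
    for i in range(len(result) - 1, 0, -1):
        result[i] = result[i - 1]
    return result


def shift_right_buggy(values):
    """Same intent, but the loop runs in the wrong direction."""
    result = list(values)
    # Walking upward overwrites each slot before it is read, so the
    # first value is copied across the whole array.
    for i in range(1, len(result)):
        result[i] = result[i - 1]
    return result


data = [1, 2, 3, 4]
print(shift_right_correct(data))  # [1, 1, 2, 3]
print(shift_right_buggy(data))    # [1, 1, 1, 1]
```

Detecting the second version as faulty requires knowledge of what the exercise asked for, which is why such errors are called problem-specific.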
A computer-based analysis and feedback system which can diagnose problem-specific errors in programming logic as well as semantic and syntax errors would ideally be implemented as an expert system. An expert system, which is an application of the science of artificial intelligence, is used to diagnose problems in a specialized area, which here is computer science. In particular, an expert system comprises two main components: an inference engine and a knowledge base. The inference engine contains general logical rules related to the computer science language under analysis. The knowledge base is a repository of data representing expertise in the computer science language, and is used by the inference engine to solve the problems presented. Knowledge bases are constructed by knowledge engineers, who utilize information obtained from an expert in computer science in order to translate the expertise into a language a computer can comprehend.
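The separation between an inference engine and a knowledge base can be sketched as follows. This is a toy illustration under stated assumptions, not the design of any particular expert system: the fact names, the `KNOWLEDGE_BASE` entries, and the `diagnose` function are all hypothetical, and the step that would extract facts from a student program is assumed to have already run.

```python
# Hypothetical knowledge base: data only. Each entry pairs the facts that
# must hold with the diagnosis to report when they do.
KNOWLEDGE_BASE = [
    {"requires": {"missing_semicolon"},
     "kind": "syntax",
     "advice": "A statement is missing its terminating semicolon."},
    {"requires": {"declared_integer", "used_as_character"},
     "kind": "semantic",
     "advice": "A variable declared as an integer is used as a character."},
    {"requires": {"array_shift", "loop_ascending"},
     "kind": "problem-specific",
     "advice": "The array is shifted in the wrong direction."},
]


def diagnose(facts, knowledge_base=KNOWLEDGE_BASE):
    """Toy inference engine: report every rule whose required facts all hold."""
    observed = set(facts)
    return [(rule["kind"], rule["advice"])
            for rule in knowledge_base
            if rule["requires"] <= observed]  # subset test


# Facts assumed to have been extracted from one student program.
findings = diagnose({"declared_integer", "used_as_character",
                     "array_shift", "loop_ascending"})
for kind, advice in findings:
    print(f"{kind}: {advice}")
```

Because the expertise lives in the data rather than in the matching logic, new exercises or new error types can in principle be covered by adding entries to the knowledge base without changing the engine.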
The critiques to be provided by the desired expert system serve two purposes: (1) to assist students while they are debugging their programs, and (2) to be used by teachers as evaluations of the students' work. Thus, the desired expert system should be both a debugging aid and a grader of student programs.
Further, an automated analysis and feedback system should meet certain criteria. First, teachers need to be provided with a predetermined selection of programming exercises from which they can choose for inclusion in their curricula.
Second, the system needs to contain a large amount of information about the programming exercises that teachers might assign and the many and various problems that students may have in solving the exercises.
Third, a user-friendly interface, which would simulate the personal interaction of an instructor or teaching assistant, is desired. The interface should include techniques for highlighting pertinent text in a student program on the screen and for managing windows which display the text of the feedback to the student.
Fourth and finally, an intelligent analysis and feedback system must be founded upon a machine executable language that allows articulation of knowledge of programming that will be used by the program to diagnose and explain errors. Advice for students is articulated in this knowledge base language by knowledge engineers with expertise in instructing novice programmers.
The LISP-TUTOR software, developed by Anderson et al., and the PROUST software, developed by Johnson, are examples of intelligent systems found in the prior art which attempt to address some, but not all, of the four criteria mentioned above.
The PROUST software provides feedback only on residual errors in the student's program and does not interactively guide the student in actual coding. Moreover, the PROUST program's functionality has been convincingly demonstrated on only one set of student programs written to fulfill one programming problem. (Although Johnson was able to use the PROUST program to grade student solutions for a second programming problem, it was done only off-line, and only after many special purpose mechanisms were added to the PROUST software.) The PROUST software thus provided only minimal coverage: teachers were provided with only one programming problem that could be used in their curricula, and that same programming problem was the only one that PROUST could analyze and give advice on. Further, there existed no functional interface; messages from the PROUST program were simply stored in a text file.
The LISP-TUTOR software performs an analysis of each token, or word, immediately after it is input by the student. The LISP-TUTOR has specific predetermined word patterns stored in its memory, and compares each word input by the student against them in order to determine whether the sequence follows one of the predetermined stored patterns. That is, in the LISP-TUTOR system, the knowledge base expects to encounter a program tree structure which is top-down, depth first, and left to right. The LISP-TUTOR is capable of detecting only the first error in this order. As such, the LISP-TUTOR does not analyze, as a whole, the entire solution presented by the student. Because the LISP-TUTOR system never allows a student's program to contain more than one bug, students using it are never faced with the normal "multiple-bug" demands of debugging, one of which is the perpetual uncertainty of not knowing how many different pieces of a program contain errors. In addition, students can attempt to solve only those problems which are presented by the LISP-TUTOR system itself.
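The difference between stopping at the first deviating token and analyzing a complete solution can be sketched as follows. This is a deliberately simplified illustration and not the actual LISP-TUTOR algorithm: the function names, the single stored pattern, and the sample token lists are all hypothetical, and a flat token comparison stands in for the tree-structured matching described above.

```python
def first_mismatch(tokens, expected):
    """Return (index, got, wanted) for the first deviating token, or None."""
    for index, (got, wanted) in enumerate(zip(tokens, expected)):
        if got != wanted:
            return index, got, wanted  # stop at the first error
    return None


def all_mismatches(tokens, expected):
    """Return every deviating token found in the complete input."""
    return [(index, got, wanted)
            for index, (got, wanted) in enumerate(zip(tokens, expected))
            if got != wanted]


# Hypothetical stored pattern and a student attempt containing two errors.
expected = ["(", "defun", "square", "(", "x", ")", "(", "*", "x", "x", ")", ")"]
student = ["(", "defun", "square", "(", "y", ")", "(", "+", "x", "x", ")", ")"]

print(first_mismatch(student, expected))  # (4, 'y', 'x')
print(all_mismatches(student, expected))  # [(4, 'y', 'x'), (7, '+', '*')]
```

The first function never reveals the second error, which mirrors the limitation that a student is never confronted with more than one bug at a time.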
Accordingly, an object of this invention is to provide an analysis and feedback method and system for students in computer science which will allow them to interactively develop and debug their programs through immediate, on-line diagnostic feedback.
A further object of this invention is to provide a method and system for students in computer science to be given an examination in developing and debugging programs, by means of a computer, where the computer automatically evaluates the students' responses to test questions and assigns a numerical grade thereto.
A further object of this invention is to provide such a computer science analysis and feedback and testing system which will analyze students' programs after they have been completely entered and provide an accurate analysis for a wide range of potential correct and incorrect solutions.
A still further object of this invention is to provide a computer science analysis and feedback and testing system which will detect and provide an analysis of multiple errors in a program.