Interactive language proficiency testing systems using speech recognition are known. For example, U.S. Pat. No. 5,870,709, issued to Ordinate Corporation, describes such a system. In U.S. Pat. No. 5,870,709, the contents of which are incorporated herein by reference, an interactive computer-based system is shown in which spoken responses are elicited from a subject by prompting the subject. The prompts may be, for example, a request for information, a request to read or repeat a word, phrase, sentence, or larger linguistic unit, a request to complete, fill in, or identify missing elements in graphic or verbal aggregates, or any similar presentation that conventionally serves as a prompt to speak. The system then extracts linguistic content, speaker state, speaker identity, vocal reaction time, rate of speech, fluency, pronunciation skill, native language, and other linguistic, indexical, or paralinguistic information from the incoming speech signal.
The subject's spoken responses may be received at the interactive computer-based system via telephone or other telecommunication or data information network, or directly through a transducer peripheral to the computer system. It is then desirable to evaluate the subject's spoken responses and draw inferences about the subject's abilities or states.
A prior art approach to automatic pronunciation evaluation is discussed in Bernstein et al., “Automatic Evaluation and Training in English Pronunciation,” Int'l. Conf. on Spoken Language Processing, Kobe, Japan (1990), the contents of which are incorporated herein by reference. This approach includes evaluating each utterance from subjects who are reading a preselected set of scripts for which training data has been collected from native speakers. In this system, a pronunciation grade may be assigned to a subject's performance by comparing the subject's responses to a model of the responses from the native speakers.
One disadvantage of such an evaluation system is that it may not properly weigh the importance of different items with regard to their relevance to the assessment. A further disadvantage of this evaluation technique is that it typically does not account for the accuracy, or more importantly the inaccuracy, of the speech recognition system. Known speech recognition systems may interpret a response incorrectly. For example, speech recognition systems typically are implemented with a predetermined vocabulary. Such a system is likely to react inaccurately to a response that falls outside of the vocabulary. Speech recognition systems also may make errors in recognizing responses to items that are in the vocabulary, particularly short words. As used herein, “recognizing” a response means recognizing the linguistic content and/or other characteristics of the response. The accuracy of the speech recognition system may be thought of as a measure of the character and quantity of errors made by the speech recognition system. It would therefore be desirable to have an improved automated language assessment method and apparatus.