Automated systems for evaluating highly predictable speech have emerged in the past decade due to the growing maturity of speech recognition and processing technologies. However, efforts toward automated scoring of spontaneous speech have been sparse, given the difficulty of both recognizing and assessing such speech.
A construct is a set of knowledge, skills, and abilities measured by a test. The construct of a speaking test may be embodied in the rubrics that human raters use to score the test. For example, the construct of communicative competence may consist of three categories: delivery, language use, and topic development. Delivery refers to the pace and the clarity of speech, including performance on intonation, rhythm, rate of speech, and degree of hesitancy. Language use refers to the range, complexity, and precision of vocabulary and grammar use. Topic development refers to the coherence and fullness of the response.
The delivery aspect may be measured on four dimensions: fluency, intonation, rhythm, and pronunciation. Pronunciation may be defined as the act or manner of articulating syllables, words, and phrases, including their associated vowels, consonants, and word-level stresses. Pronunciation is a factor that impacts the intelligibility and perceived comprehensibility of speech. Because pronunciation plays an important role in speech perception, features for assessing pronunciation are worth exploring, especially in the area of measuring spontaneous speech, which remains largely neglected.