In recent years, there has been significant interest in developing new computer-based techniques for language learning. One area of significant growth has been the use of multimedia (audio, image, and video) for language learning. These approaches have focused mainly on language comprehension. In these approaches, proficiency in pronunciation is achieved through practice and self-evaluation.
Typical pronunciation scoring algorithms are based upon a phonetic segmentation of the user's speech, which identifies the start and end times of each phoneme as determined by an automatic speech recognition system.
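As an illustration only, a phonetic segmentation of this kind can be represented as a list of labeled time intervals. The sketch below is a minimal example under stated assumptions: the names `PhonemeSegment` and `phoneme_durations`, the phoneme labels, and the timing values are all hypothetical, and the per-phoneme duration shown here is just one feature that a scoring algorithm might derive from such a segmentation, not the method of any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhonemeSegment:
    """One phoneme with its start and end time in seconds (hypothetical type)."""
    phoneme: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def phoneme_durations(segments):
    """Return (phoneme, duration) pairs, with durations rounded to milliseconds."""
    return [(s.phoneme, round(s.duration, 3)) for s in segments]


# Made-up segmentation of the word "cat", standing in for the output
# of an automatic speech recognition system.
segments = [
    PhonemeSegment("k", 0.00, 0.08),
    PhonemeSegment("ae", 0.08, 0.21),
    PhonemeSegment("t", 0.21, 0.30),
]

for phoneme, duration in phoneme_durations(segments):
    print(f"{phoneme}: {duration:.3f} s")
```

A scoring algorithm would then compute its measures over these intervals, for example by evaluating the audio between each segment's start and end time.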
Unfortunately, present computer-based techniques do not score with sufficient accuracy several parameters that are useful or necessary in determining student progress. Additionally, techniques that might provide more accurate results tend to be computationally expensive in terms of processing power and cost. Other existing scoring techniques require the construction of large databases of non-native speech so that non-native students are scored in a manner that compensates for accents.