The present invention relates to language identification systems and methods for training and operating said systems.
With the growth of globalization, international business, and security considerations, multilingual speech applications are in strong demand, in particular automatic language identification (LID). Possible applications of automatic language identification include automatic call routing, audio mining, and voice automated attendant systems.
Acoustic-phonotactic based LIDs represent one type of language identification system employed in the art, an illustration of which is shown in FIG. 1. The system typically includes four stages operable to process a speech segment and to classify it into one of a number of possible candidate languages. Initially the system is trained, whereby the system is programmed to recognize particular features of each of the candidate languages. Subsequent to training, language identification operations are performed, whereby a speech sample of unknown language is processed and compared to the previously-programmed features to determine the presence or absence of said features. The candidate language possessing the greatest number of correlations with the sample is deemed the language of the sample.
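By way of illustration only, the train-then-identify flow described above may be sketched as follows. The sketch assumes, purely for the sake of example, that each speech sample has already been reduced to a sequence of phone labels and that the "features" of a language are its most frequent adjacent phone pairs; the function names, the `top_k` parameter, and the toy data are hypothetical and do not describe the four stages of the system of FIG. 1.

```python
from collections import Counter


def phone_pairs(phones):
    """Return the adjacent phone pairs (bigrams) of a phone sequence."""
    return list(zip(phones, phones[1:]))


def train(samples_by_language, top_k=50):
    """Training: store, per candidate language, its top_k most frequent phone pairs.

    Assumption: the stored set stands in for the "previously-programmed features".
    """
    models = {}
    for language, samples in samples_by_language.items():
        counts = Counter()
        for phones in samples:
            counts.update(phone_pairs(phones))
        models[language] = {pair for pair, _ in counts.most_common(top_k)}
    return models


def identify(models, phones):
    """Identification: count stored features present in the sample, pick the maximum.

    Ties are resolved in favor of the language that was trained first.
    """
    observed = set(phone_pairs(phones))
    scores = {language: len(observed & stored) for language, stored in models.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy training data: two hypothetical candidate languages.
training_data = {
    "lang_a": [["k", "a", "t", "a"], ["t", "a", "k", "a"]],
    "lang_b": [["s", "i", "l", "i"], ["l", "i", "s", "i"]],
}
models = train(training_data)
language, scores = identify(models, ["k", "a", "t"])
print(language, scores)
```

In this sketch, adding a candidate language requires only further labeled phone sequences for that language, which is a simplification; in the conventional system the corresponding step entails the language-specific modeling and transcription effort discussed below.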
The conventional system suffers from several disadvantages, one being that a language-specific development effort is needed to add each new candidate language. This requirement gives rise to high costs for the acoustic modeling, language modeling, and speech data transcription that each added language requires. Accordingly, the conventional system does not scale well as new languages are added.
What is therefore needed is an improved spoken language identification system which provides better scalability with the addition of new candidate languages.