Speech recognition generally refers to a technique for converting, by using a computer, speech in one language into text in the same or another language, or speech in one language into speech in another language. In other words, speech recognition automatically translates one language into another without human involvement, using the digital processing capability of computers. Speech recognition techniques enable speech-to-text (STT) and text-to-speech (TTS) translation for multiple languages, where speech can be transformed into text in any language and text can be translated into speech in any language.
Because the accents of different groups of speakers differ with region, social status, norms, and practices, the pronunciation of a language may be influenced accordingly. Pronunciation may also be influenced by second-language speakers. For example, a person whose first language is Kannada (a South Indian language) may speak Hindi (an official language of India) with a Kannada accent, or a person whose first language is Hindi may speak English with a Hindi accent.
English has grown in importance as a language for international communication throughout the world. In particular, the blending of English with local languages and dialects in different countries has given rise to wide diversity in the pronunciation and accent of English. In the Asia-Pacific region, much of this influence can be seen in regions such as Greater China, India, Malaysia, and the Philippines, which exhibit rich variation in English pronunciation, lexicon, and grammar.
Relative to standard languages such as Hindi and English and their standard pronunciation, the non-standardized accents of these languages generally include phonetic variations due to regional and mother-tongue influence. Since phonetic variations from a standard language usually result in low recognition rates for speech recognition systems, a comprehensive understanding of the variations present in the dialects of English spoken across the world today is a concern for the development of spoken language science and speech recognition technology.
It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in a computer-readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.