Recognizing speech in multiple accents of the same language is a challenge for the embedded devices community. Across different, largely separated geographies, this problem is usually addressed by using a separate acoustic model for each accent. For example, North American, British, Australian, and Indian English each have their own acoustic model for recognition.
Even within the region covered by a single acoustic model, regional accents may pose additional challenges. For example, although English is usually the second most spoken language in India after the respective regional mother tongue, there are a number of regional English accents across different parts of India. These regional accents pose a challenge to speech recognition that is based on a single acoustic model. One option is a multi-accent recognition system that runs multiple accent-specific recognizers in parallel. However, running multiple accent-specific recognizers with different acoustic models in parallel to improve recognition accuracy can be processor intensive. This intensive resource usage may be particularly challenging for embedded devices with limited processing power. In addition, developing and using accent-specific acoustic models may not be cost-effective.
One technique for overcoming the multi-accent issue is to analyze the phonetic pairs that are most often confused and form phonetic transfer pairs from them. These phonetic transfer pairs are then plugged into the original canonical lexicon, and finally a new dictionary adapted to the accent is constructed. In essence, the approach replaces the unused native-accent phonetic symbols with the most probable phonetic symbol combinations for the accented pronunciation. However, this analysis might not be possible with limited or no access to either the acoustic models or the symbols that the recognition engine recognizes internally.
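The lexicon adaptation described above can be illustrated with a minimal Python sketch. It is not the implementation of any particular recognition engine. The phoneme labels, the confusion counts, the `min_count` threshold, and the function names are all hypothetical and chosen only for illustration. The sketch also assumes that the canonical pronunciation is kept alongside the accented variants, which is one possible design rather than something the technique requires.

```python
from itertools import product

# Hypothetical counts of how often a canonical phoneme (first element)
# was recognized as another phoneme (second element) in accented speech.
CONFUSION_COUNTS = {
    ("V", "W"): 42,
    ("TH", "T"): 30,
    ("R", "L"): 2,
}

# Hypothetical canonical lexicon: word -> canonical phoneme sequence.
CANONICAL_LEXICON = {
    "very": ["V", "EH", "R", "IY"],
    "think": ["TH", "IH", "NG", "K"],
}


def build_transfer_pairs(confusion_counts, min_count):
    """Keep only the confusions seen at least min_count times."""
    pairs = {}
    for (canonical, heard), count in confusion_counts.items():
        if canonical != heard and count >= min_count:
            pairs.setdefault(canonical, []).append(heard)
    return pairs


def accent_variants(phonemes, transfer_pairs):
    """Expand one pronunciation into all variants allowed by the pairs.

    The canonical pronunciation is always the first variant returned.
    """
    options = [[p] + transfer_pairs.get(p, []) for p in phonemes]
    return [list(combo) for combo in product(*options)]


def adapt_lexicon(lexicon, transfer_pairs):
    """Build an accent-adapted dictionary from the canonical lexicon."""
    return {
        word: accent_variants(phonemes, transfer_pairs)
        for word, phonemes in lexicon.items()
    }


transfer_pairs = build_transfer_pairs(CONFUSION_COUNTS, min_count=10)
adapted = adapt_lexicon(CANONICAL_LEXICON, transfer_pairs)
```

Note that the first step depends on confusion statistics expressed in the recognizer's own phonetic symbols, which is exactly the information that may be unavailable when access to the acoustic models or the engine's internal symbols is limited.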