1. Field
Apparatuses and methods consistent with exemplary embodiments relate to an electronic device and a voice recognition method thereof, and more particularly, to an electronic device capable of performing voice recognition using a user pronunciation lexicon and a voice recognition method thereof.
2. Description of the Related Art
A voice recognition service compares a user's voice with text (e.g., a word or a combination of words) that was registered in a system at a development stage, and provides the most probable text as the result. The text registered in the system is generally referred to as a word lexicon, and the supported range of the voice recognition service is determined by how many words the word lexicon includes. Further, the performance of a voice recognition service depends on the accuracy of the pronunciation lexicon in the word lexicon and on the quality of the corresponding acoustic model.
Generally, the pronunciation lexicon is developed to include as many pronunciation variations as possible. In particular, for names of content and foreign words, pronunciation varies from user to user, and therefore five to ten or more pronunciation strings are provided per word. The purpose of generating multiple pronunciation strings is to satisfy an average recognition rate across the unspecified individuals who use the voice recognition service.
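As a minimal sketch of the arrangement described above, a pronunciation lexicon can be pictured as a mapping from each word to several pronunciation strings, with recognition returning the word whose pronunciation string best matches the input. The words, the phoneme symbols, and the names LEXICON, edit_distance, and recognize are hypothetical and chosen only for illustration. Edit distance over phoneme symbols is an assumption standing in for the acoustic model score, which the description above does not specify.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: each word maps to several pronunciation strings
# (space-separated phoneme symbols invented for this illustration).
LEXICON: Dict[str, List[str]] = {
    "netflix": ["n e t f l i k s", "n e t p l i k s", "n e p l i k s"],
    "youtube": ["y u t u b", "y u t y u b", "y u c u b"],
}


def edit_distance(a: List[str], b: List[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def recognize(phonemes: str, lexicon: Dict[str, List[str]]) -> Tuple[str, int]:
    """Return the word with the closest pronunciation string and its distance.

    Assumption: a lower edit distance stands in for a higher recognition score.
    """
    observed = phonemes.split()
    candidates = (
        (word, edit_distance(observed, pron.split()))
        for word, prons in lexicon.items()
        for pron in prons
    )
    return min(candidates, key=lambda pair: pair[1])


# An input that matches one of the registered variants exactly.
print(recognize("n e t p l i k s", LEXICON))  # ('netflix', 0)
```

Under these assumptions, each additional pronunciation string widens the set of inputs that match a word exactly, which is how a multi-pronunciation lexicon targets an average recognition rate rather than any one user's habits.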
Conventional approaches aimed at generating pronunciation strings that satisfy an average recognition rate for unspecified individuals. However, such generic pronunciations could not reflect the pronunciation habits or characteristics of an individual user, and it was difficult to provide a satisfactory voice recognition rate.
In addition, a personalized service that provides an individual pronunciation string lexicon was launched to overcome this limitation, but no specific and reliable method of generating an individual pronunciation string was introduced. The main reason was that the great variance between users made it difficult to determine how to define rules for such generation.
Conventionally, technologies that update a pronunciation string based on a simple pattern analysis of a user log were introduced, but such pattern-based rule generation inevitably includes errors because the voice recognition results in the log include misrecognized words. As a side effect, the conventional pattern-based method of updating a pronunciation string has decreased the existing recognition rate.
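The failure mode described above can be illustrated with a short sketch. It assumes a hypothetical user log of (recognized word, observed pronunciation string) pairs and a hypothetical update rule, naive_update, that adds any pronunciation seen at least min_count times for a word. The names, the data, and the threshold are invented for illustration and do not represent any particular conventional system.

```python
from collections import Counter
from typing import Dict, List, Tuple


def naive_update(
    lexicon: Dict[str, List[str]],
    log: List[Tuple[str, str]],
    min_count: int = 2,
) -> Dict[str, List[str]]:
    """Return a copy of the lexicon extended with frequently logged pronunciations.

    Assumption: the log is trusted as-is, so a misrecognized entry is
    treated the same as a correct one.
    """
    counts = Counter(log)
    updated = {word: list(prons) for word, prons in lexicon.items()}
    for (word, pron), n in counts.items():
        if n >= min_count and pron not in updated.setdefault(word, []):
            updated[word].append(pron)
    return updated


# Hypothetical data: the user said "coke" as "k o u k", but the recognizer
# logged the result as "cook" both times (a misrecognition).
lexicon = {"coke": ["k o k"], "cook": ["k u k"]}
log = [("cook", "k o u k"), ("cook", "k o u k"), ("coke", "k o k")]

updated = naive_update(lexicon, log)
print(updated["cook"])  # ['k u k', 'k o u k']
```

In this sketch the user's pronunciation of one word ends up registered under a different word, so later utterances of the same kind are pushed further toward the wrong result. This is one way a pattern-based update built on an unverified log can lower the existing recognition rate.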