Field of the Disclosure
The present disclosure relates to a voice recognition method and a device therefor that improve voice recognition performance in an electronic device.
Description of the Related Art
In general, various types of electronic devices, such as a smart phone or a tablet PC, may include various voice interfaces capable of recognizing a user's voice and readily performing an operation desired by the user.
Voice interfaces are well known and are in widespread use as part of voice recognition technology, which converts the voice signal of the user, input through a microphone of the electronic device, into an electrical signal and then analyzes the converted electrical signal so as to recognize the user's voice as a command or as text.
Conventional voice recognition technology performs voice recognition after receiving the entire input of a speaker's voice, from beginning to end. In recent years, a voice recognition function to which beamforming is applied has been developed in order to meet a growing need for multi-directional simultaneous voice recognition.
In general, voice recognition operations to which beamforming is applied have a problem in that, when the beamforming direction is not toward the speaker, the user's voice may not be accurately input during a predetermined tracking time (for example, 0.3 seconds) consumed in tracking the user's direction (position). For example, during the tracking time, a first syllable of the voice may not be correctly input, or the syllable may be cut off before it is input to a voice recognition device of an electronic device. Therefore, the electronic device may not correctly receive the user's voice (for example, it may miss all or part of the first syllable), so the voice recognition rate decreases, causing user dissatisfaction. In addition, while general voice recognition performs training using pre-modeled voices, beamforming-applied voice recognition does not perform such training, so the recognition rate for a voice input through beamforming-applied recognition is reduced.
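As a rough illustration of the tracking-time problem described above, the following sketch estimates how much audio is affected while the beam is still being steered toward the speaker. Only the 0.3-second tracking time comes from the description; the 16 kHz sampling rate, the syllable durations, and the function names are assumptions introduced for this example.

```python
def clipped_samples(tracking_time_s: float, sample_rate_hz: int) -> int:
    """Number of audio samples captured while the beam is not yet on the speaker."""
    if tracking_time_s < 0 or sample_rate_hz <= 0:
        raise ValueError("tracking time must be >= 0 and sample rate must be > 0")
    return round(tracking_time_s * sample_rate_hz)


def clipped_fraction(tracking_time_s: float, syllable_duration_s: float) -> float:
    """Fraction of the first syllable that falls inside the tracking time (capped at 1.0)."""
    if syllable_duration_s <= 0:
        raise ValueError("syllable duration must be > 0")
    return min(tracking_time_s / syllable_duration_s, 1.0)


if __name__ == "__main__":
    TRACKING_TIME_S = 0.3    # example value from the description
    SAMPLE_RATE_HZ = 16_000  # assumed sampling rate, not from the description

    print(clipped_samples(TRACKING_TIME_S, SAMPLE_RATE_HZ))
    # Assumed syllable durations: a short syllable is lost entirely,
    # a longer one is only partly affected.
    print(clipped_fraction(TRACKING_TIME_S, 0.2))
    print(clipped_fraction(TRACKING_TIME_S, 0.6))
```

Under these assumptions, a first syllable shorter than the tracking time falls entirely within the period in which the voice may not be correctly input, which corresponds to the case of missing all of the first syllable.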