1. Field
The following description relates to a speech recognition technique, and more particularly to a method and apparatus for recognizing speech and a method and apparatus for generating a noise-speech recognition model.
2. Description of Related Art
Speech recognition has emerged as a core technology for future smart phones and intelligent software. However, a number of technical obstacles need to be addressed before speech recognition can be widely used in industry. A primary objective is to reduce the effects of interfering signals added to speech, that is, the effects of undesirable noise. Herein, noise refers to any kind of signal that may be added to the speech to be recognized. For example, noise may include background noise from the surroundings, communication line distortion, acoustic echo, background music, and the voices of other people. Speech targeted for recognition may include such noise, which degrades the performance of a speech recognition apparatus.
In order to address this drawback, techniques such as speech enhancement, feature compensation, and model adaptation have been developed; however, users still do not experience a drastic improvement in the performance of speech recognition technologies.
Speech enhancement and feature compensation are techniques that infer or restore a clean speech signal from speech mixed with noise by using signal processing and data analysis techniques. Model adaptation, in contrast, is a technique that takes speech mixed with noise into consideration from the outset, when the speech recognition system is constructed.
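By way of a non-limiting illustration, speech mixed with noise of the kind considered in model adaptation may be simulated by adding a noise signal to a clean speech signal at a chosen signal-to-noise ratio (SNR). The following is a minimal sketch only and is not taken from the present description: the function name mix_at_snr, the 16 kHz sample count, the 10 dB target, and the synthetic tone and Gaussian samples used in place of real recordings are all assumptions made for the example.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Return speech plus scaled noise so that the mixture has the requested SNR (dB).

    Hypothetical helper for illustration; not part of any library.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # Choose gain g so that speech_power / (g^2 * noise_power) == 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


rng = random.Random(0)
# Stand-ins for real recordings (assumed): a 440 Hz tone as "speech",
# Gaussian samples as "noise", one second at an assumed 16 kHz rate.
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

In such a sketch, repeating the mixing over many noise types and SNR values would yield the noisy training data that model adaptation relies on, which also makes apparent why covering every possible noise is impractical.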
It is widely viewed that model adaptation yields better speech recognition performance than speech enhancement and feature compensation. However, it is not possible to collect every noise and every instance of speech mixed with noise in the world and to remove such noise; thus, model adaptation has been used only in specific domains, such as speech recognition for an automatic response system (ARS), for example.