1. Field of the Invention
The present invention relates to a speech recognition technique and, more particularly, to a speech recognition method using machine learning that is capable of improving the performance of a spoken chatting system by re-ranking multiple candidate sentences detected as a result of speech recognition of a user's speech and selecting an optimal candidate sentence as the speech recognition result.
2. Description of the Prior Art
Voice is the most common and convenient means of information transfer used by human beings. Speech represented by voice is used as a means for operating various devices as well as a means for communication between human beings.
In recent years, demand for speech recognition as a technique for interfacing between human beings and devices has grown considerably due to advances in computer performance, the development of various media, and advances in signal and information processing technologies.
In speech recognition, when a wave pattern of an input speech signal is given, the most similar pattern is detected by comparing the input wave pattern with reference patterns. The task of detecting the reference pattern that is most similar to the wave pattern of the input speech signal may be summarized as including a learning process of generating the reference patterns and a recognition process of recognizing the input speech signal by using the reference patterns generated in the learning process.
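The learning and recognition processes described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each speech signal has already been reduced to a fixed-length feature vector, that a reference pattern is the average of the training vectors for a label, and that similarity is measured by Euclidean distance. The function names, labels, and numeric values are hypothetical and are not taken from any cited document.

```python
import math


def learn_reference_patterns(training_data):
    """Learning process: average the example vectors of each label
    into a single reference pattern (an assumed, simplified scheme)."""
    references = {}
    for label, examples in training_data.items():
        count = len(examples)
        # zip(*examples) groups the values of each feature dimension together.
        references[label] = [sum(column) / count for column in zip(*examples)]
    return references


def recognize(input_pattern, references):
    """Recognition process: return the label whose reference pattern
    is closest to the input pattern in Euclidean distance."""
    return min(
        references,
        key=lambda label: math.dist(input_pattern, references[label]),
    )


# Hypothetical training vectors for two spoken words.
training_data = {
    "yes": [[1.0, 0.0, 1.0], [0.8, 0.2, 1.0]],
    "no": [[0.0, 1.0, 0.0], [0.2, 0.8, 0.0]],
}

references = learn_reference_patterns(training_data)
result = recognize([0.9, 0.1, 0.9], references)
```

Practical recognizers use far richer acoustic and language models; the sketch only shows how the two processes divide the work between generating reference patterns and matching an input against them.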
As an example of the speech recognition technique, Korean Patent Publication No. KR 10-2009-0119043 discloses an “interactive language learning device”. The interactive language learning device disclosed in the aforementioned Patent Document is configured to include: a phrase recognition unit which counts the number of phrases existing in an input user speech signal by analyzing the user speech signal in a first dialogue level; a sentence searching unit which searches for whether or not a correct answer sentence matching the counted number of phrases exists in the first dialogue level; and a control unit which, in the case where the correct answer sentence matching the counted number of phrases is detected, performs control so that a question sentence in a second dialogue level matching the detected correct answer sentence is output.
In addition, Korean Patent Publication No. KR 10-2000-0032056 discloses an “interactive learning auxiliary device and a dialogue analysis method using the same”. The interactive learning auxiliary device is configured to include: a dictionary storage unit for supplying data necessary for morpheme analysis, syntax analysis, meaning analysis, and discourse analysis; a knowledge-based storage unit for supplying data necessary for dialogue analysis; a speech/text conversion unit for converting a speech input through a microphone into a text having the same meaning; a morpheme analysis unit, a syntax analysis unit, and a meaning interpreting unit which sequentially perform morpheme analysis, syntax analysis, and meaning analysis by comparing data output from the speech/text conversion unit with data of the dictionary storage unit; a discourse analysis unit which performs omission and substitution on the data output from the meaning interpreting unit with reference to the data of the dictionary storage unit; a dialogue manager which compares the data output from the discourse analysis unit with the data of the knowledge-based storage unit to convert the data into lower-level category information of declinable words and performs dialogue act, in-area keyword, and in-area compatibility determination, database query word generation and searching, and the like; and a response generator which compares the data output from the dialogue manager with the data of the speech/text conversion unit to generate a sentence to be supplied to the user based on the dialogue act, the database search results, and the in-area compatibility.
As described above, speech recognition techniques are applied in various fields, such as digital language learning devices and chatting systems.
In particular, the above-described spoken chatting system generates a response by using pattern matching, searching, or a similar method based on the first-rank sentence among the speech recognition results for the user's speech. Therefore, there is a problem in that, if the first-rank sentence is a misrecognized sentence, an erroneous response is always generated. Furthermore, although the performance of speech recognition has improved, speech recognition does not always provide a correct recognition result.
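The problem described above can be seen in a minimal sketch of a response generator that consults only the first-rank sentence. The pattern table, the fallback reply, the function name, and the candidate sentences with their scores are all hypothetical and invented for illustration; they do not reproduce any particular chatting system.

```python
# Hypothetical pattern table mapping recognized sentences to responses.
RESPONSE_PATTERNS = {
    "what is the weather today": "It looks sunny today.",
}
FALLBACK_RESPONSE = "Sorry, I did not understand that."


def respond_from_first_rank(candidates):
    """Generate a response using only the first-rank candidate sentence.

    `candidates` is a list of (sentence, score) pairs sorted by
    descending recognizer score, i.e. an n-best list.
    """
    first_rank_sentence, _score = candidates[0]
    return RESPONSE_PATTERNS.get(first_rank_sentence, FALLBACK_RESPONSE)


# Invented n-best list in which the correct sentence is ranked second.
candidates = [
    ("what is the whether to day", 0.41),  # misrecognized, but ranked first
    ("what is the weather today", 0.39),   # correct, but ranked second
]

reply = respond_from_first_rank(candidates)
# The correct candidate is present in the list, yet it is never consulted.
correct_was_available = any(s in RESPONSE_PATTERNS for s, _ in candidates)
```

Here the correct sentence is available among the candidates, but because only the first-rank sentence is matched, the system falls back to an unhelpful reply.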
Therefore, there is a strong demand in the related art for the development of a technique capable of improving the speech recognition result by using only simple processes.