1. Field
The present disclosure relates to an image display apparatus, a method for driving the same, and a computer-readable recording medium, and more particularly to an image display apparatus, a method for driving the same, and a computer-readable recording medium that can normalize the format of a speech recognition result in order to perform an operation of a device, such as a TV or a portable phone.
2. Description of the Related Art
Interactions between humans and devices have evolved toward greater convenience and naturalness. Among them, speech recognition is the most intuitive and easiest interaction available to a user. Spontaneous speech recognition has been used in various devices because it can recognize a wide vocabulary and interactive sentences. However, a spontaneous speech recognition engine outputs a great variety of vocabulary, and the recognition result sometimes has the same pronunciation and the same meaning as the title of a function or content that is actually executed in a device, but is expressed in a different language, for example, English rather than Korean. Although the speech has been properly recognized, the device matches the recognized text against the title of the function or content, so in this case the function may not be performed. In order to solve this problem, post-processing technologies using correlations and parallel corpora have been proposed. That is, there are various post-processing technologies for reducing the speech recognition error rate and improving the recognition rate.
Most of such technologies relate to methods of improving the recognition rate and reducing the recognition error rate, using either a corpus DB in which errors are matched to correct answers or a system that extracts features from an input speech and determines correlations between the input speech and registered words. Such technologies can improve the accuracy of a sentence produced by a user or correct an error in the recognition result. However, when a spontaneous speech recognition engine is used, the wide vocabulary means that the recognition result may have the same name and pronunciation as the function or content that is actually executed, but a different format. In this case, even if the error in the text is corrected, the function desired by the user may not be performed.
For example, Korean Unexamined Patent Application Publication No. 10-1998-0056234, entitled “Post-processing speech recognition method using correlations,” discloses a post-processing method for a speech recognition system. With this technology, if the result of speech input through a microphone is not found in a registered command set, a correlation is registered by comparing the distances between a specific pattern and the patterns of the currently registered words. Accordingly, when the same speech is input later, it can be recognized because it has already been registered through the correlation.
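The correlation-based post-processing described above can be illustrated with a minimal sketch. It is not taken from the cited publication: the class name `CorrelationRecognizer`, the representation of a speech pattern as a tuple of floats, and the use of Euclidean distance are all assumptions made for illustration only.

```python
import math


class CorrelationRecognizer:
    """Hypothetical sketch: register unseen patterns against the nearest command."""

    def __init__(self, registered):
        # registered: command name -> reference pattern (tuple of floats).
        # The pattern representation is an assumption for this example.
        self.registered = {name: tuple(p) for name, p in registered.items()}
        # Correlations learned so far: unseen pattern -> nearest command name.
        self.correlations = {}

    def recognize(self, pattern):
        pattern = tuple(pattern)
        # 1. The pattern is already in the registered command set.
        for name, reference in self.registered.items():
            if reference == pattern:
                return name
        # 2. The pattern was registered earlier through a correlation.
        if pattern in self.correlations:
            return self.correlations[pattern]
        # 3. Not found: compare distances to every registered word and
        #    register a correlation with the closest one for later inputs.
        nearest = min(
            self.registered,
            key=lambda name: math.dist(pattern, self.registered[name]),
        )
        self.correlations[pattern] = nearest
        return None


recognizer = CorrelationRecognizer(
    {"volume up": (1.0, 0.0), "channel up": (0.0, 1.0)}
)
first_try = recognizer.recognize((0.9, 0.2))   # not yet known
second_try = recognizer.recognize((0.9, 0.2))  # known through the correlation
```

In this sketch the first input of an unknown pattern is not recognized, and the same pattern is recognized on a later input, which mirrors the behavior described for the cited technology.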
Further, Korean Unexamined Patent Application Publication No. 10-2012-0141972, entitled “Method and apparatus for correcting errors of speech recognition,” discloses an error correction technology that uses parallel corpora in a speech recognition system. According to this technology, a correct answer corpus and a parallel corpus that includes a correct answer pair and an error pair as the recognition result are generated. If an erroneous recognition result comes out, the erroneous portion is located in the result and replaced with the matching correct answer from the parallel corpus.
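A parallel-corpus correction of this kind can be sketched as follows. The sketch is an assumption-laden simplification rather than the method of the cited publication: the parallel corpus is modeled as a plain dictionary from an erroneous phrase to its correct answer, the function name `correct_result` is hypothetical, and the sample entries are invented.

```python
def correct_result(result, parallel_corpus):
    """Replace known erroneous portions of a recognition result.

    parallel_corpus maps an erroneous phrase to its correct answer
    (a simplified, hypothetical stand-in for a real parallel corpus).
    """
    # Try longer error phrases first so that a long match is not
    # pre-empted by a shorter phrase contained in it.
    for error in sorted(parallel_corpus, key=len, reverse=True):
        if error in result:
            result = result.replace(error, parallel_corpus[error])
    return result


# Invented sample entries, for illustration only.
corpus = {
    "whether forecast": "weather forecast",
    "volume app": "volume up",
}
corrected = correct_result("show the whether forecast", corpus)
```

A result that contains no known error phrase is returned unchanged, so the correction step can be applied to every recognition result.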
With the development of spontaneous speech recognition, a spontaneous speech engine, that is, speech recognition that uses the recognition result provided from an external server, can recognize a wide range of vocabulary and words. As a side effect, however, the format of the spontaneous speech recognition result may differ from the format of the functions used in the device. Further, as content such as broadcasts, movies, and music is continuously produced, and such content can easily be obtained not only in the country in which it is produced but also abroad through channels such as Youtub*, even the same content is sometimes expressed in a different word or language. Accordingly, in order to accurately recognize and execute content that has the same pronunciation and the same meaning but a title in a different language, for example, English, it is necessary to provide a process of normalizing the format of the speech recognition result, which converts the recognized word into the name of the function or content that is actually executed.
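The kind of format normalization called for here can be pictured with a small lookup sketch. It is only an illustration under stated assumptions and does not represent the disclosed apparatus: the class name `FormatNormalizer`, the alias table, and the sample titles are all hypothetical, and a real system would need a far richer mapping than exact lookup.

```python
def _key(text):
    # Case-insensitive key with collapsed whitespace.
    return " ".join(text.casefold().split())


class FormatNormalizer:
    """Hypothetical sketch: map a recognized text to an executable title."""

    def __init__(self, titles, aliases):
        # titles: names of functions or content the device can execute.
        # aliases: variant text (other language or format) -> executable title.
        self.table = {_key(title): title for title in titles}
        for variant, title in aliases.items():
            if title not in titles:
                raise ValueError(f"unknown title: {title!r}")
            self.table[_key(variant)] = title

    def normalize(self, recognized):
        # Returns the executable title, or None if no mapping is known.
        return self.table.get(_key(recognized))


# Invented sample data, for illustration only.
normalizer = FormatNormalizer(
    titles={"볼륨 올리기", "Movie Player"},
    aliases={"volume up": "볼륨 올리기", "무비 플레이어": "Movie Player"},
)
title = normalizer.normalize("Volume  UP")
```

In this sketch an English recognition result is converted into the Korean title under which the function is assumed to be registered, so the device can match it even though the language of the recognized text differs.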
In other words, in order to address the problems of the related art and to increase the execution rate of the function or content that a user intends to execute, there is a need for a technique of normalizing the format of the speech recognition result for the operation of a device that uses speech recognition, such as a TV.