Textual content is currently the most common form of information representation and usually contains crucial or key phrases. A key phrase can be highlighted for selection by a mark, such as inverse video, an underline, quotation marks, a different color, or a different font. Key phrases can also be marked by using various input tools, such as a keyboard, a mouse, or an input pen. Further, the selected key phrases can be used for an advanced search or a keyword index. For example, a key phrase in a web page of a web site can include a hyperlink for connecting to other web pages, or the key phrase in the web page may be marked by using the mouse and then pasted into various search engines on the Internet to search for relevant articles.
Most types of information representation are textual contents perceived by “sighting”, and only a few are audio contents perceived by “hearing”. Recently, mobile devices have become increasingly popular. Because a mobile device includes a smaller monitor, it is preferable to “hear” messages on the mobile device rather than to “sight” them. Moreover, advanced techniques such as Bluetooth and wireless networks are now available. Therefore, more and more information is represented as audio contents, and how to select a key phrase from the audio contents has become a problem to be solved.
Besides, the textual contents with “sighting” are a parallel representation of the information contents therein, whereas the audio contents with “hearing” are a sequential representation of the information contents therein. Therefore, the existing selecting procedures for the textual contents, such as following a hyperlink or marking the key phrase with the mouse, are not suitable for selecting a key phrase from the audio contents. Accordingly, enabling the user to interact efficiently with the audio contents has become an immediate requirement.
Therefore, the purpose of the present invention is to develop a system and method for selecting audio contents by using speech recognition, so as to deal with the above situations encountered in the prior art.