At present, the technology for accurate keyword detection still has considerable room for improvement. Detection of one keyword or a series of keywords from audio signals could be used for various purposes, which may include storage, transmission, speech recognition, speaker identification, and so forth. For example, a keyword detection mechanism could be useful for an application in which an electronic device is controlled remotely by a human voice. After audio signals have been captured, a signal processing algorithm is required to discern not only the exact words being uttered but also possibly the grammar or sentence structure. Currently, better implementations of algorithms that improve signal quality are still needed to increase the quality and accuracy of keyword detection. Discernment of the exact words can currently be performed by an automatic speech recognition (ASR) engine.
A current keyword detection method may face various difficulties that would need to be addressed. For example, the output of a state-of-the-art ASR engine is still not very accurate, and thus a post-recognition algorithm could be required. It is also important to judge whether a user is present in a desired direction relative to the microphone and to remove interference arriving from undesired directions. This also means that the recording device can listen continuously in the desired direction without being triggered into action by noise from other directions. For real-time applications, reducing computation time is a high priority, and a good buffering strategy is needed to keep the computation time constant.
Moreover, sounds other than human speech, such as background music, would need to be eliminated. After human speech is captured, playback sound from a speaker installed in the electronic device may introduce unwanted echo into a keyword verification system. The unwanted echo would also need to be eliminated. Furthermore, a verification procedure is needed to discern whether a user is having a conversation with another user or is actually issuing a voice command.
Since the above-mentioned challenges would need to be addressed, a novel keyword verification method and a novel keyword verification system could be needed.