1. Field
The present disclosure generally relates to an application of a speech operation and, in particular, to a method, a system, and a computer readable recording medium for filtering out speech interference during a speech operation.
2. Description of Related Art
A conventional speech recognition system emphasizes distinguishing between speech and non-speech portions of a voice input. That is, such a speech recognition system mainly distinguishes actual noises, such as ambient noises or sudden noises (e.g. a clashing sound), from actual speech activities. The adopted method is based on signal processing, in which the differences between acoustic models (e.g. a zero-crossing ratio, energy, a spectral distribution, or a pitch contour) of noises and speech are analyzed, and it is equivalent to attribute detection on voices. When a speech activity region is detected, a process such as speech recognition is performed on the whole speech, wherein the speech recognition system performs the recognition on the whole speech region only once. The recognized result may be used as an instruction to control an electronic device so as to realize a speech operation.
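By way of illustration only, the signal-processing approach described above may be sketched as a frame-level speech activity detector that uses two of the named attributes, short-time energy and zero-crossing ratio. The function names, the frame length of 160 samples, and both thresholds are assumptions introduced for this sketch and are not taken from any particular conventional system; a practical detector would typically also consider the spectral distribution and the pitch contour.

```python
import math


def frame_features(samples, frame_len=160):
    """Return (energy, zero_crossing_ratio) for each non-overlapping frame.

    frame_len=160 is an assumed value (20 ms at an 8 kHz sampling rate).
    """
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of the squared sample values.
        energy = sum(x * x for x in frame) / frame_len
        # Zero-crossing ratio: fraction of adjacent pairs that change sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append((energy, crossings / (frame_len - 1)))
    return features


def detect_speech_frames(samples, frame_len=160,
                         energy_threshold=0.01, zcr_threshold=0.25):
    """Flag a frame as speech-like when it is loud enough and crosses zero
    relatively rarely. Both thresholds are illustrative assumptions."""
    return [
        energy > energy_threshold and zcr < zcr_threshold
        for energy, zcr in frame_features(samples, frame_len)
    ]


# Synthetic three-frame input: silence, a 200 Hz tone standing in for voiced
# speech, and a rapidly alternating signal standing in for a noisy burst.
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(160)]
burst = [0.5 if n % 2 == 0 else -0.5 for n in range(160)]
signal = silence + tone + burst

flags = detect_speech_frames(signal)
```

In this sketch only the tone frame is flagged: the silent frame fails the energy test, and the alternating burst fails the zero-crossing test. Such a detector decides only whether a region contains speech; it does not indicate whether the detected speech was intended as an instruction.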
However, in usage situations where a speech recognition mechanism is required to remain turned on continuously, contents of a conversation between a user and other people may also be recognized. If, during the conversation, the user utters words related to the contents of an instruction for controlling the electronic device, the system may output the instruction to the electronic device. The user, however, does not intend for the instruction to be performed on the electronic device, and may therefore be troubled when the electronic device reacts to the received instruction.