Word spotting is a technology that uses speech recognition to extract, from audio represented by audio data, the portions in which any of a plurality of words registered in a dictionary is spoken. For example, when only a search target word is registered in the dictionary, the technology extracts the portions of the audio in which that word is spoken. Accordingly, the technology may be used for information search on audio. However, unlike a typical character string search on text data, word spotting may produce recognition errors, because the waveform may differ from speaker to speaker even when the same word is spoken.
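As a rough illustration of the extraction described above, the following Python sketch spots dictionary words in a stream that is assumed to have already been decoded into time-stamped phonemes. The function name `spot_words`, the `Phone` tuple layout, and the exact-match criterion are all assumptions made for illustration; a practical word spotter scores acoustic likelihoods rather than requiring identical phoneme labels, which is precisely why recognition errors can arise.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (phoneme label, start time in s, end time in s).
Phone = Tuple[str, float, float]


def spot_words(
    phones: List[Phone], dictionary: Dict[str, List[str]]
) -> List[Tuple[str, float, float]]:
    """Return (word, start, end) for every dictionary word found in the stream.

    A word is reported wherever its phoneme sequence occurs contiguously
    in the decoded stream (exact match, an assumption of this sketch).
    """
    labels = [p[0] for p in phones]
    hits: List[Tuple[str, float, float]] = []
    for word, pron in dictionary.items():
        n = len(pron)
        if n == 0:
            continue  # skip empty pronunciations
        for i in range(len(labels) - n + 1):
            if labels[i:i + n] == pron:
                # Span runs from the first phoneme's start to the last one's end.
                hits.append((word, phones[i][1], phones[i + n - 1][2]))
    hits.sort(key=lambda h: h[1])  # order hits by start time
    return hits
```

Registering only one search target word in `dictionary` yields just the portions in which that word occurs, which corresponds to the search use described above.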
With regard to speech recognition, various technologies have been proposed to improve the recognition rate. For example, there is a known technology that generates words that are similar to the recognition target vocabulary but prone to cause recognition errors at the phoneme level, and uses the generated similar words as rejecting vocabulary. Furthermore, Japanese Laid-open Patent Publication Nos. 2003-330491 and 2006-154658 disclose technologies that evaluate the possibility of a recognition error by analyzing a speech-recognized word, and limit the number of rejecting words to be generated as the possibility of a recognition error for that word becomes higher.
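The idea of generating a rejecting vocabulary at the phoneme level may be sketched as follows. This is a minimal illustration and not the method of the cited publications: the confusion table `CONFUSABLE`, the function name `rejecting_vocabulary`, and the single-substitution rule are hypothetical, and the `max_per_word` parameter merely stands in for whatever criterion is used to limit the number of rejecting words generated.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical table of phonemes that are easily confused with each other.
CONFUSABLE: Dict[str, List[str]] = {
    "p": ["b"], "b": ["p"],
    "t": ["d"], "d": ["t"],
    "k": ["g"], "g": ["k"],
    "s": ["z"], "z": ["s"],
    "m": ["n"], "n": ["m"],
}


def rejecting_vocabulary(
    targets: List[List[str]], max_per_word: int
) -> List[Tuple[str, ...]]:
    """Generate phoneme sequences that differ from a target by one
    confusable phoneme, keeping at most max_per_word per target word."""
    target_set: Set[Tuple[str, ...]] = {tuple(t) for t in targets}
    seen: Set[Tuple[str, ...]] = set()
    rejects: List[Tuple[str, ...]] = []
    for pron in targets:
        candidates: List[Tuple[str, ...]] = []
        for i, phoneme in enumerate(pron):
            for alt in CONFUSABLE.get(phoneme, []):
                cand = tuple(pron[:i] + [alt] + pron[i + 1:])
                # Never reject a real target word, and avoid duplicates.
                if cand in target_set or cand in seen:
                    continue
                seen.add(cand)
                candidates.append(cand)
        # Limit how many rejecting words are kept for this target.
        rejects.extend(candidates[:max_per_word])
    return rejects
```

For instance, under this table a target pronounced `k ae t` yields the rejecting candidates `g ae t` and `k ae d`; registering such candidates alongside the target lets a recognizer discard utterances that match a rejecting word more closely than the target.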