In recent years, as multimedia content such as voice and video has expanded and become widespread, accurate multimedia search technology has come to be in demand. With regard to voice search, research is ongoing into voice search technology that identifies the location where the voice corresponding to a given search word (query) is spoken. In voice search, because of characteristic issues such as the difficulty of voice recognition, a search technique with sufficient performance has not yet been established, in contrast to string search, which identifies a location that includes a desired search word within a string. For this reason, various technologies for realizing voice search with sufficient performance are being researched.
For example, Non-Patent Literature 1 (Keisuke Iwami, Nagisa Sakamoto, Seiichi Nakagawa, “Strict Distance Measure for a Spoken Term Detection Method Based on a Syllable n-gram Index with Distance Metric”, IPSJ Journal, Vol. 54, No. 2, 495-505, (2013.2)) discloses a voice search technique that is based on voice recognition results and that searches voice robustly while taking into account problems such as unknown words and recognition errors.