It is important for an information retrieval apparatus that receives the recognition result of a user's utterance to correctly recognize the phrases (keywords) that are effective for narrowing down information to what matches the user's intention. For example, an apparatus which searches for a television program narrows down programs using the program name, cast name, or the like as keywords. If the apparatus misrecognizes a keyword contained in the user's utterance, it may provide, as a search result, a program different from the one the user wants to view, because the programs are narrowed down based on the wrong phrase.
One method for recognizing a keyword with high accuracy uses the type of the keyword contained in an utterance as a language restriction. To identify the type of a keyword contained in an utterance, a method of automatically extracting a named entity from a speech recognition result has conventionally been proposed. A technique concerning this method is described in, for example, reference 1 “Japanese Patent Laid-Open No. 2004-184951”.
The technique described in reference 1 is a named entity class identification method that uses a language model trained on text annotated with named entity classes. As shown in FIG. 11, a named entity class identification apparatus according to this technique generates a word graph with named entity classes from a speech recognition result. By using a morphological language model with named entity classes, the named entity class identification apparatus outputs the morpheme sequence with named entity classes that maximizes the total probability.
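The idea of selecting the class-tagged morpheme sequence with the highest total probability can be illustrated with a minimal sketch. It is not the implementation of reference 1: the toy word graph `EDGES`, the bigram table `BIGRAM`, the `FLOOR` probability for unseen pairs, and the function name `best_tagged_path` are all hypothetical, and a simple bigram model stands in for the morphological language model.

```python
import math
from collections import defaultdict

START = ("<s>", "O")
FLOOR = 1e-6  # assumed probability for bigrams missing from the table

# Hypothetical word graph: (start_node, end_node, morpheme, named_entity_class).
# Nodes are assumed to be numbered in topological order (start < end).
EDGES = [
    (0, 1, "watch", "O"),
    (1, 2, "sherlock", "TITLE"),
    (1, 2, "sure", "O"),
    (2, 3, "holmes", "TITLE"),
    (2, 3, "homes", "O"),
]

# Hypothetical bigram probabilities over class-tagged morphemes.
BIGRAM = {
    (START, ("watch", "O")): 0.5,
    (("watch", "O"), ("sherlock", "TITLE")): 0.2,
    (("watch", "O"), ("sure", "O")): 0.1,
    (("sherlock", "TITLE"), ("holmes", "TITLE")): 0.8,
    (("sherlock", "TITLE"), ("homes", "O")): 0.05,
    (("sure", "O"), ("holmes", "TITLE")): 0.01,
    (("sure", "O"), ("homes", "O")): 0.3,
}


def best_tagged_path(edges, bigram, floor=FLOOR):
    """Return (log_prob, path) of the highest-probability tagged path."""
    final = max(end for _, end, _, _ in edges)
    # best[node][last_token] = (log_prob, path) of the best path ending there.
    best = defaultdict(dict)
    best[0][START] = (0.0, [])
    # Sorting by start node processes the graph in topological order.
    for start, end, morpheme, ne_class in sorted(edges):
        token = (morpheme, ne_class)
        for prev, (score, path) in list(best[start].items()):
            cand = score + math.log(bigram.get((prev, token), floor))
            if token not in best[end] or cand > best[end][token][0]:
                best[end][token] = (cand, path + [token])
    return max(best[final].values(), key=lambda item: item[0])


log_prob, path = best_tagged_path(EDGES, BIGRAM)
print(path, math.exp(log_prob))
```

In this toy graph the competing hypotheses "sure homes" and "sherlock holmes" share the same time span, and the class-tagged model prefers the path in which both morphemes carry the TITLE class.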
There is also a method of suppressing the decrease in extraction accuracy caused by speech recognition errors when extracting a named entity contained in a speech recognition result. For example, in reference 2 “Sudoh et al., ‘Incorporating Speech Recognition Confidence into Discriminative Models for Named Entity Recognition of Speech’, Proceedings of 1st Spoken Document Processing Workshop, Media Science Research Center in Toyohashi University of Technology, Japan, 2007”, the speech recognition confidence is used as a feature of a discriminative model for extracting a named entity.
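As a rough illustration of using recognition confidence as a feature, the sketch below bins each word's confidence and adds it to the feature set scored by a linear model. It is only an assumption-laden example, not the model of reference 2: the feature names, the bin thresholds, the `WEIGHTS` values, and the functions `token_features` and `ne_score` are hypothetical.

```python
def token_features(tokens, i, bins=(0.3, 0.7)):
    """Build a feature set for token i; tokens are (word, pos, confidence)."""
    word, pos, conf = tokens[i]
    # Discretize the recognizer's confidence (thresholds are assumptions).
    if conf < bins[0]:
        level = "low"
    elif conf < bins[1]:
        level = "mid"
    else:
        level = "high"
    prev_word = tokens[i - 1][0] if i > 0 else "<s>"
    return {
        f"word={word}",
        f"pos={pos}",
        f"prev_word={prev_word}",
        f"conf={level}",
    }


# Hypothetical weights; a real model would learn these from labeled data.
WEIGHTS = {"pos=PROPN": 1.5, "conf=low": -2.0, "conf=high": 0.5}


def ne_score(features, weights):
    """Linear score; a positive value favors labeling the token as an entity."""
    return sum(weights.get(f, 0.0) for f in features)


tokens = [
    ("watch", "VERB", 0.95),
    ("sherlock", "PROPN", 0.90),
    ("holmes", "PROPN", 0.20),  # likely misrecognized word
]
for i, (word, _, _) in enumerate(tokens):
    print(word, ne_score(token_features(tokens, i), WEIGHTS))
```

With these assumed weights, a low-confidence word is less likely to be extracted as a named entity even when its other features look entity-like, which is the effect the confidence feature is meant to achieve.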