A speech recognition device that recognizes, from speech (an utterance) uttered by a user, the word sequence represented by that speech is known. As one speech recognition device of this kind, the speech recognition device described in Patent Document 1 executes a speech recognition process of recognizing the word sequence corresponding to speech based on a plurality of previously stored content-specific language models.
A content-specific language model is a model representing the probability that a specific word appears in a word sequence representing a specific content (a topic, a keyword, or the like). For example, in a word sequence about a TV program, the probability that the name of a program or the name of a personality appears is high, and in a word sequence about sports, the probability that the name of a team, the name of sporting goods, or the name of a player appears is high.
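As an illustration only, a content-specific language model of this kind could be sketched as a per-content unigram model. The toy corpora, the content names `tv_program` and `sports`, the function `train_unigram`, and the use of add-one smoothing are all assumptions made for this sketch and are not taken from Patent Document 1.

```python
from collections import Counter

# Hypothetical toy corpora, one per content (topic); not taken from the source.
CORPORA = {
    "tv_program": "the program host introduced the guest on the program".split(),
    "sports": "the player passed the ball and the team scored".split(),
}

# Shared vocabulary so that every model assigns a probability to every word.
VOCAB = sorted({w for words in CORPORA.values() for w in words})


def train_unigram(words, vocab):
    """Return P(word | content) using add-one smoothing over the shared vocabulary."""
    counts = Counter(words)
    total = len(words) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


# One model per content: MODELS[content][word] is P(word | content).
MODELS = {content: train_unigram(words, VOCAB) for content, words in CORPORA.items()}

if __name__ == "__main__":
    for content, model in MODELS.items():
        print(content, round(model["player"], 3), round(model["program"], 3))
```

In this sketch, a word such as "player" receives a higher probability under the `sports` model than under the `tv_program` model, which is the property the paragraph above describes.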
The content may change in the course of a series of utterances by a user. In this case, if the speech recognition process is executed based on only one content-specific language model, the accuracy of recognition of a word sequence may become extremely low.
Accordingly, the speech recognition device described above is configured to use a different content-specific language model for each predetermined section in one utterance.
[Patent Document 1] Japanese Unexamined Patent Application Publication No. JP-A 2002-229589
However, the speech recognition device described above has a problem that, in a case that the content of the content-specific language model used in a given section does not coincide with the content of the actual utterance in that section, the accuracy of recognition of a word sequence becomes extremely low.
Further, in order to determine which one of the content-specific language models should be used, the speech recognition device executes a process of evaluating the result of recognition obtained when using each of the content-specific language models. Therefore, the speech recognition device described above has a problem that the processing load for determining which one of the content-specific language models should be used is excessively large.
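The source of this processing load can be illustrated with a minimal sketch in which every section is evaluated against every model. The sketch is not the method of Patent Document 1: it substitutes a simple log-likelihood score for a full recognition pass, and the model probabilities, the `FLOOR` value, and the names `log_likelihood` and `select_models` are hypothetical.

```python
import math

# Hypothetical toy models giving P(word | content); values are illustrative only.
MODELS = {
    "tv_program": {"program": 0.4, "host": 0.3, "guest": 0.2, "team": 0.05, "player": 0.05},
    "sports": {"program": 0.05, "host": 0.05, "guest": 0.1, "team": 0.4, "player": 0.4},
}
FLOOR = 1e-6  # assumed probability for words a model has not seen


def log_likelihood(words, model):
    """Score a word sequence under one model (stand-in for a full recognition pass)."""
    return sum(math.log(model.get(w, FLOOR)) for w in words)


def select_models(sections, models):
    """Pick the best-scoring model for each section and count the evaluations made."""
    evaluations = 0
    chosen = []
    for words in sections:
        best_name, best_score = None, -math.inf
        for name, model in models.items():
            score = log_likelihood(words, model)
            evaluations += 1  # one evaluation per (section, model) pair
            if score > best_score:
                best_name, best_score = name, score
        chosen.append(best_name)
    return chosen, evaluations


sections = [["program", "host"], ["team", "player"], ["guest", "program"]]
chosen, evaluations = select_models(sections, MODELS)

if __name__ == "__main__":
    print(chosen, evaluations)
```

The number of evaluations equals the number of sections multiplied by the number of models, so the cost grows with both; in an actual device each evaluation would be a recognition pass rather than the inexpensive score used here.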