A statistical language model represents the probability of appearance of a word in a text (that is, the ease of appearance of the word in the text). Statistical language models are used in a variety of fields, such as speech recognition and machine translation (automatic translation). A technique for generating such a statistical language model is described in, for example, Non-Patent Document 1.
There are a wide variety of techniques for generating a statistical language model. One such technique classifies the words included in a text in accordance with a predetermined criterion and estimates the ease of appearance of a word with higher accuracy based on the resulting classes. Another technique executes a smoothing process when, in estimating the probability of appearance of a word, a highly reliable value cannot be obtained.
Further, it is possible to generate a plurality of statistical language models by combining these techniques. Therefore, a process of selecting one statistical language model from the generated statistical language models, or of generating one from them, is needed. For this purpose, a technique of evaluating the degree to which a statistical language model represents a text, based on an indicator called perplexity, is known (e.g., Non-Patent Document 1).
Perplexity is an indicator based on the entropy of a language, which is obtained when the language is regarded as an information source that generates words. Perplexity corresponds to the average number of words that can be generated at a given position in a text.
For example, a perplexity calculation device using the technique described in Non-Patent Document 1 calculates the perplexity for a predetermined text based on Equation 1 and Equation 2 in the case of using a bigram model (Non-Patent Document 1, p. 58) as a statistical language model (Non-Patent Document 1, p. 37, etc.). In Equation 1 and Equation 2, PP denotes the perplexity, w_i denotes the i-th word included in the text, P(w_i | w_{i-1}) denotes the probability of appearance of the word (ease of appearance of the word), log_2 P(w_i | w_{i-1}) denotes the degree of ease of word appearance, N denotes the total number of words included in the text, and H denotes the entropy of the language obtained from the ease of appearance of the words included in the text.
[Equation 1]
$$PP = 2^{H}$$

[Equation 2]
$$H = -\frac{1}{N}\sum_{i=1}^{N} \log_2 P(w_i \mid w_{i-1})$$

[Non-Patent Document 1] Kenji Kita, “Language and Calculation—4 Statistical Language Model,” University of Tokyo Press, Nov. 25, 1999
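The calculation defined by Equation 1 and Equation 2 can be illustrated with a minimal Python sketch. It is not taken from Non-Patent Document 1. The function name `bigram_perplexity`, the callable `prob` that returns P(w_i | w_{i-1}), and the start symbol `"<s>"` standing in for the word preceding the first word are all assumptions introduced here for illustration.

```python
import math
from typing import Callable, Sequence


def bigram_perplexity(
    words: Sequence[str],
    prob: Callable[[str, str], float],
    start: str = "<s>",  # assumed placeholder for w_0; not specified in the text
) -> float:
    """Return PP = 2^H, with H = -(1/N) * sum_i log2 P(w_i | w_{i-1})."""
    if not words:
        raise ValueError("the text must contain at least one word")

    log_sum = 0.0
    prev = start
    for word in words:
        p = prob(word, prev)  # hypothetical model lookup: P(word | prev)
        if p <= 0.0:
            raise ValueError(f"non-positive probability for {word!r} after {prev!r}")
        log_sum += math.log2(p)  # degree of ease of word appearance
        prev = word

    entropy = -log_sum / len(words)  # Equation 2
    return 2.0 ** entropy            # Equation 1


def uniform_model(word: str, prev: str) -> float:
    """Toy model (assumption): each of 4 vocabulary words is equally likely."""
    return 0.25


perplexity = bigram_perplexity(["a", "b", "c"], uniform_model)
```

With this toy model, every word has probability 1/4, so H is 2 and the perplexity is 4, which matches the interpretation of perplexity as the average number of words that can be generated at a given position. Note that every word contributes to the sum with equal weight.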
However, the abovementioned perplexity calculation device calculates perplexity based only on the degree of ease of word appearance, which is derived from the probability of appearance of a word (ease of appearance of a word), and not on word importance, which represents the degree of importance of a word.
Therefore, the abovementioned perplexity calculation device cannot calculate perplexity on which word importance is reflected. Thus, for example, there is a problem that, with respect to words having relatively high word importance, it is impossible to appropriately evaluate the degree to which a statistical language model represents a text. Moreover, for example, there is a problem that, in the case of selecting or generating a statistical language model based on the calculated perplexity, the accuracy of language processing using the statistical language model becomes low with respect to words having relatively high word importance.