This invention relates to technology for statistical prediction and, in particular, to technology for prediction based on a Bayes procedure.
Conventionally, a wide variety of methods have been proposed for statistically predicting data on the basis of a sequence of data generated from an unknown source. Among these methods, the Bayes prediction procedure is widely known and is described in various textbooks on statistics and related fields.
One problem to be solved by such statistical prediction is that of sequentially predicting, by use of an estimation result, the next data which appear after the data sequence. As regards this problem, it has been proved that a specific Bayes procedure exhibits a very good minimax property when it uses a particular prior distribution, which may be referred to as the Jeffreys prior distribution. Such a specific Bayes procedure will be called the Jeffreys procedure hereinafter. This proof was given by B. Clarke and A. R. Barron in an article entitled “Jeffreys prior is asymptotically least favorable under entropy risk”, published in Journal of Statistical Planning and Inference, 41:37-60, 1994. The procedure is guaranteed to be always optimum whenever the probability distribution hypothesis class is assumed to be a general smooth model class, although some mathematical restrictions are required in a strict sense.
Herein, let the logarithmic regret be used as another index. In this event also, it has been proved that the Jeffreys procedure has a minimax property, on the assumption that the probability distribution hypothesis class belongs to an exponential family. This proof was given by J. Takeuchi and A. R. Barron in a paper entitled “Asymptotically minimax regret for exponential families”, in Proceedings of the 20th Symposium on Information Theory and Its Applications, pp. 665-668, 1997.
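For reference, the two indices may be written as follows. This is only a sketch in commonly used notation; the symbols (the hypothesis p_θ, the predictor's joint distribution q, the data sequence x^n, and the maximum likelihood estimate θ̂) are assumptions introduced here for illustration and are not taken from the cited documents.

```latex
% Redundancy: expected excess logarithmic loss when x^n is generated from p_\theta
R_n(q, \theta) = \mathbb{E}_{x^n \sim p_\theta}
  \left[ \log \frac{p_\theta(x^n)}{q(x^n)} \right]

% Logarithmic regret: excess logarithmic loss on an individual sequence x^n,
% measured against the best hypothesis chosen in hindsight
r_n(q, x^n) = \log \frac{p_{\hat\theta(x^n)}(x^n)}{q(x^n)},
\qquad \hat\theta(x^n) = \arg\max_{\theta} p_\theta(x^n)

% Minimax criteria for the two indices
\min_q \max_\theta R_n(q, \theta)
\qquad \text{and} \qquad
\min_q \max_{x^n} r_n(q, x^n)
```

The redundancy averages over data generated from a hypothesis in the class, whereas the logarithmic regret is evaluated on each individual data sequence without assuming that the data follow any hypothesis.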
Furthermore, the problem of sequential prediction can be replaced by the problem of providing a joint (or simultaneous) probability distribution of a data sequence, which is obtained by cumulatively multiplying the prediction probability distributions.
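The equivalence between sequential prediction and a joint distribution can be illustrated with a minimal sketch. It assumes, purely for illustration, a binary (Bernoulli) hypothesis class, for which the Jeffreys prior is the Beta(1/2, 1/2) distribution; the function names are hypothetical and do not appear in the cited documents.

```python
import math


def jeffreys_predictive(ones, n):
    """Probability that the next symbol is 1, given `ones` ones among the
    first `n` symbols, under the Beta(1/2, 1/2) (Jeffreys) prior."""
    return (ones + 0.5) / (n + 1.0)


def joint_by_chain_rule(bits):
    """Joint probability obtained by cumulatively multiplying the
    sequential prediction probabilities."""
    prob = 1.0
    ones = 0
    for n, b in enumerate(bits):
        p1 = jeffreys_predictive(ones, n)
        prob *= p1 if b == 1 else 1.0 - p1
        ones += b
    return prob


def joint_closed_form(bits):
    """Joint probability of the same sequence computed directly as the
    Bayes mixture B(k + 1/2, n - k + 1/2) / B(1/2, 1/2)."""
    n, k = len(bits), sum(bits)

    def log_beta(a, b):
        return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    return math.exp(log_beta(k + 0.5, n - k + 0.5) - log_beta(0.5, 0.5))


if __name__ == "__main__":
    sequence = [1, 0, 1, 1, 0]
    print(joint_by_chain_rule(sequence))  # product of the predictions
    print(joint_closed_form(sequence))    # direct mixture probability
```

Both functions return the same value for any binary sequence, which is the sense in which a sequential predictor and a joint distribution over the whole sequence describe the same object.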
These proofs suggest that, when the performance measure is the logarithmic loss, the Jeffreys procedure can have excellent performance not only in the case where the prediction problem is sequential but also in other cases.
Thus, it has been proved by Clarke and Barron and by Takeuchi and Barron that the Bayes procedure is effective when the Jeffreys prior distribution is used. However, in the case where the performance measure is the logarithmic regret instead of the redundancy, the Bayes procedure is effective only when the model class of the probability distribution is restricted to the exponential family, which is a very special class.
Under the circumstances, assume that the probability distribution model class belongs to a general smooth model class which is different from the exponential family. In this case, the Jeffreys procedure described in the above-mentioned document by B. Clarke and A. R. Barron is not guaranteed to have the minimax property. On the contrary, the instant inventors have confirmed that the Jeffreys procedure does not have the minimax property in this case.
Furthermore, a similar reduction in performance often takes place in a general Bayes procedure different from the Jeffreys procedure when evaluation is made by using the logarithmic regret in lieu of the redundancy.