1. Field of the Invention
The present invention relates to a hierarchical latent variable model estimation device and a hierarchical latent variable model estimation method for estimating a hierarchical latent variable model for multivariate data, and a computer-readable recording medium having recorded thereon a program for estimating a hierarchical latent variable model for multivariate data.
2. Description of the Related Art
Data such as sensor data acquired from a car, sales records of a store, and electricity demand history is observed and accumulated under the influence of various factors. For example, sensor data acquired from a car varies depending on the driving mode. Such data is thus accumulated as observation values resulting not from a single factor but from multiple factors.
Analyzing the factors that give rise to such data is applicable to industrially important situations. As an example, analyzing the cause of a car malfunction enables quick repair of the car. As another example, analyzing correlations between sales and weather and/or time of day enables reduction of stockout or overstock. As yet another example, recognizing an electricity demand pattern enables prevention of excess or shortage of electricity.
Moreover, if it is possible to analyze how switching among the plurality of factors takes place, prediction can be performed by combining the knowledge obtained for each factor. In addition, the switching rule itself can be used as knowledge, for example for marketing. Such analysis is therefore applicable to more sophisticated situations.
To separate the above-mentioned data resulting from the plurality of factors on a factor-by-factor basis, a mixture latent variable model is typically used in modeling. As a model that includes the above-mentioned switching rule, a hierarchical latent variable model has been proposed (for example, see Non Patent Literature (NPL) 1).
In order to use such a model, it is necessary to determine the number of hidden states, the type of observation probability distribution, and the distribution parameters. In the case where the number of hidden states and the type of observation probability distribution are known, the parameters can be estimated using, for example, the expectation-maximization (EM) algorithm described in NPL 2. Hence, the important question is how to determine the number of hidden states and the type of observation probability distribution.
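The parameter estimation step mentioned above can be illustrated with a minimal sketch. It is not the procedure of NPL 2 itself, but a textbook expectation-maximization loop for a one-dimensional Gaussian mixture, under the assumption that the number of hidden states (here two) and the type of observation probability distribution (here Gaussian) are already known. The function names, the quantile-based initialization, and the synthetic data are all hypothetical choices made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gaussian_mixture(data, n_components, n_iter=100):
    """Estimate weights, means, and variances of a 1-D Gaussian mixture.

    The number of components is assumed to be known in advance.
    """
    n = len(data)
    ordered = sorted(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n

    # Initialization (an arbitrary choice for this sketch):
    # equal weights, means at evenly spaced quantiles, shared variance.
    weights = [1.0 / n_components] * n_components
    means = [ordered[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    variances = [overall_var] * n_components

    for _ in range(n_iter):
        # E-step: posterior probability of each hidden state for each sample.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300  # guard against numerical underflow
            resp.append([j / total for j in joint])

        # M-step: re-estimate the parameters from the weighted samples.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # keep the variance strictly positive

    return weights, means, variances


# Synthetic data generated from two hypothetical factors.
rng = random.Random(1)
data = ([rng.gauss(-3.0, 1.0) for _ in range(300)]
        + [rng.gauss(3.0, 1.0) for _ in range(300)])
weights, means, variances = em_gaussian_mixture(data, n_components=2)
print(weights, means, variances)
```

Note that `n_components` and the Gaussian form are fixed inputs of this sketch; the loop only fits the distribution parameters and gives no indication of whether two components or a Gaussian distribution were appropriate choices.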
The problem of determining the number of hidden states and the type of observation probability distribution is commonly referred to as a “model selection problem” or a “system identification problem”, and is an extremely important problem for constructing a reliable model. Various methods for determining the number of hidden states and the type of observation probability distribution have accordingly been proposed.
As a method for determining the number of hidden states, for instance, a method of maximizing variational free energy by a variational Bayesian method has been proposed (for example, see NPL 3). As another method for determining the number of hidden states, a nonparametric Bayesian method using a hierarchical Dirichlet process prior distribution has been proposed (for example, see NPL 4).
Further, NPL 5 describes a method for determining the type of observation probability distribution for a mixture model, which is a typical example of a latent variable model, by approximating a complete marginal likelihood function and maximizing its lower bound.
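For orientation, the general form of a lower bound on a marginal log-likelihood can be written as follows. This is the standard bound obtained from Jensen's inequality and is not necessarily the exact approximation used in NPL 5. All symbols are assumptions introduced here: $x^N$ denotes the observed data, $z^N$ the latent variables, $\theta$ the distribution parameters, $M$ the model (number of hidden states and type of observation probability distribution), and $q$ an arbitrary distribution over the latent variables.

```latex
\log p(x^N \mid M)
  = \log \sum_{z^N} p(x^N, z^N \mid M)
  \;\ge\; \sum_{z^N} q(z^N) \log \frac{p(x^N, z^N \mid M)}{q(z^N)},
\qquad
p(x^N, z^N \mid M) = \int p(x^N, z^N \mid \theta)\, p(\theta \mid M)\, d\theta .
```

Equality holds when $q(z^N)$ equals the posterior $p(z^N \mid x^N, M)$, so maximizing the right-hand side over $q$ and over candidate models $M$ is one way to compare models by their marginal likelihood.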