1. Field of the Invention
The present invention relates to a latent variable model estimation apparatus and method for multivariate data having sequential dependence, and to a computer-readable recording medium in which a latent variable model estimation program is recorded. More particularly, the present invention relates to a latent variable model estimation apparatus and method that estimate a latent variable model of multivariate data having sequential dependence by approximating a model posterior probability and maximizing a lower bound of the model posterior probability, and to a computer-readable recording medium in which a latent variable model estimation program is recorded.
2. Description of the Related Art
Many kinds of data have sequential dependence. Examples include data having temporal dependence, text, which depends on a character sequence, and genetic data, which depends on a base sequence.
Data typified by sensor data acquired from an automobile, a laboratory data history of medical checkups, and an electric demand history are multivariate data having sequential dependence (in these examples, temporal dependence). Analysis of such data is applied in many industrially important fields. For example, it is conceivable that the cause of an automobile breakdown can be identified, and a quick repair implemented, by analyzing sensor data acquired from the automobile. It is also conceivable that the risk of a disease can be estimated, and the disease prevented, by analyzing the laboratory data history of medical checkups. It is also conceivable that electric demand can be predicted, so as to prepare for excess or deficiency, by analyzing the electric demand history.
Generally, such data are modeled using a latent variable model having sequential dependence (for example, a hidden Markov model). For example, in order to use the hidden Markov model, it is necessary to decide a latent state number, a kind of an observation probability distribution, and a distribution parameter. In the case that the latent state number and the kind of the observation probability distribution are known, the parameter can be estimated using an expectation maximization method (for example, see Non Patent Literature (NPTL) 1).
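As an illustration of the parameter estimation described above, the following is a minimal sketch of the expectation maximization method for a hidden Markov model, assuming a fixed latent state number and a discrete (categorical) observation distribution. It is not taken from NPTL 1. The function names `forward_backward` and `baum_welch`, the iteration count, and the example values are hypothetical and chosen only for illustration.

```python
import math


def forward_backward(obs, pi, A, B):
    """Scaled forward-backward pass (E-step) for a discrete-observation HMM.

    obs: list of observed symbol indices
    pi:  initial state probabilities, length K
    A:   K x K transition probabilities
    B:   K x M observation probabilities
    Returns state posteriors, pairwise posteriors, and the log-likelihood.
    """
    T, K = len(obs), len(pi)
    alpha = [[0.0] * K for _ in range(T)]
    beta = [[1.0] * K for _ in range(T)]
    scale = [0.0] * T

    # Forward pass; each step is normalised to avoid numerical underflow.
    for t in range(T):
        for j in range(K):
            if t == 0:
                prior = pi[j]
            else:
                prior = sum(alpha[t - 1][i] * A[i][j] for i in range(K))
            alpha[t][j] = prior * B[j][obs[t]]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]

    # Backward pass, reusing the forward scaling factors.
    for t in range(T - 2, -1, -1):
        for i in range(K):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(K)
            ) / scale[t + 1]

    # Posterior over the state at each step, and over consecutive state pairs.
    gamma = [[alpha[t][i] * beta[t][i] for i in range(K)] for t in range(T)]
    xi = [
        [
            [
                alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                / scale[t + 1]
                for j in range(K)
            ]
            for i in range(K)
        ]
        for t in range(T - 1)
    ]
    loglik = sum(math.log(s) for s in scale)
    return gamma, xi, loglik


def baum_welch(obs, pi, A, B, n_iter=10):
    """Alternate E-steps and M-steps; the inputs are not modified."""
    K, M = len(pi), len(B[0])
    history = []  # log-likelihood under the parameters used in each E-step
    for _ in range(n_iter):
        gamma, xi, loglik = forward_backward(obs, pi, A, B)
        history.append(loglik)

        # M-step: re-estimate parameters from the expected counts.
        pi = list(gamma[0])
        new_A, new_B = [], []
        for i in range(K):
            visits = sum(g[i] for g in gamma[:-1])
            new_A.append([sum(x[i][j] for x in xi) / visits for j in range(K)])
            total = sum(g[i] for g in gamma)
            new_B.append([
                sum(g[i] for g, o in zip(gamma, obs) if o == m) / total
                for m in range(M)
            ])
        A, B = new_A, new_B
    return pi, A, B, history


# Hypothetical example: two latent states, binary observations.
obs = [0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1]
pi0 = [0.6, 0.4]
A0 = [[0.7, 0.3], [0.4, 0.6]]
B0 = [[0.8, 0.2], [0.3, 0.7]]
pi_hat, A_hat, B_hat, history = baum_welch(obs, pi0, A0, B0)
```

In this sketch the latent state number (two) and the kind of the observation probability distribution (categorical) are supplied by the caller. The expectation maximization method estimates only the parameters and does not itself decide either of those quantities.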
The problem of determining the latent state number or the kind of the observation probability distribution is generally called a “model selection problem” or a “system identification problem”, and is important for constructing a reliable model. Therefore, various technologies have been proposed.
For example, NPTL 2 proposes, as a method for deciding the latent state number, a method for maximizing variational free energy by a variational Bayesian method. NPTL 3 proposes, as another method for deciding the latent state number, a non-parametric Bayesian method in which a hierarchical Dirichlet process prior distribution is used.
In NPTL 4, a complete marginal likelihood function is approximated for a mixture model, which is a representative example of a latent variable model having no temporal dependence, and a lower bound of the function is maximized.