The exemplary embodiment relates to the modeling of independent complex systems over time. It finds particular application in the detection of soft failures in device infrastructures and will be described with particular reference thereto. However, it is to be appreciated that the exemplary embodiment is also applicable to the modeling of other systems based on discrete observations.
There are many applications where it is desirable to decompose one or more complex processes, called observations, into a small set of independent processes, called sources, where the sources are hidden. The following tasks are of particular interest:
inference: given a sequence of observations and knowing the link between sources and observations, find the values of the sources that best fit what has been observed.
learning: given a large number of observations, find the link between sources and observations that best fits the observations.
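One common way to make these two tasks precise, offered here only as an illustrative reading of "best fit", is maximum a posteriori estimation for inference and maximum likelihood estimation for learning. The symbols below are assumptions introduced for this sketch and do not appear in the original text: $y_{1:T}$ is the observation sequence, $x_{1:T}$ the hidden source sequence, and $\theta$ the parameters linking sources to observations.

```latex
% Inference: most probable source values given observations and a known link \theta
\hat{x}_{1:T} = \arg\max_{x_{1:T}} \; p\left(x_{1:T} \mid y_{1:T}, \theta\right)

% Learning: link parameters under which the observations are most probable,
% with the hidden sources summed out
\hat{\theta} = \arg\max_{\theta} \; \sum_{x_{1:T}} p\left(x_{1:T}, y_{1:T} \mid \theta\right)
```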
One example of this problem is in an infrastructure of shared devices, such as a network of printers, where an administrator has the task of maintaining the shared devices in an operational state. Although some device malfunctions result in the sending of a message by the device itself, or result in a catastrophic failure which is promptly reported by a user, other malfunctions, known as “soft failures,” are not promptly reported to the administrator. This is because the device does not become unavailable, but rather suffers a malfunction, degradation, improper configuration, or other non-fatal problem. When a particular device undergoes a soft failure, the pattern of usage of the device often changes. Users who would typically use the device when it is functioning normally tend to make more use of other devices in the network which provide a more satisfactory output. Since soft failures result in productivity losses and add other costs to the operators of the network, it is desirable to detect their occurrence promptly.
Statistical models, such as Hidden Markov Models (HMM) and Independent Component Analysis (ICA), have been used to model processes in which some variables are hidden, but are assumed to be statistically related to observed variables. The HMM makes certain assumptions, including that the values of the hidden variables (states) depend only upon previous values of the hidden variables, that the value of each hidden variable is independent of the values of the other hidden variables, and that the values of the observed variables depend only on the current values of the hidden variables. Under these assumptions, a time sequence of values of the hidden variables is inferred from the temporal variation of the observed variable values and knowledge of the parameters of the stochastic process relating the observed variables to the hidden ones. A known extension of the HMM approach is the factorial hidden Markov model (FHMM) described, for example, in Z. Ghahramani and M. I. Jordan, “Factorial Hidden Markov Models,” in David S. Touretzky, Michael C. Mozer, and Michael E. Hasselmo, editors, Advances in Neural Information Processing Systems, volume 8, pp. 472-478 (MIT Press, 1996), hereinafter, “Ghahramani, et al.”
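As a minimal sketch of the inference step described above, the following uses the standard Viterbi algorithm on an ordinary single-chain HMM with discrete observations. It is not the factorial model and not the method of the exemplary embodiment. The two hidden states, the two observation symbols, and all probability values are hypothetical and chosen only for illustration.

```python
import math

def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for a discrete HMM.

    obs      list of observation symbols (integers)
    start_p  start_p[s]    = P(first state is s)
    trans_p  trans_p[r][s] = P(next state is s | current state is r)
    emit_p   emit_p[s][o]  = P(observation o | state s)
    """
    n_states = len(start_p)
    # Log-probability of the best path ending in each state at time 0.
    score = [math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
             for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at time t
    for o in obs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + math.log(trans_p[r][s]))
            pointers.append(best_prev)
            new_score.append(score[best_prev]
                             + math.log(trans_p[best_prev][s])
                             + math.log(emit_p[s][o]))
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical example: hidden state 0 = device normal, 1 = soft failure.
# Observation 0 = a user prints on the device, 1 = the user goes elsewhere.
start_p = [0.9, 0.1]
trans_p = [[0.9, 0.1], [0.1, 0.9]]
emit_p = [[0.8, 0.2], [0.2, 0.8]]
print(viterbi([0, 0, 0, 1, 1, 1], start_p, trans_p, emit_p))
# prints [0, 0, 0, 1, 1, 1]
```

With these assumed parameters, the shift in usage after the third observation is explained most plausibly by a change of hidden state, which is the intuition behind detecting soft failures from usage patterns.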
In FHMM, exact inference has a complexity which is exponential in the number of hidden dynamics, and approximate inference techniques are generally required. Jordan and Ghahramani proposed a variational inference framework to estimate the parameters (see Ghahramani, et al., above, and Michael I. Jordan, Zoubin Ghahramani, Tommi S. Jaakkola, and Lawrence Saul, “An Introduction to Variational Methods for Graphical Models,” in Michael I. Jordan, ed., Learning in Graphical Models (Kluwer Academic Publishers, Boston, 1998), hereinafter, “Jordan, et al.”).
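The exponential growth mentioned above can be seen by treating the factorial model as one large HMM over joint states. The sketch below is illustrative only. The function names and the per-chain transition values are hypothetical, and it assumes every chain has the same number of states and evolves independently of the others.

```python
from itertools import product

def joint_states(n_chains, n_states_per_chain):
    """Enumerate every combination of per-chain hidden states."""
    return list(product(range(n_states_per_chain), repeat=n_chains))

def joint_transition_prob(prev, curr, chain_trans):
    """Probability of moving between two joint states.

    Because the chains are assumed to evolve independently, the joint
    probability is the product of the per-chain transition probabilities.
    chain_trans[m][i][j] = P(chain m moves from state i to state j).
    """
    p = 1.0
    for m, (i, j) in enumerate(zip(prev, curr)):
        p *= chain_trans[m][i][j]
    return p

# Hypothetical per-chain transition matrix, reused for every chain.
one_chain = [[0.9, 0.1], [0.2, 0.8]]

for n_chains in (1, 2, 5, 10):
    n_joint = len(joint_states(n_chains, 2))
    # A naive forward recursion visits every (previous, current) pair of
    # joint states at each time step.
    print(n_chains, "chains:", n_joint, "joint states,",
          n_joint ** 2, "state pairs per time step")
```

With K states per chain and M chains there are K^M joint states, so even this naive enumeration becomes impractical for a modest number of hidden dynamics, which is why approximate schemes such as variational inference are used.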
Existing FHMM implementations generally operate on observed variables that are continuous. For example, the variational inference framework of Ghahramani, et al. is limited to continuous (Gaussian) observation variables. The hidden states, on the other hand, are assumed to be discrete, and the number of possible states for a given hidden dynamic is an input parameter to the FHMM analysis.
In many practical applications, however, the observed variables are also discrete, such as in the soft failure problem described above. It would be desirable to determine the printer states of printers of a digital network (the hidden parameters) based on observed choices of printer destination made by users (the observed parameters). Each choice of printer destination is a discrete observation limited to N discrete values where N is the number of available printers on the network. It would be advantageous if the set of usage observations could be employed to glean an understanding of the hidden underlying states of the devices in the network using a probabilistic model.
This problem is not limited to networked devices, such as printers, however, but is very general, since it applies to the analysis of any kind of process producing sequential, high-dimensional discrete data which must be decomposed into a smaller set of factors to be tractable.
Attempts to analyze FHMM with discrete observations have been less than fully satisfactory. While Jordan, et al. suggests that discrete observations may be accommodated using sigmoid functions, no algorithm is provided. The suggested approach of employing sigmoid functions, even if actually implementable, would have the disadvantage that the sigmoid functions are not well-representative of occurrences of (presumed) independent discrete events such as print job submissions. Other proposed approaches for accommodating discrete observations have been limited to discrete binary observations, and are not readily extendible to observations that can take three or more different discrete values.
There remains a need for an efficient model for solving the inference and learning tasks of FHMMs when the observations are discrete-valued, such as event counts.