The invention relates in general to modeling and in particular to generating hidden Markov models from state and transition data.
In U.S. Pat. No. 6,965,861, the inventors discuss hidden Markov models (HMMs) as a class of statistical models used in modeling discrete time-series data. Problems that naturally give rise to such data include robot navigation, machine vision, and signal processing, and HMMs are at the core of many state-of-the-art algorithms for addressing these problems. In addition, many problems of natural language processing involve time-series data that may be modeled with HMMs, including part-of-speech tagging, topic segmentation, speech recognition, generic entity recognition, and information extraction.
The U.S. Patent and Trademark Office database shows more than 1,200 hits for “hidden markov model” as of Nov. 15, 2005. HMM technology appears in numerous fields including, but not limited to, voice recognition, handwriting recognition, signal processing and genetic engineering. It is a fundamental tool for uncovering state systems within complex data sets of real-world phenomena. However, many techniques for arriving at an HMM representative of such complex data are highly empirical. Thus, there is a need for improved methods to generate an HMM from such data sets, and to test and/or change the complex systems in accordance with the HMM.
This invention arises from studies of mouse sleep stage data, iterating related art techniques originally designed for studying ion channels (“Maximum likelihood estimation of aggregated Markov processes” Proceedings of the Royal Society B, Vol. 264, No. 1380; pp. 375-383, Mar. 22, 1997). Extending prior art that optimizes the parameters of a fixed graph, this invention presents a method to arrive at the “best” or most likely graphical model. This method is a data processing technique to identify hidden Markov model (HMM) state machines in physical, chemical, biological, physiological, social and economic systems. Unlike speech processing prior art, for example, this invention does not choose from a library of pre-determined left-to-right models, or any other library, but determines a new model from each new set of data.
A state machine is a concept that is used to describe a system that transitions from one state to another state and from there back to the original state or into other states, and so on. Dwell time is the time spent in any one state. Dwell times and transitions between states can be observed, but they are often aggregations that cannot be distinguished by limited or indirect observations. The observed state machine may include invisible transitions among indistinguishable states in the same class of aggregated states, or indistinguishable transitions between different members of two aggregated states. In a Markov system, transitions are instantaneous and random; the probability per unit time of a transition at a given time from one state to another ideally depends only on the rate of that transition and the state at that time, and not on the history of the system. These transition rates allow otherwise identical states to be distinguished, in that states with different exit transition rates will generally have different dwell time distributions. Observations are made over a period known as an epoch, a frame or a sampling interval, and for each of these a class or aggregated state is assigned. The aggregated states thus can easily be distinguished in histograms of their observed dwell times. Until now, the aggregated transitions were not in general so easy to distinguish. In fact, some ideal hidden Markov models are indistinguishable by their steady-state statistics (“Using independent open-to-closed transitions to simplify aggregated Markov models of ion channel gating kinetics” PNAS 2005 102: 6326-6331, hereinafter “Pearson”).
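The relationship between exit rates and dwell times described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the three hidden states, their rate values, and the names RATES, CLASS_OF, mean_dwell and simulate_observed_dwells are hypothetical and are not taken from the mouse sleep data or from the method of the invention. In an ideal Markov system the dwell time in a single hidden state is exponentially distributed with a rate equal to the sum of that state's exit rates, and an observer who sees only the aggregated class records dwell times that merge visits to all hidden states of that class.

```python
import random

# Hypothetical model: hidden states A1 and A2 share observed class "A".
# RATES[i][j] is the transition rate (per unit time) from hidden state i to j.
RATES = {
    "A1": {"A2": 0.5, "B": 0.1},
    "A2": {"A1": 0.2},
    "B": {"A1": 1.0},
}
CLASS_OF = {"A1": "A", "A2": "A", "B": "B"}


def mean_dwell(state, rates=RATES):
    """Mean dwell time in one hidden state: 1 / (sum of its exit rates)."""
    return 1.0 / sum(rates[state].values())


def simulate_observed_dwells(n_jumps, start="A1", seed=0,
                             rates=RATES, class_of=CLASS_OF):
    """Simulate n_jumps hidden transitions and return (class, dwell) pairs.

    Hidden transitions inside one class (A1 <-> A2) are invisible to the
    observer, so their dwell times are merged into a single observed dwell.
    The final, unfinished dwell is discarded.
    """
    rng = random.Random(seed)
    state = start
    observed = []
    current_class, current_dwell = class_of[state], 0.0
    for _ in range(n_jumps):
        exits = rates[state]
        # Exponential waiting time with rate equal to the total exit rate.
        current_dwell += rng.expovariate(sum(exits.values()))
        # Next state is chosen with probability proportional to its rate.
        state = rng.choices(list(exits), weights=list(exits.values()))[0]
        if class_of[state] != current_class:
            observed.append((current_class, current_dwell))
            current_class, current_dwell = class_of[state], 0.0
    return observed
```

A histogram of the class "A" dwell times returned by simulate_observed_dwells would be expected to show more than one exponential component, which is the kind of evidence for hidden states that the text refers to.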
In reality, the most interesting systems have external inputs, are out of equilibrium, do not have constant transition rates, or are otherwise fundamentally not steady-state, and thus are not subject to Pearson's canonical equivalence. For such real systems, graph isomorphism is the only organizing principle; the nonphysical, negative transition rates of Pearson's tortured canonical forms are fortunately avoided, and there is little ambiguity in distinguishing models by how they fit real data. This invention identifies “best” hidden Markov models up to isomorphism, i.e., up to a relabeling of the graph that preserves adjacency.
Physiological and biological processes often resemble state machines. For example, the sleep cycle of mice includes states identified as rapid eye movement (REM) sleep, slow wave sleep and awake. These three states are readily identified in EEG polysomnography studies and, at first glance, a simple 3-state machine emerges with transitions between all states (except that transitions directly from awake to REM sleep are not observed). The transitions occur randomly without apparent outside stimulus, and so the state machine can be considered a Markov system. However, histograms of the 3 observed state dwell times indicate that there are multiple hidden states for each of the observed states. How to connect these 6 or more hidden states with hidden transitions is not at all clear, and in fact the number of possible connected graphical models increases combinatorially with the number of states and transitions. The hidden Markov model has states and transitions that are not readily apparent from the data but nevertheless are real components of the system that is represented by the Markov model. By uncovering the hidden Markov model, investigators learn more about the underlying processes and are better able to explain the phenomena of studied physical, chemical, biological, physiological, social and economic systems and to craft experiments to measure how variables will affect the systems.
Markov models allow the observer to make predictions about possible results if the system is activated in different ways. For example, data from a control Markov system may be compared with data from an experimental Markov system to see if the variables between the control and experimental systems generate changes on the system level, i.e., whether they create different states and different transitions between states. Comparing control and experimental Markov systems gives information not only about the gross differences between the control and experimental systems but also about the way in which those differences are manifested in the operation of the system. In our analysis of very limited mouse sleep data, for example, we discover plausible wild-type mouse sleep cycles, and that the double knock-out mice have dramatic changes in their sleep models, a result that could not be determined by gross observation of single knock-out mice (see Joho).
Complex systems can be defined by Markov models, but it is difficult to identify the model when there are hidden states. Investigators searching for hidden Markov models often rely on empirical methods to identify them. However, complex systems will often have a combinatorially increasing number of possible Markov models. In order to evaluate potential hidden Markov models, one must compare numerous Markov models with every conceivable hidden state and transition between states. For example, for a mouse sleep model with up to 16 degrees of freedom (i.e., up to 8 transitions), the candidate models include all connected graphs of up to 8 edges and up to 9 states from 3 distinct observable classes (colors). There are 762,291 such distinct (nonisomorphic) graphs.
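The notion of counting candidate models up to isomorphism can be illustrated on a very small scale with the following Python sketch. It is a brute-force illustration under stated assumptions, not the enumeration procedure of the invention: it assumes simple undirected graphs in which each edge stands for one transition, treats two graphs as the same model when a relabeling that preserves both adjacency and vertex color maps one onto the other, and uses the hypothetical names is_connected and count_models. It does not attempt to reproduce the figure of 762,291, and it is practical only for a handful of states because it tries every edge subset and every color-preserving relabeling.

```python
from itertools import combinations, permutations


def is_connected(n, edges):
    """Depth-first search over an undirected edge list on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def count_models(colors, max_edges):
    """Count connected graphs on len(colors) vertices with at most max_edges
    edges, up to relabelings that preserve vertex color."""
    n = len(colors)
    # Keep only relabelings that send each vertex to one of the same color.
    perms = [p for p in permutations(range(n))
             if all(colors[p[v]] == colors[v] for v in range(n))]
    all_edges = list(combinations(range(n), 2))
    canonical = set()
    # A connected graph on n vertices needs at least n - 1 edges.
    for k in range(n - 1, max_edges + 1):
        for edges in combinations(all_edges, k):
            if not is_connected(n, edges):
                continue
            # Canonical form: the smallest relabeled, sorted edge list.
            canonical.add(min(
                tuple(sorted(tuple(sorted((p[a], p[b]))) for a, b in edges))
                for p in perms))
    return len(canonical)


if __name__ == "__main__":
    # One vertex per observable class: three paths plus the triangle.
    print(count_models("ABC", 3))  # 4
    # Two hidden states in class A, one in class B.
    print(count_models("AAB", 3))  # 3
```

Even in these toy cases the count depends on how the states are colored, which suggests why the number of candidates grows so quickly once up to 9 states in 3 classes are allowed.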