Computer models provide the basis for many analytical systems, especially in the area of artificial intelligence. Previous work in this area has explored the fusion of multiple sources of information to reason about higher-level abstractions relating to human context in computer systems operations. Recent work on probabilistic models employed for reasoning about a user's location, intentions, and focus of attention has highlighted opportunities for building new kinds of applications and services. A portion of such work leverages perceptual information to recognize human activities and is centered on identifying a specific type of activity in a particular scenario. Many of these techniques are targeted at recognizing single, simple events (e.g., ‘waving a hand’ or ‘sitting on a chair’). However, less effort has been applied to methods for identifying more complex patterns of human behavior that extend over longer periods of time.
Some methods employ dynamic models of the periodic patterns associated with people's movements to capture the periodicity of activities such as walking. Other approaches to human activity recognition employ graphical models. A significant portion of work in this area has made use of Hidden Markov Models (HMMs). As an example, HMMs have been utilized for recognizing hand movements that relay symbols in American Sign Language, wherein different signs can be recognized by computing, for the model associated with each symbol, the probability that the model would have produced an observed visual sequence. Extended models, such as Parameterized-HMM (PHMM), Entropic-HMM, Variable-length HMM (VHMM), and Coupled-HMM (CHMM), have been utilized to recognize more complex activities such as the interaction between two people. One method proposes a stochastic context-free grammar to compute the probability of a temporally consistent sequence of primitive actions recognized by HMMs.
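As a minimal sketch of the per-symbol recognition scheme described above, the following Python code scores an observation sequence under each of several HMMs with the forward algorithm and selects the model with the highest likelihood. The two models ("wave" and "hold"), their parameters, and the use of two discrete observation symbols in place of real visual features are illustrative assumptions and are not taken from any of the cited methods.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | model) for a discrete HMM.

    start[i]    : probability of beginning in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability that state i emits symbol k
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one time step, then weight by the emission probability.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow; the log of each scale factor accumulates.
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob

def classify(obs, models):
    """Return the name of the model most likely to have produced obs."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Hypothetical toy models: (start, trans, emit) over two states and two symbols.
MODELS = {
    # Tends to alternate between symbol 0 and symbol 1.
    "wave": ([1.0, 0.0],
             [[0.1, 0.9], [0.9, 0.1]],
             [[0.9, 0.1], [0.1, 0.9]]),
    # Tends to emit symbol 0 regardless of state.
    "hold": ([0.5, 0.5],
             [[0.9, 0.1], [0.1, 0.9]],
             [[0.9, 0.1], [0.9, 0.1]]),
}

if __name__ == "__main__":
    print(classify([0, 1, 0, 1], MODELS))
    print(classify([0, 0, 0, 0], MODELS))
```

In a practical recognizer the parameters of each model would be learned from labeled example sequences rather than written by hand, and the observations would typically be continuous feature vectors extracted from video.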
Other methods model events and scenes from audiovisual information. For example, one such method employs a wearable computer system that utilizes a plurality of HMMs for recognizing the user's location (e.g., in the office, at the bank, and so forth). Still other techniques propose an entropic-HMM approach to organize observed video activities (e.g., office activity and outdoor traffic) into meaningful states, wherein the models can be adapted for video monitoring of such activities. In another approach, a probabilistic finite-state automaton (a variation of structured HMMs) is employed for recognizing different scenarios, such as monitoring pedestrians or cars on a freeway. Although standard HMMs appear to be robust to changes with respect to temporal segmentation of observations, they tend to suffer from a lack of structure, an excess of parameters, and an associated over-fitting of data when applied to reason about long and complex temporal sequences with inadequate training data. In recent years, more complex Bayesian networks have been adopted for modeling and recognition of human activities. To date, however, there has been little advancement in methods that exploit statistical processes to fuse multiple sensory streams while addressing modeling problems robustly and reducing the effort required to train the models.