The present invention relates generally to the field of signal processing systems, and more particularly to image signal processing and other types of signal processing techniques involving the use of hidden Markov models (HMMs).
A variety of well-known signal processing techniques make use of HMMs. For example, HMMs have been used in the analysis of image sequences to recognize facial expressions. Facial expressions are complex, spatio-temporal motion patterns. The movements associated with a given facial expression are generally divided into three periods: (i) onset, (ii) apex, and (iii) offset. These periods correspond to the transition towards the facial expression, the period sustaining the peak in expressiveness, and the transition back from the expression, respectively. The rate of change in the onset period as well as the duration in the apex period are often related to the intensity of the underlying emotion associated with the facial expression.
HMMs have been specifically designed in this context to take advantage of the spatio-temporal character of facial expression patterns. Examples of facial expression recognition techniques based on HMMs are described in T. Otsuka et al., "Recognizing Abruptly Changing Facial Expressions From Time-Sequential Face Images," International Conference on Computer Vision and Pattern Recognition (CVPR), 1998; and T. Otsuka et al., "Recognizing Multiple Persons' Facial Expression Using HMM Based on Automatic Extraction of Significant Frames from Image Sequences," International Conference on Image Processing (ICIP), pp. 546-549, 1997.
The HMMs used in the above-described facial expression context as well as in other signal processing applications may be sequential HMMs. Sequential HMMs, which are also known as left-to-right HMMs, are typically used to model sequential data for pattern recognition, analysis, etc. The sequential data generally represent linear trajectories in multi-dimensional spaces. For example, the sequential data may represent the path followed by a set of facial feature points in an abstract multi-dimensional space of facial expressions, when a face is observed over time for facial expression recognition.
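By way of illustration only, the left-to-right structure described above can be sketched as a transition matrix in which each state may either persist or advance to the next state. The function name `left_to_right_transitions` and the single `stay_prob` parameter are hypothetical simplifications introduced for this sketch; a trained model would generally have a different self-transition probability for each state.

```python
def left_to_right_transitions(n_states, stay_prob=0.5):
    """Build an n_states x n_states transition matrix for a left-to-right HMM.

    Assumption for this sketch: every non-final state stays put with
    probability stay_prob and otherwise moves to the next state only.
    """
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    if not 0.0 <= stay_prob <= 1.0:
        raise ValueError("stay_prob must lie in [0, 1]")

    matrix = []
    for i in range(n_states):
        row = [0.0] * n_states
        if i == n_states - 1:
            row[i] = 1.0                    # final state absorbs
        else:
            row[i] = stay_prob              # remain in the current state
            row[i + 1] = 1.0 - stay_prob    # advance to the next state
        matrix.append(row)
    return matrix
```

Every entry below the diagonal is zero, so the model can never move back to an earlier state, which is what makes such a model suited to trajectories that progress in one direction.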
A significant problem in the conventional use of sequential HMMs and other types of HMMs in signal processing applications is the determination of an appropriate number of states for the HMM. In general, the number of states in the HMM must be specified before the HMM is used to process actual data. Unfortunately, existing techniques for determining the number of states are deficient in that they are generally unable to provide a model which best matches a given set of training data.
The invention provides methods and apparatus for determining an appropriate number of states in a hidden Markov model (HMM) using an iterative algorithm which adjusts the number of states based on an inter-state closeness measure. The resulting HMM is utilized to process data in a signal processing system and an action in the system is taken based on a result of the processing operation.
In accordance with one aspect of the invention, a signal processing system processes a signal using an HMM having a number of states determined at least in part based on application of an iterative algorithm to the model. The iterative algorithm adjusts the number of states of the HMM starting from an initial or default number of states, based at least in part on closeness measures computed between the states, until the HMM satisfies a specified performance criterion. For example, the iterative algorithm may iteratively increase or decrease the number of states until an average separation between the states is within a predefined range. The HMM having the adjusted number of states is then utilized to determine a characteristic of the signal, and an action of the signal processing system is controlled based on the determined characteristic.
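The iterative adjustment described above can be sketched, purely for illustration, as a loop that grows or shrinks the state count until the average separation between states falls within a predefined range. Everything concrete in this sketch is an assumption: the data are one-dimensional, `fit_state_means` is a stand-in for actual HMM training (it simply averages contiguous segments of the sequence), plain distance between adjacent state means substitutes for the closeness measure, and the names and the `low`, `high` and `max_iter` parameters are hypothetical.

```python
from statistics import mean


def fit_state_means(sequence, n_states):
    """Stand-in for HMM training (an assumption of this sketch): split the
    sequence into n_states contiguous segments and return each segment mean
    as that state's most likely point. Requires n_states <= len(sequence)."""
    n = len(sequence)
    bounds = [k * n // n_states for k in range(n_states + 1)]
    return [mean(sequence[bounds[k]:bounds[k + 1]]) for k in range(n_states)]


def average_separation(means):
    """Average distance between consecutive state means (needs >= 2 states)."""
    return mean(abs(b - a) for a, b in zip(means, means[1:]))


def adjust_state_count(sequence, initial_states=3, low=0.5, high=1.5,
                       max_iter=50):
    """Increase or decrease the state count, starting from initial_states
    (at least 2), until the average separation lies within [low, high]."""
    n_states = initial_states
    sep = None
    for _ in range(max_iter):
        sep = average_separation(fit_state_means(sequence, n_states))
        if sep > high and n_states < len(sequence):
            n_states += 1      # states too far apart: add a state
        elif sep < low and n_states > 2:
            n_states -= 1      # states too close together: remove a state
        else:
            break              # separation within range, or no move possible
    return n_states, sep
```

For a linear ramp such as `[i / 10 for i in range(101)]`, the loop starts at three states, finds them too widely separated, and adds states until the average separation drops into the target range. The `max_iter` bound is a safeguard added for the sketch, since a change in one direction could otherwise alternate with a change in the other.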
In an illustrative embodiment, the signal to be processed using the HMM having an adjusted number of states is a sequence of images, and the HMM is used to determine an intensity of a particular facial expression likely to be present in the sequence of images.
In accordance with another aspect of the invention, the inter-state closeness measure is in the form of a mutual entropy computed along a line that passes through a pair of points, each of the points representing a most likely point in a feature space associated with a corresponding state of the HMM.
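The passage above does not give a formula for the mutual entropy, so the following is only a sketch of one way a closeness measure might be evaluated along the line through two states' most likely points. It assumes, without support from the text, that each state is a Gaussian with diagonal covariance, and it substitutes a symmetric Kullback-Leibler divergence between the two one-dimensional projections for the mutual entropy itself. All function names are hypothetical.

```python
import math


def project_onto_line(mean, variances, origin, unit):
    """Project a diagonal-covariance Gaussian onto the line through origin
    with unit direction vector unit; returns (position, variance)."""
    pos = sum((m - o) * u for m, o, u in zip(mean, origin, unit))
    var = sum(u * u * v for u, v in zip(unit, variances))
    return pos, var


def kl_1d(m1, v1, m2, v2):
    """KL divergence from N(m1, v1) to N(m2, v2), with variances v1, v2 > 0."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)


def line_divergence(mean_a, var_a, mean_b, var_b):
    """Symmetric KL divergence between two states, computed along the line
    joining their means. Larger values indicate states that are farther
    apart. This is a stand-in, not the mutual entropy of the text."""
    diff = [b - a for a, b in zip(mean_a, mean_b)]
    length = math.sqrt(sum(d * d for d in diff))
    if length == 0.0:
        raise ValueError("the two points coincide, so no line is defined")
    unit = [d / length for d in diff]
    ma, va = project_onto_line(mean_a, var_a, mean_a, unit)
    mb, vb = project_onto_line(mean_b, var_b, mean_a, unit)
    return kl_1d(ma, va, mb, vb) + kl_1d(mb, vb, ma, va)
```

Restricting the computation to the connecting line reduces a comparison in a multi-dimensional feature space to a comparison of two one-dimensional distributions, which keeps the cost of each pairwise test low.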
In accordance with a further aspect of the invention, a first iterative algorithm is used to adjust the number of states of the HMM if an expected number of states of the HMM is above a specified number, and a second iterative algorithm is used to adjust the number of states of the HMM if the expected number of states of the HMM is at or below the specified number. The specified number of states may be on the order of ten states. The first iterative algorithm may be configured to perform local closeness tests and to allow multiple states to be added to or deleted from the model on each iteration. The second iterative algorithm may be configured to perform a global closeness test and to allow only one state to be added to or deleted from the model on each iteration.
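The difference between the two algorithms can be illustrated with a single iteration of each. The sketch below is an assumption-laden simplification: states are represented by ordered one-dimensional points, the gap between neighbouring points stands in for the closeness measure, and the function names, the merge and split rules, and the default `threshold` of ten are hypothetical choices made for illustration.

```python
def local_step(points, low, high):
    """One iteration with local tests: each neighbouring pair is examined on
    its own, so several states may be added and deleted in a single pass."""
    out = [points[0]]
    for p in points[1:]:
        gap = p - out[-1]
        if gap < low:
            out[-1] = (out[-1] + p) / 2      # too close: merge the pair
        elif gap > high:
            out.append((out[-1] + p) / 2)    # too far: insert a midpoint
            out.append(p)
        else:
            out.append(p)
    return out


def global_step(points, low, high):
    """One iteration with a global test: the average gap decides whether
    exactly one state is added or deleted (expects at least two points)."""
    out = list(points)
    gaps = [b - a for a, b in zip(out, out[1:])]
    avg = sum(gaps) / len(gaps)
    if avg > high:
        i = gaps.index(max(gaps))
        out.insert(i + 1, (out[i] + out[i + 1]) / 2)  # split the widest gap
    elif avg < low and len(out) > 2:
        i = gaps.index(min(gaps))
        del out[i + 1]                                # close the narrowest gap
    return out


def adjust_once(points, expected_states, low, high, threshold=10):
    """Use the local algorithm above the threshold, the global one otherwise."""
    step = local_step if expected_states > threshold else global_step
    return step(sorted(points), low, high)
```

With `low=0.5` and `high=1.5`, `local_step([0.0, 0.1, 1.0, 4.0], 0.5, 1.5)` merges the first pair and splits the last gap in the same pass, whereas `global_step([0.0, 1.0, 5.0], 0.5, 1.5)` changes the state count by exactly one. Allowing multiple changes per pass lets a large model converge in fewer iterations, while the single-change rule gives finer control when the model is small.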
Advantageously, the invention allows the determination of an appropriate number of states in an HMM based on a set of training data, such that the resulting model performs with substantially greater accuracy than a model generated by assignment of a fixed number of states. The techniques of the invention can be used in a wide variety of signal processing applications, including video-camera-based systems such as video conferencing systems and video surveillance and monitoring systems, speech recognition systems, and human-machine interfaces.