The present invention relates generally to machine-human interfaces and, in particular, to motion detection.
Interest in the recognition of human motion and action using computer vision has been increasing, with much emphasis on real-time computability. In particular, tracking/surveillance systems, human-computer interfaces, and entertainment domains have a heightened interest in understanding and recognizing human movements. For example, monitoring applications may provide a signal only when a person is seen moving in a particular area (perhaps within a dangerous or secure area). Interface systems that understand gestures as a means of input or control may be desired, and entertainment applications may want to analyze the actions of a person to improve the immersion or reactivity of the experience.
In prior work, a real-time computer vision representation of human movement known as a Motion History Image (MHI) was presented. The MHI is a compact template representation of movement originally based on the layering of successive image motions. The recognition method presented for these motion templates used a global statistical moment feature vector constructed from image intensities, resulting in a token-based (label-based) matching scheme. Though this recognition method showed promising results using a large database of human movements, no method has yet been proposed to compute the raw motion information directly from the template without having to label the entire motion pattern. Raw motion information may be favored in situations where a precisely labeled action is not possible or not required. For example, a system may be designed to respond to leftward motion, but may not care whether the motion was generated by a person, a hand, or a car.
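As an illustration of the layering idea described above, the following is a minimal sketch of one common timestamp-based way to maintain an MHI. It is not taken from the prior work itself: the function name `update_mhi`, the `duration` parameter, and the convention that a value of 0 means "no recent motion" are assumptions made for this example.

```python
import numpy as np

def update_mhi(mhi, silhouette, timestamp, duration):
    """Layer one motion silhouette onto a motion history image.

    mhi        -- float array; each pixel holds the time motion was last seen
                  there, or 0 for no recent motion (assumed convention)
    silhouette -- boolean array, True where motion is present in this frame
    timestamp  -- time of the current frame (assumed to be > 0)
    duration   -- how long a motion trace is kept before it is cleared
    """
    out = mhi.copy()
    # Pixels moving now are stamped with the current time.
    out[silhouette] = timestamp
    # Pixels that are not moving and whose stamp is too old are cleared.
    stale = (~silhouette) & (out < timestamp - duration)
    out[stale] = 0.0
    return out

def demo_moving_blob():
    """A one-pixel blob moves right across a 1x5 image for four frames."""
    mhi = np.zeros((1, 5), dtype=float)
    for t in (1, 2, 3, 4):
        silhouette = np.zeros((1, 5), dtype=bool)
        silhouette[0, t - 1] = True
        mhi = update_mhi(mhi, silhouette, timestamp=float(t), duration=2.0)
    return mhi
```

In this sketch, newer motion carries larger values than older motion, so the template encodes both where and when movement occurred; the oldest trace drops out once it falls outside the assumed `duration` window.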
Further, detecting multiple motions in a motion history image can be difficult. That is, multiple motions in different directions can mask the individual motions.
For the reasons stated above, and for other reasons stated below which will become apparent to those skilled in the art upon reading and understanding the present specification, there is a need in the art for a system and method to detect multiple motions.
The above-mentioned problems with detecting multiple motions, and other problems, are addressed by the present invention and will be understood by reading and studying the following specification.
In one embodiment, a method of detecting motion comprises obtaining a plurality of images of an object over a predetermined period of time, generating a motion region image of the object from the plurality of images, tracking regions of the motion region image from a current region to an older region, and identifying movement based on the tracking regions of the motion region image.
In another embodiment, a method of detecting motion comprises obtaining a plurality of images of an object using a camera. The plurality of images are obtained over a predetermined sliding window of time. A motion history image of the object is generated from the plurality of images by isolating (segmenting) the object in the plurality of images from the background. The object may be segmented from the background in many ways, including by differencing against a known background scene, by frame-by-frame differencing to catch object motion, or by detecting the color and/or texture of a known object against a background. Down fill operations are performed on the motion history image, wherein each down fill operation tracks regions of the motion history image from an area of the current region to an older region and labels the contents of each down fill region with a unique identifier. One or more up fill operations may be performed within each down fill region of the motion history image, wherein each up fill operation proceeds from an older region to the current region and labels the contents of each up fill region with a unique identifier.
Movement is identified based on the down fill operation and possibly one or more up fill operations.
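The down fill operation described above can be sketched as a flood fill over a timestamp-valued motion history image. This is only one possible reading under stated assumptions, not the claimed implementation: the names `down_fill` and `region_direction`, the `max_step` threshold on the allowed time difference between neighbouring pixels, the use of 4-connectivity, and the use of 0 for "no motion" are all hypothetical choices made for illustration.

```python
from collections import deque
import numpy as np

def down_fill(mhi, current, max_step):
    """Label regions by flooding from current pixels toward older ones.

    A fill starts at each unlabeled pixel whose value equals `current` and
    spreads to 4-connected neighbours that are non-zero and no more than
    `max_step` older than the pixel they are reached from (assumed rule).
    Each fill receives its own integer identifier; 0 means unlabeled.
    """
    labels = np.zeros(mhi.shape, dtype=np.int32)
    rows, cols = mhi.shape
    next_label = 1
    for r0, c0 in zip(*np.nonzero(mhi == current)):
        if labels[r0, c0]:
            continue  # already reached by an earlier fill
        labels[r0, c0] = next_label
        queue = deque([(r0, c0)])
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if labels[nr, nc] or mhi[nr, nc] == 0:
                    continue
                step = mhi[r, c] - mhi[nr, nc]
                if 0 <= step <= max_step:  # same age or slightly older
                    labels[nr, nc] = next_label
                    queue.append((nr, nc))
        next_label += 1
    return labels

def region_direction(mhi, labels, label):
    """Return (d_row, d_col) from the oldest to the newest pixels of a region."""
    mask = labels == label
    values = mhi[mask]
    rows, cols = np.nonzero(mask)  # same row-major order as mhi[mask]
    newest = values == values.max()
    oldest = values == values.min()
    d_row = rows[newest].mean() - rows[oldest].mean()
    d_col = cols[newest].mean() - cols[oldest].mean()
    return float(d_row), float(d_col)

def demo_two_motions():
    """Two traces in one image: one moving right, one moving left."""
    mhi = np.array([[1, 2, 3, 0, 3, 2, 1]], dtype=float)
    labels = down_fill(mhi, current=3.0, max_step=1.0)
    return mhi, labels
```

In the demonstration image, the two traces receive different identifiers, so each can be examined separately even though they move in opposite directions. The `region_direction` helper is one simple, assumed way to turn a labeled region into a movement estimate by comparing where its oldest and newest pixels lie; an up fill within each region is not shown here.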
In another embodiment, a computer readable medium comprises instructions to instruct a processor to perform the method of: obtaining a plurality of images of an object over a predetermined period of time; generating a motion region image of the object from the plurality of images; tracking regions of the motion region image from a current region to an older region; and identifying movement based on the tracking regions of the motion region image.