1. Field of the Invention
The present disclosure relates to video analytics and, more specifically, to a system, computer program product, and methodology for detecting motion in videos using spatio-temporal slice processing that makes the detection illumination-invariant.
2. Description of the Related Art
Video analytics extends beyond encoding and decoding video for desktop computer systems. Video analytics comprises techniques for analyzing content in still and video images, and is used to do more than determine a proper frame rate, resolution, and the like. In video analytics, images and video are analyzed for content, such as the motion of items in one or more images. Often, uninteresting portions of a video are discarded, while other portions are analyzed in detail to extract relevant information, such as motion.
Specifically, motion detection involves identifying spatial regions of images in a video sequence that are moving in image space. This can include, for example, swaying trees and moving shadows, but not lighting changes or image noise. The problem of motion detection is made more complex by situations where a region of a video, stationary in all preceding frames, suddenly moves from one frame to the next. The inverse situation is also possible, i.e., a region in motion for many frames may stop abruptly.
Many currently available motion detection methods and systems detect change rather than motion. These methods merely search for a difference in a region of an image between frames, or between a current frame and a series of preceding frames. For example, in the background subtraction method, a background or reference image is constructed from a set of previous frames in a video using, for example, an Infinite Impulse Response (IIR) filter. Detection is then performed by applying a threshold to the absolute difference between a current image and the background or reference image. The thresholding may be a simple binary function whose result is “1” if the difference surpasses the threshold, and “0” otherwise.
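By way of illustration only, the background subtraction approach described above may be sketched as follows. The sketch assumes grayscale frames stored as nested lists and a first-order IIR (exponential averaging) filter. The function names `update_background` and `detect_change`, the blending factor `alpha`, and the threshold value are hypothetical choices made for this example and are not taken from any particular prior art system.

```python
def update_background(background, frame, alpha=0.05):
    """First-order IIR update of the reference image: B <- (1 - alpha) * B + alpha * I.

    alpha is an assumed blending factor; smaller values adapt more slowly.
    """
    return [
        [(1.0 - alpha) * b + alpha * p for b, p in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def detect_change(background, frame, threshold=25.0):
    """Binary mask: 1 where |I - B| surpasses the (assumed) threshold, 0 otherwise."""
    return [
        [1 if abs(p - b) > threshold else 0 for b, p in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# Tiny 2x3 grayscale example: one pixel differs strongly from the reference image.
background = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
frame = [[10, 12, 200], [9, 10, 11]]

mask = detect_change(background, frame)            # threshold first, against the old reference
background = update_background(background, frame)  # then fold the frame into the reference
```

In this sketch the mask marks only the pixel whose value departs from the reference by more than the threshold. Because the decision depends solely on a per-pixel intensity difference, the mask records that a pixel changed, not that anything moved.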
Other methods attempt to model the background using, for example, Gaussian distributions having a mean and a variance value. In the Gaussian Mixture Model, each pixel is represented by several Gaussian distributions, and each Gaussian distribution is weighted according to its variance and how often it is observed. A pixel value that does not fit within the background model, or that fits only a distribution whose weight has fallen below a certain threshold, is considered a foreground pixel likely to contain motion information.
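By way of illustration only, a simplified per-pixel mixture model of the kind described above may be sketched as follows. This is a reduced variant written for this example and not a faithful reproduction of any published algorithm: a value matches a distribution if it lies within an assumed number of standard deviations of the mean, and a distribution counts as background only if its weight meets an assumed minimum. The function name `classify_and_update` and all parameter values are hypothetical.

```python
import math


def classify_and_update(components, x, lr=0.05, match_sigmas=2.5,
                        bg_weight=0.3, init_var=225.0, init_weight=0.05):
    """Classify intensity x for one pixel, then update that pixel's mixture in place.

    components: list of dicts with keys 'w' (weight), 'mean', and 'var'.
    Returns True if x is treated as foreground. All defaults are assumptions.
    """
    # Find the first distribution that x fits (within match_sigmas standard deviations).
    matched = None
    for c in components:
        if abs(x - c["mean"]) <= match_sigmas * math.sqrt(c["var"]):
            matched = c
            break

    # Foreground: no fit at all, or a fit only to a low-weight distribution.
    foreground = matched is None or matched["w"] < bg_weight

    if matched is None:
        # Replace the least-observed distribution with a new, wide one centred on x.
        weakest = min(components, key=lambda c: c["w"])
        weakest.update(w=init_weight, mean=float(x), var=init_var)
    else:
        # Pull the matched distribution's mean and variance toward the observation.
        diff = x - matched["mean"]
        matched["mean"] += lr * diff
        matched["var"] += lr * (diff * diff - matched["var"])

    # Decay all weights, reinforce the matched one, then renormalise to sum to 1.
    for c in components:
        c["w"] = (1.0 - lr) * c["w"] + (lr if c is matched else 0.0)
    total = sum(c["w"] for c in components)
    for c in components:
        c["w"] /= total
    return foreground


# One pixel with a dominant background mode near 100 and a weak mode near 180.
components = [
    {"w": 0.8, "mean": 100.0, "var": 16.0},
    {"w": 0.2, "mean": 180.0, "var": 16.0},
]
results = [classify_and_update(components, x) for x in (101, 250, 252)]
```

In this sketch the value 101 fits the dominant mode and is treated as background, while 250 fits nothing and is treated as foreground. The value 252 fits the newly created distribution, but remains foreground until that distribution has been observed often enough for its weight to reach the background minimum.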
As recognized by the present inventor, methods that detect change and methods that attempt to model the background image are both very sensitive to changes in lighting. These methods simply cannot distinguish between “true” image changes and lighting effects while processing individual pixels.
Moreover, also as recognized by the present inventor, motion estimation methods such as optical flow, which determine velocities for image regions in motion and are usually based on comparing only two consecutive images, are very computationally demanding, inaccurate, or both. When more than two consecutive images are analyzed, the computational demand is considered too high for practical application.
Most currently available spatio-temporal filters analyze images individually and use a high-level model to handle the temporal aspects. The few spatio-temporal algorithms that process a video spatially and temporally at the same time are too computationally demanding for real-time embedded processing.