The segmentation of video sequences into different objects and/or regions is an important task in numerous applications, ranging from video processing, coding, retrieval, and indexing, to object tracking and detection, surveillance, scene analysis, and multimedia content editing and manipulation, among others. Depending on the application, the segmentation may be based on different criteria, such as color, texture, motion, or a combination thereof. In the case of motion-based segmentation, the goal is to find regions that are characterized by coherent motion. Doing so presents a challenge because the two problems are interdependent: accurate estimation of motion in different regions requires a good segmentation, and a good segmentation cannot be obtained without accurate motion estimates.
A promising motion-based segmentation technique that has received significant attention formulates the problem as an energy minimization within a maximum a posteriori Markov random field (“MAP-MRF”) framework. Each pixel is assigned a label from a set of classes, and a motion cost function is computed and optimized to segment a given frame according to the pixels' motion. Care must be taken to avoid misalignment between motion boundaries and actual object boundaries. For example, pixels in a flat region may appear stationary even if they are moving. In addition, erroneous labels may be assigned to pixels in covered or uncovered regions due to occlusion. As with any motion-based segmentation, the success of the MAP-MRF framework is closely tied to the accuracy of the estimated motion.
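By way of illustration only, the kind of energy minimization described above may be sketched as follows. The sketch is not taken from any particular implementation and rests on several assumptions: a squared-difference data cost between each pixel's estimated motion vector and a candidate motion model, a Potts-style smoothness penalty weighted by a hypothetical parameter `lam`, and iterated conditional modes (ICM) as a simple stand-in for whatever optimizer a given system might use. The function names and the toy motion field are likewise hypothetical, and occlusion handling is omitted.

```python
# Minimal MAP-MRF labeling sketch (assumed cost terms, ICM optimizer).

def data_cost(v, m):
    """Squared distance between a pixel's motion v and a motion model m."""
    return (v[0] - m[0]) ** 2 + (v[1] - m[1]) ** 2

def energy(labels, motion, models, lam):
    """Total energy: data term plus Potts penalty on 4-connected pairs."""
    h, w = len(labels), len(labels[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            total += data_cost(motion[y][x], models[labels[y][x]])
            # Count each neighboring pair once (right and down).
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                total += lam
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                total += lam
    return total

def local_cost(labels, motion, models, lam, y, x, k):
    """Energy contribution of giving pixel (y, x) the label k."""
    h, w = len(labels), len(labels[0])
    cost = data_cost(motion[y][x], models[k])
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != k:
            cost += lam
    return cost

def data_only_labels(motion, models):
    """Initial labeling: best-fitting motion model per pixel, no smoothing."""
    ks = range(len(models))
    return [[min(ks, key=lambda k: data_cost(v, models[k])) for v in row]
            for row in motion]

def icm(motion, models, lam=0.5, max_iters=10):
    """Greedy per-pixel label updates until no label changes."""
    labels = data_only_labels(motion, models)
    h, w = len(labels), len(labels[0])
    ks = range(len(models))
    for _ in range(max_iters):
        changed = False
        for y in range(h):
            for x in range(w):
                best = min(ks, key=lambda k: local_cost(
                    labels, motion, models, lam, y, x, k))
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        if not changed:
            break
    return labels

# Toy 4x4 motion field: left half static, right half moving right.
models = [(0.0, 0.0), (2.0, 0.0)]
motion = [[(0.0, 0.0), (0.0, 0.0), (2.0, 0.0), (2.0, 0.0)] for _ in range(4)]
motion[1][0] = (1.2, 0.0)  # one noisy motion estimate in the static region

labels = icm(motion, models, lam=0.5)
for row in labels:
    print(row)
```

In this toy field, the noisy estimate at row 1, column 0 fits the moving model better on the data term alone, but the smoothness term outweighs that preference and the pixel is assigned to the static region along with its neighbors. The example also shows the dependence noted above: if many motion estimates were inaccurate, the data term would dominate and the resulting labels would follow the errors rather than the true object boundaries.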