Many video processing applications require segmentation of video objects, that is, the differentiation of legitimately moving objects from the static background scene depicted in a video sequence. Such applications include, for example, video mosaic building, object-based video compression, object-based video editing, and automated video surveillance. Many video object segmentation algorithms use video scene background models (or simply background models) as an aid. The general idea is that each frame of a video sequence can be registered to the background model and compared, pixel by pixel, to the model. Pixels that display sufficient difference are considered foreground, or moving, pixels. There are many variations on this theme, which account for a wide range of phenomena such as:
        Unstable backgrounds, such as rippling water, blowing leaves, etc.
        Lighting phenomena, such as clouds moving across the sun, shadows, etc.
        Camera phenomena, such as AGC, auto iris, auto focus, etc.
Using this technique (or a variation of it), it is usually possible to detect objects, or parts of objects, that exhibit independent motion. There are two basic problems that arise when objects in the scene are stationary for a long period of time (to the point where they might be considered background changes), as demonstrated in FIG. 1:
        If an object remains stationary for a long period of time, it could be “permanently” detected as a foreground object when, for all practical purposes, it has become part of the background.
        If an object, initially stationary, is part of the background model (gets “burned in”) and then moves, it will expose a region of static background that has not been modeled and will thus be erroneously detected as foreground.
Either of these phenomena can degrade the performance of video object segmentation for any application.
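The pixel-by-pixel comparison described above can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested lists, a frame already registered to the background model, and a fixed difference threshold. The function name `segment_foreground`, the threshold of 30, and the intensity values are hypothetical choices for illustration, not details from any particular algorithm.

```python
def segment_foreground(frame, background, threshold=30):
    """Return a binary mask: 1 where a pixel differs from the model by more
    than `threshold` (an assumed, illustrative value), otherwise 0."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Hypothetical background model: a uniform gray scene, 4 rows by 6 columns.
background = [[100] * 6 for _ in range(4)]

# Current frame: the same scene plus a bright 2x2 object at rows 1-2, columns 2-3.
frame = [row[:] for row in background]
for r in (1, 2):
    for c in (2, 3):
        frame[r][c] = 200

mask = segment_foreground(frame, background)
for row in mask:
    print("".join(str(v) for v in row))
```

Only the four object pixels exceed the threshold, so only they are marked as foreground. Practical variations replace the fixed threshold and the single-value model with statistics that tolerate the unstable-background, lighting, and camera phenomena listed above.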
As discussed, for example, in U.S. patent application Ser. Nos. 09/472,162 and 09/609,919 (currently pending, filed, respectively, on Dec. 27, 1999 and Jul. 3, 2000, commonly assigned, and incorporated herein by reference in their entireties), when building photo mosaics, video mosaics, or video scene models, it is often desirable to extract those portions of the source images that represent “true” background. For example, a parked car in a video clip (or any other collection of images) that remains parked for the duration of the clip may be considered true background. But a car in a video clip that is initially parked and later drives away at some point in the clip must be considered “not background.”
If care is not taken to identify true background regions, artifacts will result. If the goal is to produce a mosaic or background image, foreground objects can be “burned in,” resulting in unnatural-looking imagery. If the goal is to build a scene model as a basis for video segmentation, the results can be poor segmentations, where parts of foreground objects are not detected, whereas some exposed background regions are detected as foreground. FIG. 2 shows an example of the results of allowing foreground components to corrupt the scene model.
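Both failure modes can be reproduced on a single scanline. The sketch below assumes a naive scene model that averages every training frame without checking for true background. The helpers `average_model`, `difference_mask`, and `scanline`, together with the intensity values and threshold, are hypothetical and chosen only to make the effect visible.

```python
ROAD, CAR = 100, 200  # assumed grayscale intensities for illustration


def scanline(car_start, car_len=3, width=8):
    """Build one row of pixels with a car occupying car_len pixels."""
    line = [ROAD] * width
    for i in range(car_start, car_start + car_len):
        line[i] = CAR
    return line


def average_model(frames):
    """Naive scene model: per-pixel mean over all frames, with no attempt
    to exclude foreground objects."""
    n = len(frames)
    return [sum(pixels) / n for pixels in zip(*frames)]


def difference_mask(frame, model, threshold=30):
    """Mark pixels whose difference from the model exceeds the threshold."""
    return [1 if abs(p - m) > threshold else 0 for p, m in zip(frame, model)]


# The car is parked at pixels 2-4 while the model is built, so it is burned in.
model = average_model([scanline(2) for _ in range(10)])

# Later the car moves two pixels to the right and occupies pixels 4-6.
later = scanline(4)
mask = difference_mask(later, model)
print(mask)
```

In the resulting mask, pixels 2 and 3 are exposed road that is flagged as foreground, while pixel 4, which is still covered by the car, goes undetected because the corrupted model already contains the car's intensity there.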