A wide variety of techniques are known for processing foreground information in images and video sequences. Such techniques can produce acceptable results when applied to high-resolution images, such as photographs or other two-dimensional (2D) images. However, many important machine vision applications utilize depth maps or other types of three-dimensional (3D) images generated by depth imagers such as structured light (SL) cameras or time of flight (ToF) cameras. Such images are more generally referred to herein as depth images, and may include low-resolution images having highly noisy and blurred edges.
Conventional foreground processing techniques generally do not perform well when applied to depth images. For example, these conventional techniques often fail to differentiate with sufficient accuracy between static foreground objects and one or more moving objects of interest within a given depth image. This can unduly complicate subsequent image processing operations such as feature extraction, gesture recognition, automatic tracking of objects of interest, and many others.