A video is a sequence of images, often referred to as frames. The terms ‘frame’ and ‘image’ are used interchangeably throughout this specification to describe a single image in an image sequence, or a single frame of a video. An image is made up of visual elements. Visual elements may be, for example, pixels or blocks of wavelet coefficients. Visual elements may also be expressed in the frequency domain, for example as 8×8 DCT (Discrete Cosine Transform) coefficient blocks, as used in JPEG images, or as 4×4 or 8×8 DCT-based integer-transform blocks, as used in H.264/AVC coding.
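As an illustration of frequency-domain visual elements, the following minimal sketch splits a small image into 8×8 blocks and computes an orthonormal two-dimensional DCT-II of each block. The function names `dct_2d` and `split_into_blocks` and the test image are assumptions made for this example only; the sketch uses the textbook transform definition rather than the optimised integer transform of any particular codec.

```python
import math

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        # Normalisation factor that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs

def split_into_blocks(image, size=8):
    """Yield (row, col, block) for each full size x size block of the image."""
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            yield r, c, [row[c:c + size] for row in image[r:r + size]]

# Hypothetical 8x16 test image: two 8x8 blocks of constant intensity.
image = [[100] * 8 + [50] * 8 for _ in range(8)]
blocks = [(r, c, dct_2d(b)) for r, c, b in split_into_blocks(image)]
```

For a block of constant intensity, all of the energy falls in the single DC coefficient at position (0, 0) and the remaining coefficients are zero up to rounding error, which is why a block of coefficients can serve as a compact visual element.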
Image processing at the pixel level is often slow for images of a megapixel or more. By grouping connected pixels that share common properties, such as intensity, colour or texture, into a superpixel, the image can advantageously be analysed at a coarser level of granularity and consequently at a faster processing speed (or, alternatively, with fewer processing resources). Partitioning an image into such groups of pixels is known as superpixel segmentation. Superpixel segmentation is a pixel-based image segmentation method that starts from a selection of initial seed points. The method examines the pixels neighbouring each seed point and determines whether each neighbour should be added to the region grown from that seed. The process is iterated until the entire image has been segmented.
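The seed-based growing process described above can be sketched as follows. This is a simplified illustration and not a specific superpixel algorithm: the function name `grow_regions`, the use of intensity as the only shared property, the 4-connected neighbourhood and the fixed `tolerance` are all assumptions made for the example.

```python
from collections import deque

def grow_regions(image, seeds, tolerance):
    """Label each pixel with the index of the seed region it joins, or -1."""
    height, width = len(image), len(image[0])
    labels = [[-1] * width for _ in range(height)]
    queue = deque()
    for index, (r, c) in enumerate(seeds):
        labels[r][c] = index
        queue.append((r, c))

    while queue:
        r, c = queue.popleft()
        region = labels[r][c]
        seed_r, seed_c = seeds[region]
        seed_value = image[seed_r][seed_c]
        # Examine the 4-connected neighbours of the current pixel.
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width and labels[nr][nc] == -1:
                # Assumed similarity test: intensity close to the seed's.
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    labels[nr][nc] = region
                    queue.append((nr, nc))
    return labels

# Hypothetical 3x5 intensity image with a dark area and a bright area.
image = [
    [10, 12, 11, 90, 92],
    [11, 10, 13, 91, 95],
    [12, 11, 12, 93, 90],
]
labels = grow_regions(image, seeds=[(0, 0), (0, 4)], tolerance=10)
```

With one seed in each area, the fifteen pixels collapse into two labelled groups, so later analysis can operate on two elements instead of fifteen. Pixels that match no seed keep the label -1 in this sketch.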
Change detection in video is an important low-level task for many computer vision applications. Pixels that have changed are classified as ForeGround (FG), and pixels that have not changed are classified as BackGround (BG). By aggregating information from previous frames of a sequence, a background model of the captured scene can be constructed. Each incoming frame is then compared against this background model to detect change.
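A minimal sketch of this idea, assuming a per-pixel running average as the background model and a fixed difference threshold for classification, is given below. The function names, the `learning_rate` and the `threshold` values are illustrative assumptions; practical background models are typically more elaborate.

```python
def update_background(background, frame, learning_rate):
    """Blend the new frame into the per-pixel running-average background."""
    return [
        [(1.0 - learning_rate) * b + learning_rate * f
         for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]

def detect_change(background, frame, threshold):
    """Return a mask: 1 = foreground (changed), 0 = background (unchanged)."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]

# Hypothetical data: three static frames are aggregated into the model.
history = [[[20, 20, 20], [20, 20, 20]] for _ in range(3)]
background = [[float(v) for v in row] for row in history[0]]
for frame in history[1:]:
    background = update_background(background, frame, learning_rate=0.1)

# The incoming frame contains one bright pixel and one slightly noisy pixel.
incoming = [[20, 200, 20], [20, 21, 20]]
mask = detect_change(background, incoming, threshold=30)
```

In this example only the bright pixel is classified as foreground, while the pixel that differs from the model by a single intensity level remains background.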
Scene modelling, also known as background modelling, involves modelling the visual content of a scene, based on an image sequence depicting the scene. A common use of scene modelling is foreground separation by background subtraction. Foreground separation is also described by its inverse, background separation. Examples of foreground separation applications include activity detection, unusual object or behaviour detection, and scene analysis.
There are pixel-based and block-based approaches to background modelling. Pixel-based approaches are very sensitive to non-stationary scenes or backgrounds. Block-based approaches are less sensitive to local movement and are therefore better able to deal with a non-stationary background. However, block-based approaches usually produce a Low-Resolution (LR) foreground detection. In some applications, such as green screening and object re-identification, it is important to have the foreground mask at pixel-level accuracy.
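The trade-off can be illustrated with a minimal block-based sketch in which each block is summarised by its mean intensity and a single foreground decision is made per block. The function names, the 2×2 block size and the threshold are assumptions chosen for the example, and image dimensions are assumed to be divisible by the block size.

```python
def block_means(image, size):
    """Mean intensity of each size x size block of the image."""
    means = []
    for r in range(0, len(image), size):
        row = []
        for c in range(0, len(image[0]), size):
            values = [image[y][x]
                      for y in range(r, r + size)
                      for x in range(c, c + size)]
            row.append(sum(values) / len(values))
        means.append(row)
    return means

def block_foreground(background, frame, size, threshold):
    """One foreground decision per block: 1 if the block mean changed enough."""
    bg = block_means(background, size)
    fr = block_means(frame, size)
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(bg, fr)
    ]

# Hypothetical 4x4 scene with a flat background.
background = [[20] * 4 for _ in range(4)]
frame = [row[:] for row in background]
frame[0][0] = 21  # small local fluctuation in the top-left block
frame[2][2] = frame[2][3] = frame[3][2] = frame[3][3] = 200  # bright object
mask = block_foreground(background, frame, size=2, threshold=10)
```

The small fluctuation is averaged away within its block, but the resulting mask has only 2×2 entries for a 4×4 image, which shows why a block-based detection is low resolution.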