Video matting aims to extract a moving foreground object while ensuring good temporal and spatial consistency. As an important technical problem in the field of computer vision, video matting is widely used in fields such as hair modeling and defogging. In recent years, many matting methods have been proposed to extract high-quality foreground objects from complex videos and images.
Since sparse representation is widely used in fields such as face recognition, image classification, image restoration and video denoising, Jubin et al. proposed an image matting method based on sparse representation, which reconstructs an original image with foreground pixels of a whole video and estimates the opacity α (alpha) value of each pixel according to the sum of the coefficients corresponding to that pixel in a sparse representation coefficient matrix. The method can automatically select appropriate sample points to reconstruct the original image; however, it fails to guarantee that pixels possessing similar characteristics have similar α values, and therefore fails to guarantee the temporal and spatial consistency of the video alpha matte. Furthermore, since only foreground pixels are used as a dictionary, the representative ability is poor, leading to poor quality of the foreground object extracted by applying said method.
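The idea of estimating α from the sum of sparse coefficients over a foreground dictionary can be illustrated with a minimal sketch. This is not the solver used by Jubin et al.; it assumes normalized RGB colors in [0, 1], non-negative coefficients, an L1 penalty solved by projected iterative shrinkage, and clipping of the coefficient sum to 1. The names `sparse_code`, `estimate_alpha` and `lam` are hypothetical and chosen for illustration only.

```python
def sparse_code(pixel, dictionary, lam=0.01, iters=200):
    """Non-negative, L1-regularized coding of `pixel` over `dictionary`.

    Assumption: solved with projected iterative shrinkage (ISTA); the
    prior-art method may use a different solver.
    """
    # Step size from a safe upper bound on the Lipschitz constant
    # (squared Frobenius norm of the dictionary).
    lipschitz = sum(v * v for atom in dictionary for v in atom) or 1.0
    step = 1.0 / lipschitz
    coeffs = [0.0] * len(dictionary)
    for _ in range(iters):
        # Reconstruct the pixel as a weighted sum of dictionary atoms.
        recon = [
            sum(c * atom[k] for c, atom in zip(coeffs, dictionary))
            for k in range(len(pixel))
        ]
        residual = [r - p for r, p in zip(recon, pixel)]
        # Gradient step, L1 shrinkage, and projection onto c >= 0.
        coeffs = [
            max(0.0, c - step * (sum(a * r for a, r in zip(atom, residual)) + lam))
            for c, atom in zip(coeffs, dictionary)
        ]
    return coeffs


def estimate_alpha(pixel, fg_dictionary, lam=0.01):
    """Estimate opacity as the (clipped) sum of the sparse coefficients."""
    return min(1.0, sum(sparse_code(pixel, fg_dictionary, lam)))


# Hypothetical foreground dictionary: three pure colors sampled from the video.
fg_dictionary = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# A half-transparent red pixel composited over a black background.
alpha = estimate_alpha((0.5, 0.0, 0.0), fg_dictionary)
```

In this sketch each pixel is coded independently, so nothing forces two pixels with similar characteristics to receive similar α values, which is the consistency shortcoming noted above.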
X. Chen and Q. Chen et al. proposed a method that introduces a non-local prior to obtain the video alpha matte, which improves extraction quality by constructing a non-local structure of the video alpha matte. When implementing said method, a fixed number of sample points is selected directly for each pixel to reconstruct said pixel. However, selecting too few sample points will cause good sample points to be missed, while selecting too many sample points will introduce noise. Furthermore, it is difficult to construct a consistent non-local structure for pixels possessing similar characteristics, which may result in temporal and spatial inconsistency of the video alpha matte; therefore, the quality of a foreground object extracted by adopting said method is poor.
The above two methods have many shortcomings when extracting a video foreground object, which result in poor quality of the extracted foreground object. Therefore, it is necessary to propose a new solution to improve the quality of the extracted foreground object.