As an alternative to motion capture or whole-body replacement, face replacement has been used in film production to achieve realistic results. Face replacement is also applicable to social media, virtual, or direct personal interactions such as online video chats.
While face replacement in photographs can readily achieve realistic results, face replacement in video remains a challenging problem, in part due to large appearance variations caused by lighting conditions, viewing angles, body poses, and mutual occlusions, as well as differing perceptual sensitivity to the static and dynamic elements of faces. Existing methods for video face replacement mainly focus on two aspects: facial motion capture and face editing in images. However, to capture facial motion in video, current systems usually require complex and expensive hardware to obtain a 3D morphable model. Face-editing-based methods, for their part, rely on blending the source face into the target face and do not make full use of the temporal information available in a video sequence.
Another problem often associated with face replacement is that of face alignment. Face alignment aims at locating facial key points in a given 2D image. As with face replacement, large variations in pose, expression, and lighting conditions pose challenges. Available approaches to face alignment include Active Shape Models (ASM) and Active Appearance Models (AAM), which model face shape and appearance using approaches such as Principal Component Analysis (PCA). However, while these methods can achieve promising results on certain datasets, their performance degrades severely on other, more challenging image datasets.
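As an illustration of the PCA-based shape modelling that ASM-style methods rely on, the following is a minimal sketch rather than any particular published implementation. It assumes that landmark shapes have already been aligned and flattened into coordinate vectors; the function names `fit_shape_model` and `project_shape` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_shape_model(shapes, n_components):
    """Fit a linear shape model with PCA.

    shapes: (n_samples, 2 * n_points) array of flattened, pre-aligned
    landmark coordinates (an assumption of this sketch).
    """
    mean_shape = shapes.mean(axis=0)
    centered = shapes - mean_shape
    # Rows of vt are the principal directions of the centered data.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    variances = singular_values[:n_components] ** 2 / (shapes.shape[0] - 1)
    return mean_shape, basis, variances

def project_shape(shape, mean_shape, basis):
    """Express a shape as mean + basis coefficients and reconstruct it."""
    coeffs = basis @ (shape - mean_shape)
    return mean_shape + basis.T @ coeffs, coeffs

# Synthetic training set: 50 shapes of 5 landmarks generated by 2 modes.
rng = np.random.default_rng(0)
true_mean = rng.normal(size=10)
true_modes = rng.normal(size=(2, 10))
train_shapes = true_mean + rng.normal(size=(50, 2)) @ true_modes

mean_shape, basis, variances = fit_shape_model(train_shapes, n_components=2)

# A new shape from the same two modes is reconstructed by the model.
new_shape = true_mean + np.array([0.5, -1.0]) @ true_modes
reconstruction, coeffs = project_shape(new_shape, mean_shape, basis)
```

In this sketch a face shape is summarized by a handful of coefficients, which is what allows an ASM-style fitting procedure to search over plausible shapes only. Shapes that the training set does not span are reconstructed poorly, which is one way to read the degradation on more challenging datasets noted above.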
Other approaches include cascade regression-based methods. Using shape-indexed features, Cascade Pose Regression (CPR) and Explicit Shape Regression (ESR) progressively regress the shape stage by stage over a cascade of random fern regressors, which are learnt sequentially. The Supervised Descent Method (SDM) cascades several linear regression models and achieves superior performance with shape-indexed SIFT features. Robust Cascade Pose Regression (RCPR) improves on CPR with enhanced shape-indexed features and more robust initialization. Local Binary Features (LBF) are learnt for highly accurate and fast face alignment. Furthermore, Coarse-to-Fine Shape Searching (CFSS) can achieve high accuracy by utilizing a coarse-to-fine shape searching method.
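The stage-by-stage refinement common to these cascade methods can be sketched as follows. This is a toy example in the spirit of a cascade of linear regressors, not the SDM, CPR, or ESR algorithm itself. The feature function `extract_features` is a hypothetical stand-in for real shape-indexed features such as SIFT descriptors: here each "image" is represented simply by an array encoding where its landmarks lie, and the feature is a saturating response to the offset between those landmarks and the current estimate.

```python
import numpy as np

def extract_features(images, shapes):
    # Hypothetical stand-in for shape-indexed features: a saturating
    # (nonlinear) response to the landmark offset, so that a single
    # linear stage cannot recover the shape exactly.
    return np.tanh(images - shapes)

def train_cascade(images, true_shapes, init_shapes, n_stages):
    """Learn one linear regressor per stage, sequentially."""
    shapes = init_shapes.copy()
    stages = []
    for _ in range(n_stages):
        feats = extract_features(images, shapes)
        design = np.hstack([feats, np.ones((feats.shape[0], 1))])
        residual = true_shapes - shapes  # what this stage should predict
        weights, *_ = np.linalg.lstsq(design, residual, rcond=None)
        stages.append(weights)
        shapes = shapes + design @ weights  # update before the next stage
    return stages, shapes

def apply_cascade(stages, image, init_shape):
    """Refine an initial shape estimate through all learnt stages."""
    shape = init_shape.copy()
    for weights in stages:
        feats = extract_features(image, shape)
        shape = shape + np.append(feats, 1.0) @ weights
    return shape

# Synthetic data: 200 samples, 5 landmarks (10 coordinates) each.
rng = np.random.default_rng(0)
true_shapes = rng.normal(0.0, 3.0, size=(200, 10))
images = true_shapes  # in this toy, the "image" directly encodes the landmarks
init_shapes = np.zeros_like(true_shapes)  # start every sample at a mean shape

stages, final_shapes = train_cascade(images, true_shapes, init_shapes, n_stages=4)
initial_error = np.mean(np.abs(true_shapes - init_shapes))
final_error = np.mean(np.abs(true_shapes - final_shapes))
```

The point of the sketch is the structure: features are recomputed at the current shape estimate before each stage, and each regressor is trained only on the residual left by the stages before it.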