The invention relates generally to image alignment, and more particularly to a system and method for improved facial alignment via a boosted appearance model.
Image alignment is the process of aligning a template to an image. Specifically, the template may be deformed and moved to minimize the distance between the template and the image. Image alignment models include algorithms designed to optimize (e.g., minimize) the distance between the template and a given image. There are generally three elements to image alignment: template representation, distance measurement, and optimization. An exemplary template may be a simple image patch or may be a more complex template, such as an active shape model (ASM) and/or an active appearance model (AAM). The least-squares error (LSE) and the mean square error (MSE) between the warped image and the template are exemplary distance metrics. Gradient-based methods (e.g., Gauss-Newton, Newton, Levenberg-Marquardt, and so forth) may be employed for optimization.
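By way of illustration only, the three elements described above may be sketched for the simplest possible case. The following is a minimal sketch, assuming a one-dimensional signal in place of an image, a fixed patch as the template, a single shift parameter as the warp, the sum of squared errors as the distance, and Gauss-Newton as the optimizer. The function name `align_shift` and its parameters are hypothetical and are not part of any described embodiment.

```python
import numpy as np

def align_shift(template, signal, p0=0.0, iters=50, tol=1e-6):
    """Estimate a shift p so that signal(x + p) matches template(x).

    Hypothetical illustration: the warp has one parameter (a shift),
    the distance is the sum of squared errors, and each update is a
    Gauss-Newton step.
    """
    x = np.arange(template.size, dtype=float)   # template coordinates
    grid = np.arange(signal.size, dtype=float)  # signal coordinates
    grad = np.gradient(signal)                  # signal derivative (finite differences)
    p = float(p0)
    for _ in range(iters):
        # Warp the signal into the template frame (linear interpolation).
        warped = np.interp(x + p, grid, signal)
        residual = warped - template
        # Jacobian of the warped signal with respect to the shift p.
        jac = np.interp(x + p, grid, grad)
        # Gauss-Newton step; the small constant guards against division by zero.
        dp = -(jac @ residual) / (jac @ jac + 1e-12)
        p += dp
        if abs(dp) < tol:
            break
    mse = float(np.mean((np.interp(x + p, grid, signal) - template) ** 2))
    return p, mse
```

In this sketch the update converges only when the initial shift is close enough to the true shift for the linearization to hold, which mirrors the sensitivity to initialization that gradient-based alignment generally exhibits.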
In AAM, the template representation uses two eigenspaces to model the facial shape and the shape-free appearance, respectively. For the distance metric, the MSE between the appearance instance synthesized from the appearance eigenspace and the warped appearance from the image observation is minimized by iteratively updating the shape and/or appearance parameters. It is well known that AAM-based face alignment has difficulties with generalization. That is, the alignment tends to diverge on images that are not included in the training data used to learn the model, especially when the model is trained on a large dataset. In part, this is because the appearance model only learns the appearance variation present in the training data. When more training data is used to model larger appearance variations, the representational power of the eigenspace remains very limited even at the cost of a much higher-dimensional appearance subspace, which in turn results in a harder optimization problem. Also, using the MSE as the distance metric essentially employs an analysis-by-synthesis approach, so that the generalization capability is further limited by the representational power of the appearance model.
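The analysis-by-synthesis distance described above may be illustrated as follows. This is a minimal sketch, assuming that the shape-free appearance is available as a flat pixel vector and that the appearance eigenspace is obtained by principal component analysis of the training vectors. The function names `fit_appearance_model` and `synthesis_mse` are hypothetical, and the shape eigenspace and the warp are omitted.

```python
import numpy as np

def fit_appearance_model(train, k):
    """Learn a mean appearance and a k-dimensional appearance eigenspace.

    train: array of shape (n_samples, n_pixels), one warped (shape-free)
    appearance vector per row. Returns (mean, basis), where basis has
    shape (k, n_pixels) and orthonormal rows.
    """
    mean = train.mean(axis=0)
    # Rows of vt are the principal directions of the centered training data.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:k]

def synthesis_mse(mean, basis, warped):
    """MSE between the warped observation and its synthesized instance."""
    coeffs = basis @ (warped - mean)        # appearance parameters
    synthesized = mean + basis.T @ coeffs   # appearance instance from the eigenspace
    return float(np.mean((synthesized - warped) ** 2))
```

Under these assumptions, an observation containing variation that lies outside the span of the training data cannot be synthesized by the eigenspace, so its MSE stays bounded away from zero regardless of how the appearance parameters are chosen.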
Challenges to image alignment may generally include any differences between unknown images and the training data. For example, differences in pose, race, lighting, expression, occlusion, and/or resolution may impair the accuracy of image alignment. Accordingly, it may be desirable to provide an improved model for image alignment that accommodates such variations in images.