Active appearance model (AAM) techniques were first described by Edwards et al. [1]. They have been used extensively in applications such as face tracking and analysis, and the interpretation of medical images.
Several variants of the standard AAM techniques have been proposed for grayscale images in order to improve convergence accuracy or speed. Cootes et al. [2] proposed a weighted edge representation of the image structure, claiming more reliable and accurate fitting than the standard representation based on normalized intensities. Other variants include the direct appearance models (DAMs) [3] and the Shape AAMs [4], in which convergence speed is increased by reducing the number of parameters that need to be optimized. The DAM approach shows that shape can be predicted directly from texture when the two are sufficiently correlated. The Shape AAMs use the image residuals to drive the pose and shape parameters only, while the texture parameters are estimated directly by fitting to the current texture.
In [5], a method (CCAAAM) is introduced that uses canonical correlation analysis instead of the more common principal component analysis (PCA) to reduce the dimensionality of the original data. This method is claimed to be faster than the standard approach while achieving nearly the same final accuracy.
An inverse compositional approach is proposed in [6], in which the texture warp is updated by composing it with incremental warps rather than by additively updating the parameters. This method considers shape and texture separately and is shown to increase the efficiency of AAM fitting.
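The difference between an additive parameter update and a compositional warp update can be illustrated with a toy example. The sketch below assumes a plain six-parameter affine warp purely for illustration; it is not the warp used in [6], and the function names are hypothetical.

```python
import numpy as np

def affine_matrix(p):
    """Homogeneous 3x3 matrix of an affine warp W(x; p); p = 0 gives the identity.

    The six-parameter affine form is an assumption made for this sketch only.
    """
    p1, p2, p3, p4, p5, p6 = p
    return np.array([[1.0 + p1, p3, p5],
                     [p2, 1.0 + p4, p6],
                     [0.0, 0.0, 1.0]])

def additive_update(p, dp):
    """Additive scheme: the increment is added to the parameter vector."""
    return np.asarray(p, dtype=float) + np.asarray(dp, dtype=float)

def inverse_compositional_update(p, dp):
    """Compositional scheme: the current warp is composed with the inverse
    of the incremental warp, W(x; p) o W(x; dp)^-1, returned as a matrix."""
    return affine_matrix(p) @ np.linalg.inv(affine_matrix(dp))

# Example: starting from the identity warp, an incremental translation of
# (2, 3) yields a composed warp that translates by (-2, -3).
updated = inverse_compositional_update(np.zeros(6), [0, 0, 0, 0, 2, 3])
```

For a pure translation the two schemes differ only in sign, but for general warps composition and addition give different results, which is why the update rule matters.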
Originally designed for grayscale images, AAMs were later extended to color images. Edwards et al. [7] first proposed a color AAM based on the RGB color space. This approach constructs a color texture vector by concatenating the values of each color channel. However, their results did not indicate that the additional chromaticity data yielded any benefit in accuracy. Furthermore, the extra computation required to process these data suggested that color-based AAMs could not provide useful improvements over conventional grayscale AAMs.
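The channel concatenation described for the RGB model can be sketched as follows. This is a minimal illustration, assuming the texture has already been warped to the reference shape and sampled into an (H, W, 3) array; the function name and the channel ordering R, G, B within the vector are assumptions, not details taken from [7].

```python
import numpy as np

def rgb_texture_vector(patch):
    """Concatenate the R, G, and B values of a shape-normalized patch.

    patch: array of shape (H, W, 3). Returns a vector of length 3*H*W laid
    out as [all R values, all G values, all B values] (assumed ordering).
    """
    patch = np.asarray(patch, dtype=np.float64)
    if patch.ndim != 3 or patch.shape[2] != 3:
        raise ValueError("expected an (H, W, 3) array")
    # Move the channel axis to the front so each channel is contiguous.
    return patch.transpose(2, 0, 1).reshape(-1)
```

The resulting vector is three times as long as its grayscale counterpart, which is consistent with the extra computational cost noted above.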
Stegmann et al. [8] proposed a value, hue, edge map (VHE) representation of image structure. They transformed the image to the HSV (hue, saturation, and value) color space, from which they retained only the hue and value (intensity) components. To these they added an edge map component, obtained using numeric differential operators. A color texture vector was created as in [7], using the V, H, and E components in place of the R, G, and B components. In their experiments they compared the convergence accuracy of the VHE model with that of the grayscale and RGB implementations. They obtained the unexpected result that the RGB model (as proposed in [7]) was slightly less accurate than the grayscale model. The VHE model outperformed both the grayscale and RGB models, but only by a modest amount; however, some applicability to the case of directional lighting changes was shown.
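A VHE-style texture vector can be sketched along the same lines. The text above does not specify the differential operator or any normalization, so the sketch assumes a finite-difference gradient magnitude computed on the value channel; this choice, the function name, and the [0, 1] input range are assumptions and may differ from [8].

```python
import colorsys
import numpy as np

def vhe_texture_vector(rgb_patch):
    """Build a [V..., H..., E...] vector from an (H, W, 3) RGB patch in [0, 1]."""
    rgb = np.asarray(rgb_patch, dtype=np.float64)
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("expected an (H, W, 3) array")
    # Per-pixel RGB -> HSV using the standard library; saturation is discarded.
    to_hsv = np.vectorize(colorsys.rgb_to_hsv)
    h, _s, v = to_hsv(rgb[..., 0], rgb[..., 1], rgb[..., 2])
    # Edge map: gradient magnitude of the value channel (assumed operator).
    gy, gx = np.gradient(v)
    e = np.hypot(gx, gy)
    return np.concatenate([v.ravel(), h.ravel(), e.ravel()])
```

In practice the three components have different ranges, so some per-component scaling would probably be needed before building a statistical model; that step is not shown here.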