1. Technical Field
This disclosure relates to acquiring detailed facial geometry with high resolution diffuse and specular photometric information from multiple viewpoints.
2. Description of Related Art
Digitally reproducing the shape and appearance of real-world subjects has been a long-standing goal of computer graphics. In particular, the realistic reproduction of human faces has received increasing attention in recent years. Some of the best techniques use a combination of 3D scanning and photography under different lighting conditions to acquire models of a subject's shape and reflectance. When both of these characteristics are measured, the models can be used to faithfully render how the object would look from any viewpoint, reflecting the light of any environment. An ideal process would accurately model the subject's shape and reflectance with just a few photographs. However, in practice, significant compromises are typically made between the accuracy of the geometry and reflectance model and the amount of data which must be acquired.
Polarized spherical gradient illumination has been used for acquiring diffuse and specular photometric information and using it in conjunction with structured light scanning to obtain high resolution scans of faces. In addition to the detail in the reconstructed 3D geometry, photometric data acquired with this technique can be used for realistic rendering in either real-time or offline contexts. However, the technique may have significant limitations. The linear polarization pattern may be effective only for a frontal camera viewpoint, forcing the subject to be moved to different positions to scan more than the front of the face. Also, the lighting patterns may require rapidly flipping a polarizer in front of the camera using custom hardware in order to observe both cross-polarization and parallel-polarization states. The reliance on structured light for base geometry acquisition may add scanning time and system complexity, while further restricting the process to single-viewpoint scanning.
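By way of illustration only, the role of the cross-polarization and parallel-polarization states mentioned above may be sketched as a per-pixel difference. The sketch assumes, as a simplification not stated in this disclosure, that the cross-polarized image contains only depolarized diffuse light and that the parallel-polarized image contains the same diffuse contribution plus the polarization-preserving specular reflection. The function name and sample intensities are hypothetical.

```python
def separate_reflectance(parallel, cross):
    """Estimate diffuse and specular intensities per pixel.

    Assumption (illustrative only): cross = diffuse, and
    parallel = diffuse + specular, for intensities on a common scale.
    """
    # The cross-polarized observation is taken as the diffuse estimate.
    diffuse = list(cross)
    # The specular estimate is the difference, clamped at zero so that
    # sensor noise cannot produce negative reflectance.
    specular = [max(p - c, 0.0) for p, c in zip(parallel, cross)]
    return diffuse, specular


# Hypothetical intensities for three pixels.
parallel = [0.80, 0.35, 0.20]
cross = [0.30, 0.30, 0.25]
diffuse, specular = separate_reflectance(parallel, cross)
```

Because both polarization states must be observed for every lighting pattern, the number of photographs doubles, which is one reason the polarizer may need to be flipped rapidly.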
To overcome the viewpoint restriction imposed by active illumination, advanced multiview stereo (MVS) may be used to derive geometry from several high-resolution cameras under diffuse illumination. While the geometric detail derived may not be at the level of skin mesostructure, additional detail may be inferred through a “dark-is-deep” interpretation of the diffuse shading, producing geometric detail correlating to skin pores and creases. Just a single set of simultaneous photographs may suffice as input, allowing even ephemeral poses to be recorded. However, these techniques may be limited in that they record only a diffuse texture map to generate renderings rather than separated reflectance components, and the geometric detail inferable from diffuse shading can vary significantly from the true surface detail, which is more directly evidenced in specular reflections. Also, the single-shot nature of these techniques may not be required for acquiring most facial expressions, as subjects can typically maintain the standard facial expressions used in building facial animation rigs for the handful of seconds required for multi-shot techniques.
While there has been a wide body of work on 3D scanning of objects, scanning of human faces can present specific challenges in obtaining high-quality geometry and reflectance information. There are high resolution techniques for scanning static facial expressions based on laser scanning a plaster cast, such as the scans performed by XYZRGB, Inc. However, such techniques may not be well suited for scanning faces in non-neutral expressions and may not capture reflectance maps.
Real-time 3D scanning systems exist that are able to capture dynamic facial performances. These methods may rely on structured light, unstructured painted face texture, or photometric stereo. However, these methods may be limited: they may not provide sufficient resolution to model facial details, they may assume uniform albedo, or they may be data-intensive. An alternate approach is to first acquire a detailed static scan of the face including reflectance data, then augment it with traditional marker-based facial motion-capture data for large scale deformation, and integrate high resolution video data for medium scale expressive wrinkles.
There are also passive multiview face scanning systems which exploit detail in the observed skin texture under diffuse illumination in order to reconstruct high resolution face scans. While achieving impressive qualitative results for geometry reconstruction, these techniques may rely on synthesis of mesoscopic detail from skin texture that may differ from true surface detail. Furthermore, these techniques may not capture specular reflectance maps which may be useful for realistic rendering.
At the other end of the spectrum, dense lighting and viewpoint measurements have been employed to capture detailed spatially varying facial reflectance. However, such techniques may be data intensive and may not scale well for scanning of non-neutral facial expressions and dynamic facial performances.
There is also a technique for high resolution face scanning of static expressions based on photometric surface normals computed from spherical gradient illumination patterns. This technique may capture separate photometric albedo and normal maps for specular (surface) and diffuse (subsurface) reflection by employing polarization of incident lighting. Photometric normals, in particular the detailed specular normals, may be used to add fine-scale detail to base geometry obtained from structured light. However, a linear polarization pattern may limit the acquisition to a single viewpoint, providing limited coverage of the scanned subject.
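By way of illustration only, one common way to compute a photometric vector from spherical gradient illumination is sketched below. It assumes, as details not stated in this disclosure, three gradient patterns along the x, y, and z axes with intensities ranging from 0 to 1, plus one constant full-on pattern. For diffuse reflection the normalized result approximates the surface normal; for specular reflection it approximates the reflection vector, from which a normal would be derived using the view direction. The function name and sample values are hypothetical.

```python
import math


def gradient_vector(lx, ly, lz, lc):
    """Unit vector from one pixel's intensities under the x, y, z gradient
    patterns (lx, ly, lz) and the constant full-on pattern (lc).

    Assumption (illustrative only): each gradient pattern is (w + 1) / 2 for
    direction component w in [-1, 1], so 2 * l - lc recovers the signed
    response along that axis.
    """
    v = (2.0 * lx - lc, 2.0 * ly - lc, 2.0 * lz - lc)
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        # No directional information at this pixel.
        return (0.0, 0.0, 0.0)
    return tuple(c / length for c in v)


# Hypothetical pixel whose response is balanced in x and y and biased
# toward +z, i.e. a surface facing the +z direction.
normal = gradient_vector(0.5, 0.5, 0.8, 1.0)
```

Normalizing the vector removes the albedo-dependent scale factor, so the direction estimate does not depend on the brightness of the skin at that pixel.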
Other work has extended the technique for capture of dynamic facial performance using high speed photography, as well as moderate acquisition rates using joint photometric alignment of complementary gradients. This technique has been applied to acquiring facial performance from multiple viewpoints. However, the technique may be limited to acquiring unpolarized data for viewpoint independence and employing heuristic post-processing for diffuse-specular separation.
View-independent separation of diffuse and specular reflectance may be achieved by measuring the Stokes parameters of circularly polarized spherical illumination. However, this technique may require four measurements per spherical lighting condition, each with a different linear or circular polarizer in front of the camera, in order to compute the Stokes parameters, and hence may not scale well for multiview acquisition of live subjects.
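By way of illustration only, the four measurements may be combined into Stokes parameters as sketched below. The sketch assumes a textbook arrangement that is not specified in this disclosure: ideal linear polarizers at 0, 45, and 90 degrees and one ideal circular polarizer. The sign of the fourth parameter depends on the handedness convention. The function names and sample intensities are hypothetical.

```python
import math


def stokes_from_measurements(i0, i45, i90, icirc):
    """Stokes parameters (s0, s1, s2, s3) for one pixel.

    Assumption (illustrative only): i0, i45, i90 are intensities behind ideal
    linear polarizers at 0, 45, and 90 degrees, and icirc is the intensity
    behind an ideal circular polarizer.
    """
    s0 = i0 + i90            # total intensity
    s1 = i0 - i90            # horizontal versus vertical linear component
    s2 = 2.0 * i45 - s0      # diagonal linear component
    s3 = 2.0 * icirc - s0    # circular component (sign is convention-dependent)
    return s0, s1, s2, s3


def degree_of_polarization(s0, s1, s2, s3):
    """Fraction of the light that is polarized, between 0 and 1."""
    if s0 <= 0.0:
        return 0.0
    return math.sqrt(s1 * s1 + s2 * s2 + s3 * s3) / s0


# Unpolarized light: every ideal polarizer passes half of the intensity.
unpolarized = stokes_from_measurements(0.5, 0.5, 0.5, 0.5)
# Fully circularly polarized light: the circular polarizer passes all of it.
circular = stokes_from_measurements(0.5, 0.5, 0.5, 1.0)
```

Four polarizer states for every lighting condition, repeated for every camera, is what makes this arrangement costly for multiview capture of a live subject.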