CPC G06T 17/205 (2013.01) [G06T 7/70 (2017.01); G06T 9/00 (2013.01); G06V 10/761 (2022.01); G06V 10/82 (2022.01); G06V 40/174 (2022.01); G06T 2207/30201 (2013.01)] | 11 Claims |
1. A method for three-dimensional (3D)-reconstruction of a human head for rendering a human image, the method being performed by a device including at least one processor and at least one memory, the method comprising:
a) encoding, by using a first convolutional neural network, a single source image into a neural texture, the neural texture having a same spatial size as the single source image and a larger number of channels than the single source image, the neural texture containing local person-specific details;
b) estimating, by a pre-trained detailed expression capture and animation (DECA) system, a face shape, a facial expression, and a head pose by using the single source image and a target image, and providing an initial mesh as a set of faces and a set of initial vertices based on a result of the estimating;
c) providing a predicted mesh of a head mesh based on the initial mesh and the neural texture; and
d) rasterizing 3D reconstruction of a human head based on the predicted mesh, and rendering a human image based on a result of the rasterizing,
wherein the providing the predicted mesh comprises:
rendering the initial mesh into an xyz-coordinate texture;
concatenating the xyz-coordinate texture and the neural texture;
processing, by using a second neural network, a result of the concatenating into a latent geometry map;
bilinear sampling the latent geometry map by using texture coordinates to obtain a vertex-specific feature;
decoding the vertex-specific feature by a multi-layer perceptron for predicting a 3D offset for each vertex; and
adding the predicted 3D offset to the initial vertices to obtain the predicted mesh.
|
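The last three sub-steps of the "providing the predicted mesh" clause (bilinear sampling of the latent geometry map at texture coordinates, decoding by a multi-layer perceptron, and adding the predicted 3D offset to the initial vertices) can be illustrated with a minimal, non-authoritative sketch. It is not the claimed implementation. The names `bilinear_sample`, `mlp_decode`, and `predict_mesh`, the two-layer ReLU perceptron, the UV convention (u over columns, v over rows, normalized to [0, 1]), and the toy weights are all assumptions made for illustration. The latent geometry map is taken as given, standing in for the output of the second neural network; rendering the xyz-coordinate texture and the concatenation step are not shown.

```python
import math


def bilinear_sample(feature_map, u, v):
    """Sample an H x W x C map (nested lists) at normalized UV coordinates.

    Assumed convention: u spans columns, v spans rows, both in [0, 1],
    with 0 and 1 mapping to the centers of the first and last texels.
    """
    h, w = len(feature_map), len(feature_map[0])
    x = min(max(u, 0.0), 1.0) * (w - 1)
    y = min(max(v, 0.0), 1.0) * (h - 1)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    channels = len(feature_map[0][0])
    return [
        (1 - fy) * ((1 - fx) * feature_map[y0][x0][c] + fx * feature_map[y0][x1][c])
        + fy * ((1 - fx) * feature_map[y1][x0][c] + fx * feature_map[y1][x1][c])
        for c in range(channels)
    ]


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def mlp_decode(feature, params):
    """Hypothetical two-layer perceptron mapping a vertex feature to a 3D offset."""
    hidden = [max(0.0, a) for a in linear(params["w1"], params["b1"], feature)]
    return linear(params["w2"], params["b2"], hidden)


def predict_mesh(initial_vertices, uvs, latent_map, params):
    """Sample a feature per vertex, decode an offset, and add it to the vertex."""
    predicted = []
    for vertex, (u, v) in zip(initial_vertices, uvs):
        feature = bilinear_sample(latent_map, u, v)
        offset = mlp_decode(feature, params)
        predicted.append([p + d for p, d in zip(vertex, offset)])
    return predicted


# Toy inputs (illustrative only): a 2 x 2 latent map with 2 channels whose
# values equal the texel's (column, row) position, and hand-picked weights.
latent_map = [
    [[0.0, 0.0], [1.0, 0.0]],
    [[0.0, 1.0], [1.0, 1.0]],
]
params = {
    "w1": [[1.0, 0.0], [0.0, 1.0]],
    "b1": [0.0, 0.0],
    "w2": [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
    "b2": [0.0, 0.0, 0.0],
}
initial_vertices = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
uvs = [(0.5, 0.25), (1.0, 0.0)]

predicted = predict_mesh(initial_vertices, uvs, latent_map, params)
```

With these toy weights the decoded offset for each vertex is simply its sampled (u, v) feature padded with a zero, so the first vertex moves to (0.5, 0.25, 0.0) and the second to (2.0, 1.0, 1.0). In a trained system the perceptron weights and the latent map would be learned, and the faces of the initial mesh would be carried over unchanged.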