Multi-view stereo imaging refers to acquiring, generating, and transmitting a scene based on the principle of multi-view stereo vision and presenting the scene with stereoscopic vision. Multi-view stereo imaging is widely used in fields such as virtual reality, machine vision, multimedia teaching, digital entertainment, product appearance design, sculpture, architecture, and the like. The basic process of multi-view stereo imaging is to capture images of the same object with multiple cameras at different positions at the same time, match the pixels across the obtained images, and finally derive the depth information (i.e., 3D coordinates) of the individual pixels based on, for example, the triangulation principle, thereby obtaining a 3D stereo image.
FIG. 1 illustrates the triangulation principle. As shown in FIG. 1, two cameras at different positions each capture an image of the same object. Then, for a point seen by one camera (i.e., a pixel in the image captured by that camera), the corresponding point seen by the other camera (i.e., the matching pixel in the image captured by the other camera) is found. The intersection of the two lines respectively connecting the two points to their cameras (referred to simply as 3D radial lines) is derived, and thus the coordinates of the point to be measured can be obtained. Similarly, when capturing with three or more cameras, corresponding points seen by the other cameras can be found for a point seen by one camera, and the pairwise intersections of the 3D radial lines are derived, thereby obtaining the coordinates of the point to be measured.
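The triangulation step described above can be sketched in code. The following is a minimal illustration, not the document's actual method: it assumes each camera is represented by its centre and a ray direction towards the observed pixel, and computes the least-squares point closest to all rays (the exact intersection when the 3D radial lines truly meet). The function and variable names are hypothetical.

```python
import numpy as np

def triangulate_rays(origins, directions):
    """Least-squares intersection of 3D radial lines.

    Each ray i is given by a camera centre o_i and a direction d_i.
    The point x minimizing the summed squared distances to all rays
    satisfies  sum_i (I - d_i d_i^T) x = sum_i (I - d_i d_i^T) o_i,
    where each (I - d_i d_i^T) projects onto the plane normal to d_i.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)            # unit ray direction
        P = np.eye(3) - np.outer(d, d)       # projector orthogonal to d
        A += P
        b += P @ np.asarray(o, dtype=float)
    return np.linalg.solve(A, b)

# Two cameras at different positions both observe the point (0, 0, 5);
# the rays intersect exactly, so the estimate recovers that point.
target = np.array([0.0, 0.0, 5.0])
cams = [np.array([-1.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])]
rays = [target - c for c in cams]
point = triangulate_rays(cams, rays)  # point is close to (0, 0, 5)
```

With three or more cameras, the same function handles the over-determined case: when noisy matching prevents the radial lines from meeting exactly, it returns the single point closest to all of them.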
An important problem with multi-view stereo imaging is the non-diffuse reflection phenomenon, also called the specular highlight phenomenon. When there is a specular highlight on the surface of the object, the image of the highlighted region depends on the normal direction of the object surface and the positions of the light source and cameras, so a systematic offset may arise in pixel matching, which distorts the shape information of the captured object. This problem is especially serious when the specular highlight occurs at protuberant positions of the object surface.
FIG. 2 is a schematic view showing the distortion that occurs in the shape information of the captured object due to the specular highlight. As shown in FIG. 2, assume that for a pixel located on the protrusion of the object in the image captured by camera 1, matching pixels are sought in the images captured by cameras 2 and 3, respectively. Since the normal direction on the surface of the protrusion changes rapidly and monotonically, the interference of the specular highlight causes the matching pixel found in the image captured by camera 2 to shift to the right of the actual correct position, and the matching pixel found in the image captured by camera 3 to shift even further to the right. When the rightward shifts of the matching pixels found for cameras 2 and 3 happen to fit, the 3D radial lines corresponding to the respective cameras may intersect at a single point (3D point) during triangulation, which may be mistaken for a successfully reconstructed 3D point. In reality, however, this 3D point lies behind the actual position of the protrusion, so a place that should have been convex becomes concave in the output reconstruction result.
With respect to the specular highlight problem described above, research has been conducted in the art, and both hardware-modification-based and software-processing-based methods have been proposed. A hardware-modification-based method, for example, adds a polarization plate in front of the camera and the light source, which increases the hardware complexity of the system and reduces the efficiency of the light source. A software-processing-based method, for example, determines whether the grey-level values of the pixels are too bright; however, such a method is not robust enough and cannot handle cases where the specular highlight, such as one generated by skin, is not particularly bright.
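The brightness-thresholding approach criticized above can be made concrete with a short sketch. This is an illustrative assumption about how such a method might be implemented, not the actual prior-art algorithm; the function name and the threshold value are hypothetical.

```python
import numpy as np

def detect_highlight_mask(gray, threshold=240):
    """Naive specular-highlight detection by grey-level thresholding.

    Flags pixels whose 8-bit grey level meets or exceeds a fixed
    brightness threshold. As the text notes, this fails for dim
    highlights (e.g., on skin) whose specular reflection does not
    come close to saturating the sensor.
    """
    return np.asarray(gray) >= threshold

# Synthetic 8-bit image: mid-grey background with two highlights.
img = np.full((4, 4), 120, dtype=np.uint8)
img[1, 2] = 250   # bright, near-saturated highlight -> detected
img[2, 1] = 180   # dim highlight (e.g., on skin)    -> missed
mask = detect_highlight_mask(img)
```

In this example the mask captures only the near-saturated pixel and misses the dim one, which is precisely the robustness limitation of thresholding-based software methods described above.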