Vision is the most direct and principal means by which humans observe and understand the world. In the three-dimensional world we live in, human vision perceives not only the luminance, color, texture, and motion of an object's surface, but also the object's shape and spatial position (depth, distance). Enabling machine vision to obtain highly precise depth information in real time, and thereby raising the intelligence level of machines, remains a challenge in the development of current machine vision systems. A 3D depth perception device, as a novel stereoscopic visual sensor, can obtain high-precision, high-resolution depth map information (distance information), recognize three-dimensional images in real time, capture motion, and perceive a scene. Currently, “the virtual world is infinitely closer to the real world; the human-machine interaction mode will become more natural, intuitive, and immersive.” As a “portal device” for interaction between the real physical world and the virtual network world, the 3D depth perception device (RGB+Depth) will likely replace the traditional RGB camera in the near future and become a ubiquitous and important device in the real world, giving machines and intelligent devices a 3D visual perception capability like that of human eyes. This facilitates natural interaction between humans and machines, virtual interaction between humans and the web world, and even interaction between machines. With the in-depth development of industries such as unmanned aerial vehicles, 3D printing, robotics, virtual reality headsets, smartphones, smart homes, face-recognition payment, and intelligent surveillance, problems such as environment perception, natural human-machine interaction, obstacle avoidance, 3D scanning, and accurate recognition need to be solved. The 3D depth perception sensor processor technology, as a key generic technology, helps tackle these problems. It will also greatly release and inspire people's scientific imagination and creativity in the relevant fields of study.
Three-dimensional depth technology based on structured-light encoding can obtain depth information more accurately. Compared with binocular stereoscopic cameras and the ToF (Time of Flight) approach, its depth map information is more stable and reliable and less affected by ambient light, and its stereoscopic matching algorithm is simple. As a depth perception technology that is highly cost-effective, highly reliable, and adaptable to a wide working range, it will become a dominant technology by which human-machine interaction systems and intelligent devices acquire depth.
In the prior art, the monocular mode has the advantage of a simple structure: it acquires depth with a single receiving camera and is therefore suited to application scenarios that require a small device volume. The binocular mode has the advantage of capturing finer depth map detail, i.e., the depth information has a higher resolution and a higher depth precision; it also has a wider application scope, including outdoor scenarios.
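As an illustration of how triangulation-based depth acquisition relates a measured pixel offset (disparity) to distance, the following minimal sketch uses the standard relation Z = f·b/d. It assumes an idealized, rectified pair of devices separated by a baseline b, with focal length f expressed in pixels. The function names and numeric values are hypothetical and are not taken from the prior art described here.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Triangulated depth Z = f * b / d for an idealized rectified pair.

    All parameters are illustrative assumptions:
    disparity_px    -- measured offset between matched points, in pixels
    focal_length_px -- focal length expressed in pixels
    baseline_m      -- distance between the two optical centers, in meters
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


def depth_step_per_pixel(depth_m, focal_length_px, baseline_m):
    """Approximate depth change for a one-pixel disparity error: Z^2 / (f * b)."""
    return depth_m ** 2 / (focal_length_px * baseline_m)


if __name__ == "__main__":
    f_px, b_m = 500.0, 0.1  # hypothetical focal length and baseline
    z = depth_from_disparity(50.0, f_px, b_m)
    print(f"depth: {z:.3f} m")
    print(f"depth step per pixel: {depth_step_per_pixel(z, f_px, b_m):.3f} m")
```

With these assumed values, a disparity of 50 pixels corresponds to a depth of about 1 m, and a one-pixel disparity error at that distance corresponds to roughly 0.02 m. Because the depth step grows with the square of the distance, the achievable depth precision depends strongly on the baseline and on how finely disparity can be resolved.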