A natural and harmonious mode of human-machine interaction, in which a machine understands commands that people convey in a natural state, is the ideal objective for human beings in operating machines. Depth perception technology, as a core technology for natural human-machine interaction, has wide application prospects in fields such as machine vision, intelligent monitoring, 3D reconstruction, somatosensory interaction, 3D printing, and unmanned aerial vehicles. A structured light-based active vision mode can obtain the depth information of an image relatively accurately, e.g., by projecting a fixed pattern onto the surface of an object with an infrared laser so as to encode the surface, collecting the infrared encoded images with an image sensor, and then calculating the depth information of the object through depth perception computation. The generated depth information may be used for real-time three-dimensional image recognition and motion capture, making it possible for people to interact with a terminal in natural ways such as expressions, gestures, and somatosensory actions. Compared with ToF (Time of Flight), the structured-light encoded three-dimensional depth perception technology has certain advantages in cost and performance.
Existing three-dimensional depth perception devices have fixed working ranges. For example, the first generation (based on the PrimeSense structured-light module) and the second generation (based on a ToF module) of Microsoft Kinect have a working range of about 0.6 to 5 meters and are mainly used for somatosensory action identification at a certain distance, e.g., in home entertainment; the Intel RealSense 3D depth camera has a working range of about 0.2 to 2 meters and is used for near-range identification of gestures and facial expressions. None of these existing apparatuses provides an adjustable working range, that is, the ability of a single apparatus to identify near-range gestures or human faces, to identify somatosensory actions several meters away, and even to track and identify pedestrians at a farther distance (e.g., 10 meters away).