Three-dimensional vision is an important requirement for a variety of machine vision applications, for example self-driving cars, autonomous robots, augmented reality devices, entertainment systems, gesture recognition, face tracking and 3D modelling.
Ranging devices such as lidars or time-of-flight cameras require sub-nanosecond temporal resolution to measure the time an emitted light pulse takes to travel to a surface and back. Measurements of this kind demand expensive setups involving either moving parts (lidar) or very complex and large pixel circuits (time-of-flight).
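The need for sub-nanosecond timing follows from the round-trip relation between time and distance: light covers roughly 30 cm per nanosecond, so a timing uncertainty of one nanosecond corresponds to about 15 cm of depth uncertainty. The following minimal sketch illustrates this arithmetic; the function name `range_resolution` and the chosen timing values are illustrative assumptions and do not describe any particular device.

```python
# Speed of light in vacuum, in metres per second.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_resolution(timing_resolution_s: float) -> float:
    """Depth uncertainty (metres) for a given round-trip timing uncertainty.

    The pulse travels to the surface and back, so only half of the
    distance covered during the timing uncertainty counts as depth.
    """
    return SPEED_OF_LIGHT_M_PER_S * timing_resolution_s / 2.0


if __name__ == "__main__":
    # Illustrative timing resolutions: 1 ns, 100 ps and 10 ps.
    for dt in (1e-9, 100e-12, 10e-12):
        depth_mm = range_resolution(dt) * 1000.0
        print(f"{dt * 1e12:.0f} ps timing -> {depth_mm:.1f} mm depth")
```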
Passive vision systems, such as stereo vision or structure-from-motion, overcome these limitations but require substantial computational resources and are only functional in environments with sufficient lighting and spatial contrast.
Active vision systems based on structured lighting, on the other hand, combine the advantages of an active light source with the simple data acquisition of a vision system.
In active vision systems, depth from structured lighting is obtained in the following way: a well-known pattern is projected onto a scene. The reflections of the pattern are captured by a camera which is mounted at a fixed baseline distance from the projector. Geometrical constraints (epipolar geometry) and the captured position of a projected pattern feature allow the depth of the underlying surface to be inferred. In order to obtain dense depth maps, many small projected features are required. To identify these features, they should either be unique, as in the case of random dot patterns (e.g. Microsoft's Kinect), or multiplexed in time (e.g. Intel's RealSense or laser line scanners). However, patterns of unique features limit the spatial resolution and require computationally expensive matching algorithms, while time-multiplexed patterns are constrained by the temporal resolution of the sensor and can suffer from motion artefacts if that temporal resolution is not sufficiently high compared to the motion captured in the scene.
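The triangulation step described above can be illustrated with a minimal sketch. It assumes an idealised, rectified projector-camera pair modelled as pinhole devices with a shared focal length expressed in pixels, so that the epipolar lines are horizontal and depth follows from the horizontal offset (disparity) between the column at which a feature was projected and the column at which it was observed. The function name, parameter names and numerical values are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(
    focal_length_px: float,
    baseline_m: float,
    projected_x_px: float,
    observed_x_px: float,
) -> float:
    """Depth (metres) of a surface point by triangulation.

    Assumes a rectified projector-camera pair (hypothetical setup):
    depth = focal_length * baseline / disparity.
    """
    # Disparity: horizontal offset between where the feature was
    # projected and where the camera observed it, in pixels.
    disparity_px = abs(projected_x_px - observed_x_px)
    if disparity_px == 0:
        # Zero disparity corresponds to a point at infinite distance.
        raise ValueError("zero disparity: depth is unbounded")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Illustrative values only: 600 px focal length, 7.5 cm baseline.
    z = depth_from_disparity(600.0, 0.075, 400.0, 355.0)
    print(f"estimated depth: {z:.3f} m")
```

Because depth is inversely proportional to disparity, a fixed error in locating a feature translates into a larger depth error for distant surfaces, which is one reason why reliable identification of each projected feature matters.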
It is an aim of the present invention to mitigate at least some of the above-mentioned disadvantages.