In recent decades, various depth cameras have been developed to represent the physical world in three dimensions (3D), such as time-of-flight (TOF) cameras, stereo cameras, laser scanners, and structured light cameras. These depth cameras are not as popular as two-dimensional (2D) red-green-blue (RGB) cameras due to their high cost and enormous computing requirements.
Each of these depth cameras aims to measure the distance from the camera to a target object by exploiting properties of light waves, but their working principles vary. For example, a TOF camera measures depth by detecting the phase shift of a light wave after reflection, while a stereo camera generates a disparity map by stereo matching. Depth generated by these different devices exhibits different data characteristics.
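By way of illustration only, the two working principles mentioned above may be expressed with their commonly used textbook relations. The following sketch assumes a continuous-wave modulated TOF camera and a rectified stereo pair with a pinhole model; neither assumption is stated in the text. The function names and parameters (modulation frequency, focal length in pixels, baseline) are hypothetical and do not describe any particular device.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def tof_depth(phase_shift_rad, modulation_hz):
    """Depth from the phase shift of reflected modulated light.

    Assumes continuous-wave modulation: the light travels to the object
    and back, so depth = c * phase / (4 * pi * f).
    """
    return SPEED_OF_LIGHT * phase_shift_rad / (4.0 * math.pi * modulation_hz)


def stereo_depth(disparity_px, focal_px, baseline_m):
    """Depth from stereo disparity for a rectified pair: Z = f * B / d.

    A disparity of zero or less carries no depth information, so the
    point is treated as infinitely far away.
    """
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the functions.
    print(tof_depth(math.pi / 2, 30e6))
    print(stereo_depth(50, 500.0, 0.1))
```

Because depth is inversely proportional to disparity in the stereo relation, a fixed disparity error produces a larger depth error for distant objects than for near ones, which is one reason the depth produced by different devices exhibits different characteristics.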
In another example, in the structured light camera used by the Kinect® gaming device, depth is derived from the disparity between the projected infrared light pattern and the received infrared light. The granularity and the stability of the received light speckles directly determine the resolution and the quality of the depth data. The captured depth sequence is characterized by large variation in range and by instability. Like depth derived from stereo video, the Kinect® depth suffers from depth holes and boundary mismatching due to deficiencies in the received light speckles. Moreover, even when the light speckles have been received by the sensor, the generated depth sequence is unstable in the temporal domain due to variation in the received light. Depth data is likely to change from frame to frame, even when representing a static scene. While filtering can be used to improve depth sequences that are unstable in the temporal domain, the depth holes found in these depth images and the error associated with depth measurements often frustrate successful filtering.
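The two problems described above, depth holes and temporal instability, can be quantified with a short sketch such as the following. It is illustrative only and assumes that each frame is a list of rows of depth values and that a value of 0 marks a pixel with no depth reading; both the data layout and the hole marker are assumptions, as are the function names.

```python
from statistics import pstdev

INVALID = 0  # assumed marker for a pixel with no depth reading (a depth hole)


def hole_ratio(frame, invalid=INVALID):
    """Fraction of pixels in one depth frame that are holes."""
    pixels = [d for row in frame for d in row]
    return sum(1 for d in pixels if d == invalid) / len(pixels)


def temporal_std(frames, invalid=INVALID):
    """Per-pixel standard deviation of depth over a sequence of frames.

    Hole samples are skipped. A pixel that is a hole in every frame
    yields None, because no valid measurement exists for it.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            samples = [f[r][c] for f in frames if f[r][c] != invalid]
            out_row.append(pstdev(samples) if samples else None)
        result.append(out_row)
    return result


if __name__ == "__main__":
    # Two hypothetical 2x2 frames of a static scene.
    frames = [
        [[1000, 0], [2000, 0]],
        [[1010, 0], [2000, 1500]],
    ]
    print(hole_ratio(frames[0]))
    print(temporal_std(frames))
```

For a static scene, a nonzero per-pixel deviation reflects measurement instability rather than motion. The sketch also shows why holes complicate temporal filtering: a pixel with few or no valid samples leaves the filter with little or nothing to average.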
In addition, compression of depth data generated by a depth camera, such as the structured light camera used by the Kinect® gaming device, is problematic. The size of the depth data imposes significant transmission and storage costs. While image/video compression methods can, in the abstract, be used for depth data compression, the noise and instability of the depth data associated with the depth images make actual use of such image/video compression methods problematic.