This disclosure generally relates to image processing, and more specifically to computing high-resolution depth images using a machine learning model trained on high-resolution images of color information and corresponding images of depth information at relatively low resolutions.
Cameras capture images and videos of a scene using various sensors. As an example, a camera sensor, such as a color sensor, can be specifically configured to capture two-dimensional color information regarding the scene. Alternatively, a camera sensor, such as a depth sensor, can be configured to capture depth information regarding the scene. Depending on the camera configuration, camera sensors in a single camera may capture images at different resolutions. Specifically, conventional cameras employ depth sensors that capture an image with a resolution that is far below the resolution of an image captured by a color sensor. Higher-resolution depth sensors are cost prohibitive, thereby precluding their inclusion in conventional cameras. Moreover, constructing high-resolution depth images using existing methods, by combining high-resolution color images with corresponding depth images at relatively lower resolutions, is problematic.
Instead of employing a high-resolution depth sensor, conventional systems employ image processing techniques that upsample a lower-resolution depth image to match the resolution of a corresponding color image. One example of a conventional image processing technique is pixel interpolation, in which interpolated pixels are generated from two or more existing pixels. For example, if a first existing pixel has a first value and a second existing pixel has a different second value (i.e., a depth discontinuity), then the interpolated pixels between the first and second existing pixels may be assigned values that smooth the transition between the first existing pixel and the second existing pixel. Such smoothed values may not correspond to the depth of any object in the scene. Altogether, these conventional image processing techniques remain sub-optimal and fail to accurately upsample the depth image. Therefore, the subsequent processing of an upsampled depth image and corresponding color images also suffers due to the non-optimally upsampled depth image.
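As an illustrative sketch only, the smoothing effect of pixel interpolation at a depth discontinuity can be shown on a single row of depth values. The function name `upsample_linear`, the upsampling factor, and the sample depth values below are hypothetical and chosen for illustration; the sketch assumes simple linear interpolation between neighboring pixels and is not taken from any particular conventional system.

```python
def upsample_linear(depth_row, factor):
    """Upsample a 1-D row of depth values by linear interpolation.

    Hypothetical helper for illustration: places (factor - 1) interpolated
    samples between each pair of neighboring input samples.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    if not depth_row:
        return []
    out = []
    for left, right in zip(depth_row, depth_row[1:]):
        for step in range(factor):
            t = step / factor  # fractional position between the two pixels
            out.append((1 - t) * left + t * right)
    out.append(float(depth_row[-1]))  # keep the final existing pixel
    return out


# Assumed example: a near surface at depth 1.0 next to a far surface at 5.0.
low_res = [1.0, 1.0, 5.0, 5.0]
high_res = upsample_linear(low_res, 4)

# Depth values in the upsampled row that no existing pixel measured.
invented = sorted(set(high_res) - set(low_res))
print(high_res)
print(invented)
```

In this sketch, the interpolated pixels between the two surfaces receive intermediate depths that appear in neither existing pixel, so the sharp boundary in the low-resolution row becomes a gradual ramp in the upsampled row.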