Augmented Reality (AR) refers to technology that provides a composite view of a real-world environment together with a virtual-world environment (e.g., computer-generated input). Correct perception of depth is often needed to deliver a realistic and seamless AR experience. For example, in AR-assisted maintenance or manufacturing tasks, the user tends to interact frequently with both real and virtual objects. However, without correct depth perception, it is difficult to provide a seamless interaction experience with appropriate occlusion handling between the real-world scene and the virtual-world scene.
In general, real-time 3D sensing is computationally expensive and requires high-end sensors. To reduce this overhead, some early work relies on 2D contour tracking to infer an occlusion relationship, which is typically assumed to be fixed. Other work builds 3D models of the scene offline and uses these models online for depth testing, on the assumption that the scene is static. Although these methods can achieve some occlusion handling effects, they cannot accommodate the dynamic nature of user interactions, which are very common in AR applications.
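As a non-limiting illustration, the depth testing mentioned above may be sketched as a per-pixel comparison between a real-scene depth map and a rendered virtual depth map. The function name, the list-of-lists image layout, the use of `None` for a missing depth value, and the choice to show the virtual pixel when the real depth is unknown are all assumptions made for this sketch and are not taken from any particular system.

```python
def composite_with_depth_test(real_rgb, real_depth, virt_rgb, virt_depth):
    """Hypothetical per-pixel depth test for occlusion handling.

    All inputs are row-major 2D lists of equal size. A depth of None marks
    a pixel with no real measurement or no virtual content (assumption).
    """
    out = []
    for y, row in enumerate(real_rgb):
        out_row = []
        for x, real_px in enumerate(row):
            vd = virt_depth[y][x]
            rd = real_depth[y][x]
            # The virtual pixel is shown only if it exists and is closer to
            # the camera than the real surface (or the real depth is unknown).
            if vd is not None and (rd is None or vd < rd):
                out_row.append(virt_rgb[y][x])
            else:
                out_row.append(real_px)
        out.append(out_row)
    return out


if __name__ == "__main__":
    real_rgb = [["R", "R", "R"]]
    real_depth = [[1.0, 2.0, None]]
    virt_rgb = [["V", "V", "V"]]
    virt_depth = [[1.5, 1.5, None]]
    print(composite_with_depth_test(real_rgb, real_depth, virt_rgb, virt_depth))
```

Because the result depends entirely on the accuracy of the real-scene depth at each pixel, a depth map that is stale (static offline model) or noisy at object boundaries produces incorrect occlusion exactly where real and virtual objects meet.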
More recently, lightweight RGB-Depth (RGB-D) cameras have provided some 3D sensing capabilities for AR applications. However, these RGB-D cameras typically have low-cost, consumer-grade depth sensors, which usually suffer from various types of noise, especially around object boundaries. Such limitations typically cause undesirable visual artifacts when these lightweight RGB-D cameras are used for AR applications, thereby preventing a satisfactory AR experience. A considerable amount of research has addressed depth map enhancement to improve the quality of the sensor data provided by these lightweight RGB-D cameras. However, the majority of these approaches cannot be directly applied to AR use cases due to their high computational cost.
In addition, filtering is often used for image enhancement. Examples include a joint bilateral filtering process and a guided image filtering process. Other examples include a domain transform process, an adaptive manifolds process, and an inpainting process. However, these processes are typically computationally expensive and often result in edge blurring, thereby causing interpolation artifacts around boundaries.
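As a non-limiting illustration of one such process, a joint bilateral filter may be sketched as follows, in which a depth map is smoothed using weights computed from a separate guide image (e.g., an intensity image). This is a minimal, unoptimized sketch; the function name, parameter names, default values, and list-of-lists layout are assumptions made for illustration only.

```python
import math


def joint_bilateral_filter(depth, guide, radius=1, sigma_s=1.0, sigma_r=10.0):
    """Hypothetical joint bilateral filter on 2D lists of equal size.

    Each output depth is a weighted average of neighboring depths. The
    weight combines spatial distance with similarity in the guide image.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            norm = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue  # skip neighbors outside the image
                    # Spatial weight: falls off with pixel distance.
                    ws = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
                    # Range weight: falls off with guide-image difference.
                    diff = guide[ny][nx] - guide[y][x]
                    wr = math.exp(-(diff * diff) / (2.0 * sigma_r ** 2))
                    acc += ws * wr * depth[ny][nx]
                    norm += ws * wr
            # norm >= 1 because the center pixel always has weight 1.
            out[y][x] = acc / norm
    return out
```

The nested loops visit (2 × radius + 1)² neighbors for every pixel, which indicates why such filtering is costly at real-time frame rates. In addition, where the edge in the guide image is soft or is misaligned with the true depth boundary, the range weights no longer separate the two surfaces, and depth values from both sides are averaged together, yielding the blurred, interpolated depths around boundaries noted above.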