Ray tracing is a technique for computing visibility between points. Light transport algorithms simulate the way light rays propagate through space while interacting with objects, yielding the resulting colors of the screen pixels. Primary rays are shot from the screen into the scene, where they intersect scene objects and generate secondary rays that bounce throughout the scene according to the physical laws of optics. (Note: unless specifically stated otherwise, the general term secondary rays, as used throughout this specification, refers to secondary, tertiary, and higher-order rays.)
Ray tracing is capable of producing a very high degree of visual realism, higher than that of typical raster methods, but at a greater computational cost. Ray tracing is superior to raster graphics in its capability to simulate a wide variety of optical effects, such as glossiness, specularity, radiosity, reflection and refraction, scattering, soft shadows and more. Prior art ray tracing is one of the most computationally complex applications. As such, it is best suited for applications where the image can be rendered slowly ahead of time, such as still images and film and television visual effects, and is poorly suited for real-time animated applications, such as augmented reality, where real-time animation is critical.
Path tracing. Traditional ray tracing is not necessarily photorealistic. True photorealism occurs when the rendering equation is closely approximated or fully implemented, as the equation describes every physical effect of light flow. How closely it can be approximated, however, depends on the available computing resources. Path tracing, also referred to as Monte Carlo ray tracing, gives a far more accurate simulation of real-world lighting. Traditional path tracers [Kajiya, J. T. 1986. The rendering equation. In Proc. SIGGRAPH] shoot rays through each pixel, stochastically scattering according to the profile of the intersected object's reflectance and continuing recursively until striking a light source. Repeated sampling for any given pixel in the image space will eventually cause the average of the samples to converge to the correct solution of the rendering equation, making it one of the most physically accurate 3D graphics rendering methods in existence. Path tracing can generate images that are indistinguishable from photographs. Its visual quality is higher than that of simple ray tracing, but at a much greater computational cost. (Note: unless specifically stated otherwise, the general term ray tracing, as used throughout this specification, refers to path tracing.)
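For reference, the rendering equation cited above is commonly written in the following hemispherical form. The symbols are the conventional ones and are not taken from this specification: L_o is the outgoing radiance at surface point x in direction ω_o, L_e the emitted radiance, f_r the BRDF, L_i the incoming radiance from direction ω_i, n the surface normal, and Ω the hemisphere around n.

```latex
L_o(x, \omega_o) = L_e(x, \omega_o)
  + \int_{\Omega} f_r(x, \omega_i, \omega_o)\, L_i(x, \omega_i)\, (\omega_i \cdot n)\, \mathrm{d}\omega_i
```

Path tracing estimates the integral term by averaging randomly sampled directions ω_i, which is why the per-pixel average converges as more samples are taken.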
A path tracer continuously samples pixels of the screen space. Rays are distributed randomly within each pixel in screen space and at each intersection with an object in the scene a new reflection ray, pointing in a random direction, is generated. After some number of bounces each ray eventually exits the scene or is absorbed.
Path tracing is a global illumination technique. Global illumination takes into account not only the light that comes directly from a light source, but also light reflected by surfaces in the scene, whether specular, diffuse, or semi-reflective. FIG. 1 depicts the sampling of diffuse inter-reflection from the surrounding environment, at a given surface point. In order to achieve global illumination on a diffuse surface, sampling rays must be shot from a hit point (HIP) 10. HIP is a result of a previous encounter between a ray (primary or secondary) and a triangle. The sampling at the HIP is done by shooting many rays, each in a random direction, within the boundaries of a hemisphere 11. The hemisphere is oriented such that its north pole is aligned with the surface normal N.
The relation between the deviation of a sampling ray from the normal N in the hemisphere and its contribution to the aggregated light energy is shown in FIG. 2. It is strongly connected with the BRDF (bidirectional reflectance distribution function) of the surface material. The BRDF is a function of four real variables that defines how light is reflected at an opaque surface. According to the Monte Carlo technique, each of the hemisphere rays is shot from the same HIP but in a random direction, bounded by the hemisphere of FIG. 1. As a result, each ray's sampled data contributes to the aggregated light energy at the HIP according to the BRDF.
Hybrid ray tracing (ray tracing interlaced with raster rendering) is a deferred rendering process that uses raster rendering to calculate the primary ray collisions, while the secondary rays use a ray tracing approach to obtain shadow, reflection and refraction effects. This approach vastly improves ray tracing performance, not only because many unnecessary traditional ray tracing tasks are avoided, but also because a complete image is available within the required time, even if there is not enough time to finish the calculations of all the visual effects.
The concept of a hybrid real-time raster and ray tracing renderer is not new. Beck et al. [Beck, S., Bernstein, A.-C., Danch, D., Frohlich, B.: CPU-GPU hybrid real time ray tracing framework (2005)] propose a CPU-GPU real-time ray tracing framework. Their proposal splits the traditional stages of ray tracing into independent tasks for the GPU and the CPU. These render tasks can be summarized as three GPU render passes: a shadow map generation pass, a geometry identification pass and a blur pass.
Bikker [Bikker, J.: Real-time ray tracing through the eyes of a game developer. In: Proceedings of the 2007 IEEE Symposium on Interactive Ray Tracing, Washington, DC, USA, IEEE Computer Society (2007)] developed a real-time path tracer called Brigade, which divides the rendering task seamlessly over the available GPU and CPU cores. Brigade is aimed at the production of proof-of-concept games that use path tracing as the primary rendering algorithm.
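The hemisphere sampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any cited work: the surface is assumed to be Lambertian (BRDF = albedo/π), directions are drawn uniformly over the hemisphere, and the names `sample_hemisphere`, `estimate_reflected_radiance` and `incoming_radiance` are hypothetical.

```python
import math
import random


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def sample_hemisphere(normal, rng):
    """Return a uniformly distributed unit direction in the hemisphere around `normal`."""
    while True:
        # Rejection-sample a point inside the unit ball, then project it onto the sphere.
        d = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
        n2 = dot(d, d)
        if 1e-12 < n2 <= 1.0:
            break
    inv = 1.0 / math.sqrt(n2)
    d = (d[0] * inv, d[1] * inv, d[2] * inv)
    # Flip directions that point below the surface into the upper hemisphere.
    if dot(d, normal) < 0.0:
        d = (-d[0], -d[1], -d[2])
    return d


def estimate_reflected_radiance(normal, incoming_radiance, albedo, num_rays, rng):
    """Monte Carlo estimate of the light reflected at a HIP.

    Assumption: Lambertian BRDF (albedo / pi) and uniform hemisphere sampling,
    whose probability density is 1 / (2 * pi).
    """
    brdf = albedo / math.pi
    pdf = 1.0 / (2.0 * math.pi)
    total = 0.0
    for _ in range(num_rays):
        d = sample_hemisphere(normal, rng)
        cos_theta = dot(d, normal)  # rays far from the normal contribute less
        total += brdf * incoming_radiance(d) * cos_theta / pdf
    return total / num_rays


if __name__ == "__main__":
    rng = random.Random(1)
    # Constant white environment: the exact answer equals the albedo.
    print(estimate_reflected_radiance((0.0, 0.0, 1.0), lambda d: 1.0, 0.5, 20000, rng))
```

Under a constant environment the estimate approaches the albedo as the number of rays grows, which gives a simple sanity check for the sampler. A production path tracer would more likely draw directions in proportion to the BRDF to reduce variance.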
Pawel Bak [Bak, P.: Real time ray tracing. Master's thesis, IMM, DTU (2010)] implements a Real-Time Ray Tracer using DirectX 11 and HLSL. Similar to Beck's work, his approach also uses rasterization in order to achieve the best possible performance for primary hits.
Chen [Chen, C. C., Liu, D. S. M.: Use of hardware z-buffered rasterization to accelerate ray tracing. In: Proceedings of the 2007 ACM symposium on Applied computing. SAC '07, New York, N.Y., USA, ACM (2007) 1046-1050] presented a hybrid GPU/CPU ray tracing renderer, in which Z-buffered rasterization is performed to determine the visible triangles at the same time that primary ray intersections are determined. The CPU reads the data back in order to trace secondary rays.
Sabino et al. [Thaler Sabino, Paulo Andrade, Esteban Gonzales Clua, Anselmo Montenegro, Paulo Pagliosa, A Hybrid GPU Rasterized and Ray Traced Rendering Pipeline for Real Time Rendering of Per Pixel Effects, Univ. Federal Fluminense, Rio de Janeiro, Brazil, 2013] present a heuristic approach that selects a subset of relevant objects to be ray traced, avoiding traversing rays for objects that might not make a significant contribution to the real-time experience.
An important strategy in real-time hybrid ray tracing is the use of the GPU for raster techniques to improve performance, together with a smart strategy for prioritizing the regions and objects that will receive the ray tracing light effects. NVIDIA's OptiX [Parker, S. G., Bigler, J., Dietrich, A., Friedrich, H., Hoberock, J., Luebke, D., McAllister, D., McGuire, M., Morley, K., Robison, A., Stich, M.: Optix: A general purpose ray tracing engine. ACM Transactions on Graphics (Aug. 2010)] is a general-purpose ray tracing engine targeting both NVIDIA's GPUs and general-purpose hardware in the current version. The OptiX architecture offers a low-level ray tracing engine, a programmable ray tracing pipeline with a shader language based on CUDA C/C++, a domain-specific compiler and a scene-graph based representation. OptiX is a GPU-only solution with remarkably good results for interactive ray tracing. Recently, OptiX has been supported by NVIDIA RTX, a development platform for hybrid ray tracing on special-purpose hardware. RTX runs on NVIDIA Volta- and Turing-based GPUs, specifically utilizing an architecture for ray tracing acceleration.
Despite all the hybrid ray tracing developments, hybrid real-time ray tracers for low-power devices do not exist in the prior art. Their applicability to low-power devices, such as laptops, tablets and hand-held mobiles, is becoming more and more relevant. Real-time ray tracing on low-power devices is forecast to arrive only in the 2030s: “By Moore's law alone by 2032 we could be running real time ray tracing on mobile phones.” (Jon Peddie, “Peddie predicts we could have real time ray tracing on our PCs in less than 6 years”, TechWatch, 27 Mar. 2018.)
The hurdles of path tracing that prevent real-time performance in the prior art, whether in full-blown ray tracing or in hybrid ray tracing, are: traversal and frequent reconstruction of acceleration structures, lack of coherence of secondary rays, and noise that causes a “film-grain” appearance in images.
Real-time Ray Tracing (RTRT). Historically, ray tracing has been reserved for off-line applications, such as computer-generated photo-realistic animated films. Real-time applications such as video games and virtual and augmented reality have had to rely on rasterization for their rendering. RTRT is a computationally hard task, not only because each pixel in the image must be calculated separately, but also because the final color of a single pixel can be affected by more than one recursive ray. Another consideration is that ray tracing algorithms spend from 75% to 95% of their execution time calculating intersection points between rays and objects. RTRT was enabled by Nvidia's RTX in 2018 (Alwani, Rishi. “Microsoft and Nvidia Tech to Bring Photorealistic Games With Ray Tracing”. Gadgets 360. Retrieved Mar. 21, 2018), facilitating a new development in computer graphics: the generation by special-purpose hardware of interactive images that react to lighting, shadows and reflections. Nvidia's RTX is based on the traditional ray tracing algorithm accelerated by on-chip supercomputing hardware of close to 5000 cores. It comprises a GPU having 4352 cores, an AI denoiser utilizing 544 cores, and an intersection test accelerator of 68 cores. The power requirement of a single RTX2080 GPU is 250 W, and the price starts at €418. Due to its high cost and high power consumption, RTX is targeted at high-end video games.
However, there is a great need to enable real-time ray tracing on consumer-class devices (smartphones, tablets, laptops and PCs) for video games, VR and AR, democratizing ray tracing for a mass audience that includes power-limited devices. To this end, a novel method based on radical algorithmic improvements is needed.
Reflections. In prior art hybrid ray tracing, the reflections are generated based on a G-buffer (Luis Sabino et al., A Hybrid GPU Rasterized and Ray Traced Rendering Pipeline for Real Time Rendering of Per Pixel Effects, 2013). The G-buffer is generated during the first stage by raster rendering, a “deferred shading” stage. The basic idea behind deferred shading is to perform all visibility tests before performing any lighting computations. Therefore, visibility tests are first done by raster rendering, while shading is deferred to a later stage and combined with ray tracing. The G-buffer produced by the deferred shading stage contains information about the optical properties of the underlying material of each pixel. Its contents are used to determine the need for tracing reflection and refraction rays. It is composed of reflectivity, index of refraction, specular exponent and opacity. The rays need to be traced only from the surfaces through the scene. This avoids tracing unnecessary rays in places where the material is neither refractive nor reflective. After deferred shading is done, the ray tracing algorithm starts with secondary rays and can follow its own path. Any secondary ray generated is traced against the scene in order to produce global illumination effects, such as reflections and refractions. The result of this stage can be understood as the generation of a ray-traced effects layer. This effects layer is blended with the image already generated, in order to improve its visual quality with global illumination effects.
According to the G-buffer method, the secondary rays are a natural extension of primary rays. Ray tracing that is carried out with the chosen secondary rays suffers from the same difficulties as conventional ray tracing: lack of coherence of secondary rays and images with stochastic noise.
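The way G-buffer contents can gate the tracing of secondary rays may be sketched as follows. This is an illustrative sketch only: the four fields mirror the properties listed above, while the class name `GBufferTexel`, the function `secondary_rays_needed` and the threshold `eps` are hypothetical and not taken from the cited pipeline.

```python
from dataclasses import dataclass


@dataclass
class GBufferTexel:
    """Per-pixel material properties written by the raster (deferred shading) stage."""
    reflectivity: float
    index_of_refraction: float
    specular_exponent: float
    opacity: float


def secondary_rays_needed(texel, eps=1e-3):
    """Return the kinds of secondary rays worth tracing from this pixel.

    Assumption: a pixel needs a reflection ray if it reflects at all, and a
    refraction ray if it is not fully opaque; `eps` is an arbitrary cutoff.
    """
    kinds = []
    if texel.reflectivity > eps:
        kinds.append("reflection")
    if texel.opacity < 1.0 - eps:
        kinds.append("refraction")
    return kinds


if __name__ == "__main__":
    matte_wall = GBufferTexel(0.0, 1.0, 0.0, 1.0)
    glass = GBufferTexel(0.1, 1.5, 50.0, 0.2)
    print(secondary_rays_needed(matte_wall))  # no rays traced for this pixel
    print(secondary_rays_needed(glass))
```

Pixels for which the list is empty keep their rasterized color, so the ray tracing stage spends its budget only on reflective or refractive surfaces.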
Lack of ray coherence of secondary rays. Coherence of rays is the key to efficient parallelization of ray tracing. In prior art ray tracing, the primary and shadow rays are coherent. This coherence is exploited for efficient parallel processing: traversing, intersecting, and shading by packets of coherent rays. Packets work well for nearby primary rays, since these rays often traverse similar parts of the acceleration data structure. Using this approach, one can reduce the compute time by using SIMD instructions on multiple rays in parallel, reduce memory bandwidth by requesting data only once per packet, and increase cache utilization at the same time. This works well for primary rays that originate from the camera. Unfortunately, it is not possible to use ray packets effectively with rays of higher order (secondary, tertiary, etc.). The primary reason is that higher-order rays bounce in different directions, losing coherence. Moreover, in path tracing there is an intentional randomization of rays for diffuse reflections. Reorganizing secondary rays to form bundles with higher coherence ratios is practiced in the prior art. However, this kind of regrouping is quite an expensive operation, since it involves a scatter/gather step, and may therefore yield only a slight frame rate improvement when reordering is applied.
Sadeghi et al. [Iman Sadeghi, Bin Chen, and Henrik Wann Jensen, Coherent Path Tracing, University of California, San Diego, 2009] developed a technique for improving the coherency of secondary rays. This technique uses the same sequence of random numbers for generating secondary rays for all the pixels in each sample. This improves the efficiency of the packet tracing algorithm but creates structured noise patterns in the image.
Noisy images. A path tracer continuously samples pixels of the screen space. The image starts to become recognizable after only a few samples per pixel. Rays are distributed randomly within each pixel in screen space, and at each intersection with an object in the scene a new reflection ray, pointing in a random direction, is generated. After some number of bounces each ray eventually exits the scene or is absorbed. When a ray has finished bouncing about in the scene, a sample value is calculated based on the objects the ray bounced against. The sample value is added to the average for the source pixel.
The random components in ray tracing cause the rendered image to appear noisy. The noise decreases over time as more and more samples are calculated. The defining factor for render quality is the number of samples per pixel (SPP). The higher the SPP of a rendered image, the less noise will be noticeable. However, the added quality per sample decreases as more samples accumulate, since each sample merely contributes to an average over all samples. The difference in image quality between, for example, 20,000 SPP and 21,000 SPP will not be as noticeable as that between 1,000 SPP and 2,000 SPP.
The initial high screen resolution turns into a low spatial resolution that decreases quickly as the rays progress deeper into the space. Due to the low spatial resolution, each of the produced frames is noisy. Only the convergence of many subsequent frames reduces the noise of the final image. Converging to an acceptable noise level usually takes around 5000 samples for most path-traced images, and many more for pathological cases. Noise is particularly a problem for animations, giving them a normally unwanted “film-grain” quality of random speckling.
Accelerating structures. The most time-consuming tasks in ray tracing are intersection tests between millions of rays and millions of polygons. They are partly relieved by the use of acceleration structures (AS), which are huge binary trees. Every single ray is traversed across an acceleration structure (e.g. kd-trees or BVH trees), seeking polygons for intersection. These traversals become a major time-consuming task; they typically take over 70% of the image generation time. The polygons near the path of the traversing ray are subject to intersection tests.
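The regrouping step mentioned above can be illustrated with a deliberately simple scheme that bins rays by the octant of their direction. This is only a sketch of one possible gather step, not the method of any cited work; the function names and the representation of a ray as an (origin, direction) tuple are assumptions.

```python
from collections import defaultdict


def octant_key(direction):
    """Map a direction to one of 8 octants using the sign of each component."""
    dx, dy, dz = direction
    return int(dx < 0) + 2 * int(dy < 0) + 4 * int(dz < 0)


def regroup_by_octant(rays):
    """Gather rays into bundles whose directions share an octant.

    Each ray is an (origin, direction) tuple. Rays in one bundle point roughly
    the same way, so they are more likely to visit the same parts of the
    acceleration structure than an arbitrary mix of secondary rays.
    """
    bundles = defaultdict(list)
    for origin, direction in rays:
        bundles[octant_key(direction)].append((origin, direction))
    return dict(bundles)


if __name__ == "__main__":
    rays = [
        ((0, 0, 0), (1.0, 1.0, 1.0)),
        ((0, 0, 0), (-1.0, 1.0, 1.0)),
        ((1, 0, 0), (0.5, 0.2, 0.9)),
    ]
    for key, bundle in sorted(regroup_by_octant(rays).items()):
        print(key, len(bundle))
```

Even this cheap binning touches every ray once per bounce, which hints at why the gather cost can offset much of the gain from tracing more coherent bundles.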
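The per-pixel averaging described above can be sketched with a running mean, so that the displayed value is always the average of the samples received so far. The class name `PixelAccumulator` is hypothetical, and a single scalar stands in for an RGB color.

```python
class PixelAccumulator:
    """Keeps the running average of the sample values of one pixel."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def add(self, sample):
        # Incremental mean update: no need to store the individual samples.
        self.count += 1
        self.mean += (sample - self.mean) / self.count
        return self.mean


if __name__ == "__main__":
    pixel = PixelAccumulator()
    for value in (1.0, 0.0, 0.5):
        pixel.add(value)
    print(pixel.count, pixel.mean)
```

Because each new sample is weighted by 1/count, later samples move the average less and less, which matches the progressive refinement of a path-traced image.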
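The diminishing return per sample can be quantified under the usual Monte Carlo assumption of independent samples, for which the noise (the standard error of the pixel average) falls as one over the square root of the sample count. The helper names below are hypothetical.

```python
import math


def relative_noise(spp):
    """Noise relative to a single sample, assuming independent samples."""
    return 1.0 / math.sqrt(spp)


def noise_reduction(spp_before, spp_after):
    """Fraction by which noise drops when going from spp_before to spp_after."""
    return 1.0 - relative_noise(spp_after) / relative_noise(spp_before)


if __name__ == "__main__":
    # Adding 1,000 samples to 1,000 removes about 29% of the noise ...
    print(round(noise_reduction(1_000, 2_000), 3))
    # ... while adding 1,000 samples to 20,000 removes only about 2.4%.
    print(round(noise_reduction(20_000, 21_000), 3))
```

Under this assumption, halving the noise requires four times as many samples, which is why convergence to a clean image takes so long.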
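The traversal described above can be sketched for a small bounding volume hierarchy. This is a minimal illustration under stated assumptions: nodes hold axis-aligned boxes, the ray-box test is the common slab method, and the names `Node`, `ray_hits_box` and `candidate_primitives` are hypothetical. The exact ray-polygon intersection tests that would follow are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    lo: Vec3                      # minimum corner of the bounding box
    hi: Vec3                      # maximum corner of the bounding box
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    primitives: List[int] = field(default_factory=list)  # polygon ids (leaves only)


def ray_hits_box(origin, direction, lo, hi):
    """Slab test: does the ray (t >= 0) pass through the axis-aligned box?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, a, b in zip(origin, direction, lo, hi):
        if abs(d) < 1e-12:
            # Ray parallel to this slab: it must start between the two planes.
            if o < a or o > b:
                return False
            continue
        t0, t1 = (a - o) / d, (b - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def candidate_primitives(root, origin, direction):
    """Walk the tree and collect polygons whose leaf box the ray passes through."""
    stack, found = [root], []
    while stack:
        node = stack.pop()
        if not ray_hits_box(origin, direction, node.lo, node.hi):
            continue  # the whole subtree is skipped
        if node.left is None and node.right is None:
            found.extend(node.primitives)
        else:
            stack.extend(c for c in (node.left, node.right) if c is not None)
    return found


if __name__ == "__main__":
    left = Node((0, 0, 0), (1, 1, 1), primitives=[0, 1])
    right = Node((2, 0, 0), (3, 1, 1), primitives=[2])
    root = Node((0, 0, 0), (3, 1, 1), left, right)
    print(candidate_primitives(root, (0.5, 0.5, -1.0), (0.0, 0.0, 1.0)))
```

Every ray repeats this walk from the root, so the per-ray cost grows with tree depth; with millions of rays per frame, the traversal becomes a dominant share of the rendering time.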
The AS based on binary trees are basically static. They are constructed in a preprocessing step. Such a step takes typically much more time than rendering an image. The construction time depends on the scene size. Every major modification of the scene necessitates a reconstruction of the static acceleration structures. Moreover, the memory size is typically doubled by these structures.
There are two major drawbacks associated with the use of static acceleration structures: (i) traversals of these structures are time-consuming, challenging the real-time requirements, and (ii) they must be repeatedly reconstructed upon scene changes, which conflicts with real-time animation. Reconstructing a static acceleration structure is a computationally intensive task, preventing real-time animation.