Ray tracing systems can simulate the manner in which rays (e.g. rays of light) interact with a scene. For example, ray tracing techniques can be used in graphics rendering systems which are configured to produce images from 3-D scene descriptions. The images can be photorealistic, or achieve other objectives. For example, animated movies can be produced using 3-D rendering techniques. The description of a 3-D scene typically comprises data defining geometry in the scene. This geometry data is typically defined in terms of primitives, which are often triangular primitives, but can sometimes be other shapes such as other polygons, lines or points.
Ray tracing can mimic the natural interaction of light with objects in a scene, and sophisticated rendering features can naturally arise from ray tracing a 3-D scene. Ray tracing can be parallelized relatively easily on a pixel-by-pixel level because pixels are generally independent of each other. However, it is difficult to pipeline the processing involved in ray tracing because of the distributed and disparate positions and directions of travel of the rays in the 3-D scene, in situations such as ambient occlusion, reflections, caustics, and so on. Ray tracing allows for realistic images to be rendered but often requires high levels of processing power and large working memories, such that ray tracing can be difficult to implement for rendering images in real-time (e.g. for use with gaming applications), particularly on devices which may have tight constraints on silicon area, cost and power consumption, such as on mobile devices (e.g. smart phones, tablets, laptops, etc.).
At a very broad level, ray tracing involves: (i) identifying intersections between rays and geometry (e.g. primitives) in the scene, and (ii) performing some processing (e.g. by executing a shader program) in response to identifying an intersection to determine how the intersection contributes to the image being rendered. The execution of a shader program may cause further rays to be emitted into the scene. These further rays may be referred to as “secondary rays”, and may include occlusion rays for determining shadow effects, or reflection rays for determining reflections in the scene. Rays are traced from an origin and intersections of the rays with geometry in the scene can be determined. FIG. 1 shows an example of a scene 102 which includes two surfaces 1041 and 1042. This is a very simple example, and in other examples there would likely be many more surfaces and objects within the scene. FIG. 1 shows two light sources 1061 and 1062 which illuminate objects in the scene. The viewpoint from which the scene is viewed is shown at 108 and the view plane of the frame to be rendered is represented at 110.
Identifying intersections between rays and geometry in the scene involves a large processing workload. In a very naïve approach, every ray could be tested against every primitive in a scene and then when all of the intersection hits have been determined, the closest of the intersections could be identified for each ray. This approach is not feasible to implement for scenes which may have millions or billions of primitives, where the number of rays to be processed may also be millions. So, ray tracing systems typically use an acceleration structure which characterises the geometry in the scene in a manner which can reduce the workload of intersection testing. Acceleration structures can have a hierarchical structure such that there are multiple levels of nodes within an acceleration structure. The term “hierarchy” may be used herein to refer to a hierarchical acceleration structure. To give some examples, a hierarchical acceleration structure may have a grid structure, an octree structure, a space partitioning structure (e.g. a k-d tree), or a bounding volume structure.
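To make the cost of the naïve approach concrete, the following is a minimal sketch of it for a single ray and triangular primitives. The ray-triangle test shown is the well-known Möller-Trumbore method, which is chosen here purely for illustration and is not named in the text above; the function names ray_triangle and closest_hit_naive are likewise hypothetical.

```python
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Triangle = Tuple[Vec3, Vec3, Vec3]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle(origin: Vec3, direction: Vec3, tri: Triangle,
                 eps: float = 1e-9) -> Optional[float]:
    """Moller-Trumbore test: distance t along the ray to the hit, or None."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle's plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None  # outside the triangle (first barycentric coordinate)
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None  # outside the triangle (second barycentric coordinate)
    t = dot(e2, q) * inv
    return t if t > eps else None  # ignore hits behind the ray origin


def closest_hit_naive(origin: Vec3, direction: Vec3,
                      triangles: List[Triangle]) -> Optional[Tuple[int, float]]:
    """Test the ray against every primitive and keep the nearest hit."""
    best: Optional[Tuple[int, float]] = None
    for index, tri in enumerate(triangles):
        t = ray_triangle(origin, direction, tri)
        if t is not None and (best is None or t < best[1]):
            best = (index, t)
    return best
```

Every ray visits every primitive, so the work grows with the product of the number of rays and the number of primitives, which is the cost that an acceleration structure is intended to reduce.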
An octree structure (which is an example of a spatial subdivision structure) recursively subdivides 3-D space by halving a node in each of three spatial directions (e.g. along x, y and z axes) thereby subdividing a node into eight equal regions, which are represented as child nodes in the hierarchy. FIG. 2a represents a corresponding two-dimensional example (i.e. a quadtree) in which a node is halved in both x and y directions, depending on the complexity of the content (e.g. the number of primitives) within the nodes. FIG. 2a illustrates a scene 200 which includes three objects 202, 204 and 206. FIG. 2b represents the nodes of the hierarchical acceleration structure representing the regions shown in FIG. 2a. The acceleration structure shown in FIGS. 2a and 2b has a top level node 210 which covers the whole scene 200. The node 210 is subdivided into four quarters, represented by the nodes 2121 to 2124. The node 2121 represents the top left quarter of the node 210 and is not further subdivided. The node 2121 includes a reference to the object 204. The node 2122 represents the top right quarter of the node 210 and is not further subdivided. The node 2122 includes a reference to the object 202. The node 2124 represents the bottom right quarter of the node 210 and is empty and not further subdivided. The node 2123 represents the bottom left quarter of the node 210 which covers both of the objects 204 and 206. Node 2123 is subdivided into four quarters 2141 to 2144. The node 2141 represents the top left quarter of the node 2123 and is not further subdivided. The node 2141 includes references to the objects 204 and 206. The node 2142 represents the top right quarter of the node 2123 and is empty and not further subdivided. The node 2143 represents the bottom left quarter of the node 2123 and is not further subdivided. The node 2143 includes a reference to the object 206. The node 2144 represents the bottom right quarter of the node 2123 and is not further subdivided. The node 2144 includes a reference to the object 206.
The empty nodes (e.g. 2124 and 2142) can either be excluded entirely from the hierarchy or they can be included in the hierarchy but marked as “empty” so that no intersection testing is performed on the empty nodes. The encoding format determines which of these two options is more suitable. In both cases, conceptually, the empty nodes can be considered to be excluded because the traversal of the hierarchy during intersection testing will not include testing of the empty nodes.
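The quadtree subdivision and the two ways of handling empty nodes can be sketched as follows. This is an illustrative sketch only: it assumes that each object is summarised by a 2-D axis-aligned box, and the names QuadNode, build, max_refs, max_depth and keep_empty are hypothetical. The subdivision criterion used here (more than max_refs object references and a depth limit not yet reached) is one simple measure of "complexity of the content", and other criteria are possible.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


@dataclass
class QuadNode:
    region: Box
    objects: List[str] = field(default_factory=list)      # references held by a leaf
    children: List["QuadNode"] = field(default_factory=list)
    empty: bool = False                                    # marked so traversal can skip it


def overlaps(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share some area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def build(region: Box, objects: Dict[str, Box], max_refs: int = 1,
          max_depth: int = 2, keep_empty: bool = True, depth: int = 0) -> QuadNode:
    inside = sorted(name for name, box in objects.items() if overlaps(region, box))
    if not inside:
        return QuadNode(region, empty=True)
    if len(inside) <= max_refs or depth >= max_depth:
        return QuadNode(region, objects=inside)  # leaf: not further subdivided

    # Halve the region in both x and y to obtain four equal quarters.
    x0, y0, x1, y1 = region
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    quarters = [(x0, y0, mx, my), (mx, y0, x1, my),
                (x0, my, mx, y1), (mx, my, x1, y1)]
    node = QuadNode(region)
    contained = {name: objects[name] for name in inside}
    for quarter in quarters:
        child = build(quarter, contained, max_refs, max_depth, keep_empty, depth + 1)
        if child.empty and not keep_empty:
            continue  # option 1: exclude empty nodes from the hierarchy entirely
        node.children.append(child)  # option 2: keep them, marked as empty
    return node
```

With keep_empty=True a traversal would check the empty flag and skip the node without intersection testing; with keep_empty=False the node is simply absent. In both cases no intersection test is performed for the empty region.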
Spatial subdivision structures (e.g. the octree structure of FIGS. 2a and 2b) divide the space of a scene into regions and form nodes of a hierarchical acceleration structure to represent those regions of the scene. In contrast, a bounding volume structure has nodes corresponding to volumetric elements which are positioned based on the content of the scene. FIGS. 3a and 3b relate to a hierarchy having a bounding volume structure. FIG. 3a illustrates a scene 300 which includes three objects 302, 304 and 306. FIG. 3b shows nodes of a hierarchical acceleration structure wherein the root node 310 represents the whole scene 300. Regions in the scene shown in FIG. 3a have references matching those of the corresponding nodes in the hierarchy shown in FIG. 3b, but the references for the regions in FIG. 3a include an additional prime symbol (′). The objects in the scene are analysed in order to build the hierarchy, and two nodes 3121 and 3122 are defined descended from the node 310 which bound regions containing objects. In this example, the nodes in the bounding volume hierarchy represent axis-aligned bounding boxes (AABBs) but in other examples the nodes could represent regions which take other forms, e.g. spheres or other simple shapes. The node 3121 represents a box 3121′ which covers the objects 304 and 306. The node 3122 represents a box 3122′ which covers the object 302. The node 3121 is subdivided into two nodes 3141 and 3142 which represent AABBs (3141′ and 3142′) which respectively bound the objects 304 and 306. Methods for determining the AABBs for building nodes of a hierarchy are known in the art, and may be performed in a top-down manner (e.g. starting at the root node and working down the hierarchy), or may be performed in a bottom-up manner (e.g. starting at the leaf nodes and working up the hierarchy). In the example shown in FIGS. 3a and 3b, objects do not span more than one leaf node, but in other examples objects may span more than one leaf node.
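As an illustration of a top-down build of a bounding volume hierarchy of AABBs, the following sketch starts from the box bounding all objects and recursively splits the objects into two groups. The split rule used here (sort by box centre along the longest axis and split at the median) is an assumption made for brevity and is only one of many known heuristics; the names BVHNode, build_top_down and leaf_ids are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
AABB = Tuple[Vec3, Vec3]  # (minimum corner, maximum corner)


@dataclass
class BVHNode:
    box: AABB
    children: List["BVHNode"] = field(default_factory=list)
    object_id: Optional[str] = None  # set only on leaf nodes


def union(a: AABB, b: AABB) -> AABB:
    """Smallest AABB enclosing both input boxes."""
    lo = tuple(min(a[0][i], b[0][i]) for i in range(3))
    hi = tuple(max(a[1][i], b[1][i]) for i in range(3))
    return (lo, hi)


def build_top_down(items: List[Tuple[str, AABB]]) -> BVHNode:
    """Build a binary BVH from a non-empty list of (object id, AABB) pairs."""
    box = items[0][1]
    for _, other in items[1:]:
        box = union(box, other)
    if len(items) == 1:
        return BVHNode(box, object_id=items[0][0])

    # Assumed heuristic: split at the median along the longest axis.
    extents = [box[1][i] - box[0][i] for i in range(3)]
    axis = extents.index(max(extents))
    ordered = sorted(items, key=lambda it: it[1][0][axis] + it[1][1][axis])
    mid = len(ordered) // 2
    return BVHNode(box, [build_top_down(ordered[:mid]),
                         build_top_down(ordered[mid:])])


def leaf_ids(node: BVHNode) -> List[str]:
    """Collect the object ids referenced by the leaves under a node."""
    if node.object_id is not None:
        return [node.object_id]
    return [oid for child in node.children for oid in leaf_ids(child)]
```

Unlike the spatial subdivision case, the boxes here are positioned by the content: each node's box is computed from the objects beneath it, so sibling boxes may overlap and need not tile the parent. The grouping produced by this particular heuristic will not necessarily match the grouping drawn in FIGS. 3a and 3b.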
When traversing a hierarchical acceleration structure for intersection testing of a ray in a scene, the ray is initially tested against the root node. If an intersection is found between the ray and a node then the ray may be tested for intersection with one or more nodes which are children of the intersected node. There are a number of different traversal techniques which can be used to traverse a hierarchical acceleration structure, such as a depth-first traversal technique and a breadth-first traversal technique. In a depth-first traversal technique a subset of the children of an intersected node (e.g. a single child of the intersected node) may be tested for intersection, and the descendants of any intersected child explored, before optionally testing other children of the intersected node for intersection, depending on the results of the previous intersection testing. In contrast, according to a breadth-first traversal technique, if an intersection is found between a ray and a node then the ray may be tested for intersection with all of the nodes which are children of the intersected node prior to performing the intersection testing for any of the descendants of those children.
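The difference between the two traversal orders can be sketched with a single loop that takes pending nodes either from the back of a list (depth-first) or from the front (breadth-first). This is a simplified sketch under stated assumptions: nodes are AABBs, the ray-box test is the common "slab" method (not named in the text above), and the names Node, ray_hits_box and traverse are hypothetical. It does not model the early termination that a depth-first traversal may apply once a close hit has been found.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
AABB = Tuple[Vec3, Vec3]  # (minimum corner, maximum corner)


@dataclass
class Node:
    name: str
    box: AABB
    children: List["Node"] = field(default_factory=list)


def ray_hits_box(origin: Vec3, direction: Vec3, box: AABB) -> bool:
    """Slab test: intersect the ray's parameter interval with each axis slab."""
    t_near, t_far = 0.0, float("inf")
    for i in range(3):
        lo, hi = box[0][i], box[1][i]
        if direction[i] == 0.0:
            if origin[i] < lo or origin[i] > hi:
                return False  # parallel to this slab and outside it
            continue
        t0 = (lo - origin[i]) / direction[i]
        t1 = (hi - origin[i]) / direction[i]
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def traverse(root: Node, origin: Vec3, direction: Vec3,
             breadth_first: bool) -> Tuple[List[str], List[str]]:
    """Return (names of nodes tested in order, names of intersected leaves)."""
    pending = deque([root])  # the ray is initially tested against the root
    tested: List[str] = []
    hit_leaves: List[str] = []
    while pending:
        node = pending.popleft() if breadth_first else pending.pop()
        tested.append(node.name)
        if not ray_hits_box(origin, direction, node.box):
            continue  # a miss prunes the whole subtree below this node
        if not node.children:
            hit_leaves.append(node.name)
        elif breadth_first:
            pending.extend(node.children)            # queue: siblings first
        else:
            pending.extend(reversed(node.children))  # stack: first child next
    return tested, hit_leaves
```

For a two-level hierarchy in which the ray intersects every node, the depth-first order finishes the first child's subtree before touching the second child, whereas the breadth-first order tests both children before any grandchild.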
With current state-of-the-art acceleration structures it is difficult to perform the processes involved in ray tracing (e.g. build an acceleration structure, perform intersection testing and execute shader programs in dependence on the results of the intersection testing) at a rate that is suitable for rendering images in real-time (e.g. for use with gaming applications), particularly on devices which have tight constraints on silicon area, cost and power consumption, such as on mobile devices (e.g. smart phones, tablets, laptops, etc.). Furthermore, the acceleration structures need to be stored, e.g. in a memory, and this can involve storing a large volume of data, which increases the memory requirements of the system and also means that large amounts of data are transferred between a memory and a chip on which a ray tracing system is implemented. The transfer of large amounts of data (i.e. a high memory bandwidth) typically corresponds to high latency and power consumption of the ray tracing system.