Computer graphics processing is an intricate process used to create images that depict virtual content for presentation on a display. Modern 3D graphics are often processed using highly capable graphics processing units (GPUs) having specialized architectures designed to be efficient at manipulating computer graphics. The GPU is a specialized electronic circuit designed to accelerate the creation of images in a frame buffer intended for output to a display, and GPUs often have a highly parallel processing architecture that makes the GPU more effective than a general-purpose CPU for algorithms where processing of large blocks of data is done in parallel. GPUs are used in a variety of computing systems, such as embedded systems, mobile phones, personal computers, tablet computers, portable game devices, workstations, and game consoles.
Many modern computer graphics processes for video games and other real-time applications utilize a rendering pipeline that includes many different stages to perform operations on input data that determine the final array of pixel values that will be presented on the display. In some implementations of a graphics rendering pipeline, processing may be coordinated between a central processing unit (CPU) and a GPU. Input data may be set up and drawing commands may be issued by the CPU based on the current state of an application (e.g., a video game run by the CPU) through a series of draw calls issued to the GPU through an application programming interface (API). Such draw calls may occur many times per graphics frame, and the GPU may implement various stages of the pipeline in response in order to render the images accordingly.
Most stages of the pipeline have well-defined inputs and outputs as data flows through the various processing stages, and any particular implementation may include or omit various stages depending on the desired visual effects. Sometimes various fixed-function operations within the graphics pipeline are implemented as hardware modules within the GPU, while programmable shaders typically perform the majority of shading computations that determine color, lighting, texture coordinates, and other visual values associated with the objects and pixels in the image, although it is possible to implement various stages of the pipeline in hardware, software, or a combination thereof. Older GPUs used a predominantly fixed-function pipeline with computations fixed into individual hardware modules of the GPUs, but the emergence of shaders and an increasingly programmable pipeline has caused more operations to be implemented by software programs, providing developers with more flexibility and greater control over the rendering process.
Generally speaking, early stages in the pipeline include computations that are performed on geometry in virtual space (sometimes referred to herein as “scene space”), which may be a representation of a two-dimensional or, far more commonly, a three-dimensional virtual world. The objects in the virtual space are typically represented as a polygon mesh that is set up as input to the early stages of the pipeline and whose vertices correspond to the set of primitives in the image, which are typically triangles but may also include points, lines, and other polygonal shapes. The vertices of each primitive may be defined by a set of parameter values, including position values (e.g., X-Y coordinate and Z-depth values), color values, lighting values, texture coordinates, and the like, and the graphics may be processed in the early stages through manipulation of the parameter values of the vertices on a per-vertex basis. Operations in the early stages may include vertex shading computations to manipulate the parameters of the vertices in virtual space and, optionally, tessellation to subdivide scene geometries and geometry shading computations to generate new scene geometries beyond those initially set up in the application stage. Some of these operations may be performed by programmable shaders, including vertex shaders, which manipulate the parameter values of the vertices of the primitive on a per-vertex basis in order to perform rendering computations on the underlying virtual space geometry.
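The per-vertex manipulation described above can be illustrated with a minimal sketch. The `Vertex` record, the `vertex_shader` function, and the uniform scale-and-translate transform are all hypothetical names and choices made for illustration; an actual vertex shader would typically apply matrix transforms and run on the GPU.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Vertex:
    """Hypothetical per-vertex parameter set (names are illustrative)."""
    position: tuple  # (x, y, z) in scene space
    color: tuple     # (r, g, b)
    uv: tuple        # texture coordinates


def vertex_shader(vertex, scale, offset):
    """Return a copy of the vertex with its position scaled, then translated.

    Only the position changes; color and texture coordinates pass through.
    """
    x, y, z = vertex.position
    ox, oy, oz = offset
    return replace(
        vertex,
        position=(x * scale + ox, y * scale + oy, z * scale + oz),
    )


def shade_vertices(vertices, scale, offset):
    """Run the stand-in vertex shader once per vertex."""
    return [vertex_shader(v, scale, offset) for v in vertices]


# One triangle primitive defined by three vertices.
triangle = [
    Vertex((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 0.0)),
    Vertex((1.0, 0.0, 1.0), (0.0, 1.0, 0.0), (1.0, 0.0)),
    Vertex((0.0, 1.0, 1.0), (0.0, 0.0, 1.0), (0.0, 1.0)),
]
shaded = shade_vertices(triangle, scale=2.0, offset=(1.0, 1.0, 0.0))
```

The shader is invoked once per vertex and has no knowledge of the other vertices of the primitive, which is what makes this stage straightforward to parallelize.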
To generate images of the virtual world suitable for a display, the objects in the scene and their corresponding primitives are converted from virtual space to screen space. Intermediate stages may include various operations to determine the mapping of primitives to a two-dimensional plane defining the screen space. Rasterization processes are used to sample the processed primitives from the early stages at discrete pixels in screen space defined for the rasterizer, as well as to generate fragments for primitives that are covered by samples of the rasterizer. These intermediate operations associated with the rasterization of the scene to screen space may also include operations such as clipping primitives outside the viewing frustum of the current view and culling back-facing primitives hidden from the current view. Such operations serve as an optimization that avoids processing fragments that would result in unnecessary per-pixel computations for primitives that are occluded or otherwise invisible in the final image. The parameter values used as input values for each fragment are typically determined by interpolating the parameters of the vertices of the sampled primitive that created the fragment to a location of the fragment's corresponding pixel, which is typically the center of the pixel or a different sample location within the pixel, although other interpolation locations may be used in certain situations.
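As a rough illustration of sampling a primitive at pixel centers and interpolating vertex parameters, the following sketch uses edge functions and barycentric weights. The function names, the counter-clockwise front-face convention, and the inclusive edge test are assumptions made for brevity; hardware rasterizers use more careful fill rules and perspective-correct interpolation.

```python
def edge(a, b, p):
    """Signed area term: positive when p lies to the left of the edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(tri, values, width, height):
    """Return a list of (px, py, value) for pixels whose center is covered.

    tri: three (x, y) screen-space positions.
    values: one scalar parameter per vertex, interpolated to each pixel center.
    """
    a, b, c = tri
    area = edge(a, b, c)
    if area <= 0:
        # Back-facing or degenerate under the assumed counter-clockwise
        # convention: cull before generating any fragments.
        return []
    fragments = []
    for py in range(height):
        for px in range(width):
            p = (px + 0.5, py + 0.5)  # sample at the pixel center
            w0 = edge(b, c, p)  # weight of vertex a
            w1 = edge(c, a, p)  # weight of vertex b
            w2 = edge(a, b, p)  # weight of vertex c
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                value = (w0 * values[0] + w1 * values[1] + w2 * values[2]) / area
                fragments.append((px, py, value))
    return fragments


tri = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
fragments = rasterize(tri, values=(0.0, 4.0, 0.0), width=4, height=4)
culled = rasterize((tri[0], tri[2], tri[1]), (0.0, 0.0, 4.0), 4, 4)
```

In this example 10 of the 16 pixel centers are covered, and reversing the winding order of the same triangle yields no fragments because the primitive is culled.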
The pipeline may then pass the fragments and their interpolated input parameter values down the pipeline for further processing. During these later stages, per-fragment operations may be performed by invoking a pixel shader (sometimes known as a “fragment shader”) to further manipulate the interpolated input parameter values, e.g., color values, depth values, lighting, texture coordinates, and the like for each of the fragments, on a per-pixel or per-sample basis. Each fragment's coordinates in screen space correspond to the pixel coordinates and/or sample coordinates defined in the rasterization that generated them.
In the simplest case, a single sample is used per pixel corresponding to the pixel center, and a single fragment is processed for the primitive covering the pixel center. If that fragment passes a depth test, e.g., it is not occluded by another primitive at the same screen space location, then the output color values of the fragment computed by the pixel shader are written to a color buffer for those pixel coordinates, and possibly output depth values are written to a depth buffer if the pixel shader is programmed to export the depth value.
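The single-sample case can be sketched as follows. The helper names and the "smaller depth is closer" comparison are assumptions for illustration, and this sketch always stores the fragment's depth on a pass rather than modeling optional depth export by the pixel shader.

```python
def make_buffers(width, height):
    """Create a color buffer cleared to black and a depth buffer cleared to 'far'."""
    color_buffer = [[(0, 0, 0)] * width for _ in range(height)]
    depth_buffer = [[float("inf")] * width for _ in range(height)]
    return color_buffer, depth_buffer


def write_fragment(color_buffer, depth_buffer, x, y, depth, color):
    """Apply a less-than depth test; on a pass, store color and depth.

    Returns True if the fragment passed the test and was written.
    """
    if depth < depth_buffer[y][x]:
        depth_buffer[y][x] = depth
        color_buffer[y][x] = color
        return True
    return False


color_buffer, depth_buffer = make_buffers(2, 2)
# A near red fragment is written first, so a farther blue fragment at the
# same pixel fails the depth test and leaves the buffers unchanged.
near_written = write_fragment(color_buffer, depth_buffer, 0, 0, 0.3, (255, 0, 0))
far_written = write_fragment(color_buffer, depth_buffer, 0, 0, 0.8, (0, 0, 255))
```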
Sometimes, multiple sub-pixel samples are used for anti-aliasing, which may reduce the appearance of high frequency artifacts in sampled textures, as well as smooth jagged edges at primitive boundaries by allowing a given pixel in the color buffer to adopt a blend of output color values from different fragments computed from different primitives covering the different sub-pixel samples. Where multiple samples are used, each fragment's output may be applied to one or more sub-pixel samples covered by the primitive that generated it.
If conventional supersampling is used, a unique fragment is processed by the pixel shader for each sub-pixel sample, and its output is written to a color buffer at the sample coordinates, essentially treating the sample like a mini-pixel and rendering to a higher resolution. The higher-resolution color buffer may then be downsampled to the display resolution in the display buffer. Since a unique fragment needs to be processed by the pixel shader for each covered sample, the process is computationally demanding and significant shader overhead is introduced.
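The downsampling step can be illustrated with a simple box filter over a grid of scalar sample values. The `downsample` function and the choice of a box filter are assumptions for illustration; other resolve filters may be used in practice.

```python
def downsample(samples, factor):
    """Box-filter a high-resolution grid down by `factor` in each dimension.

    samples: list of rows of scalar values, one per sub-pixel sample.
    Each output pixel is the average of its factor x factor block of samples.
    """
    height = len(samples) // factor
    width = len(samples[0]) // factor
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0.0
            for sy in range(factor):
                for sx in range(factor):
                    total += samples[y * factor + sy][x * factor + sx]
            row.append(total / (factor * factor))
        out.append(row)
    return out


# A 4x4 sample grid (2x2 samples per pixel) containing a diagonal edge.
high_res = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
display = downsample(high_res, 2)
```

Pixels that straddle the edge receive an intermediate value, which is the smoothing effect described above, at the cost of one shaded value per sample in the high-resolution grid.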
Conventional multisampling mitigates the drawbacks of supersampling somewhat by processing a single fragment with a pixel shader and applying its values to multiple covered samples in the color buffer. The simplest multisampling utilizes each sample for both color and depth, calculates and writes depth per sample as in supersampling, and replicates a single output color per pixel to all covered samples in each pixel. New multisampling techniques, such as coverage sampling anti-aliasing (CSAA) and enhanced quality anti-aliasing (EQAA), have recently arisen that decouple some of the color samples from the depth samples in order to more accurately sample coverage of primitive edges within a rasterizer pixel's boundaries without the additional overhead that would be incurred by adding additional depth samples. With these multisampling techniques, there are typically more color samples than depth samples in the pixel (i.e., some samples are used only for color), and a fragment is processed by the pixel shader for a primitive anytime at least one sample in a pixel is covered, and the fragment's output color values may be applied to each covered sample in the color buffer.
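The difference in shader workload between supersampling and the simplest form of multisampling can be sketched for a single pixel. The function names and the trivial stand-in `pixel_shader` are hypothetical, and colors are reduced to single scalar values for brevity.

```python
def pixel_shader(base_color):
    """Stand-in for an expensive per-fragment shading computation."""
    return base_color


def supersample_pixel(covered, base_color, background):
    """Shade once per covered sample, then average the samples."""
    invocations = 0
    samples = []
    for is_covered in covered:
        if is_covered:
            samples.append(pixel_shader(base_color))
            invocations += 1
        else:
            samples.append(background)
    return sum(samples) / len(samples), invocations


def multisample_pixel(covered, base_color, background):
    """Shade at most once per pixel and replicate to every covered sample."""
    invocations = 0
    shaded = background
    if any(covered):
        shaded = pixel_shader(base_color)
        invocations = 1
    samples = [shaded if is_covered else background for is_covered in covered]
    return sum(samples) / len(samples), invocations


coverage = [True, True, True, False]  # primitive covers 3 of 4 samples
ss_color, ss_calls = supersample_pixel(coverage, 1.0, 0.0)
ms_color, ms_calls = multisample_pixel(coverage, 1.0, 0.0)
```

Both approaches resolve this pixel to the same blended value, but the supersampled path invokes the shader three times while the multisampled path invokes it once. The results coincide here only because the stand-in shader returns the same color at every sample location.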
Some new multisampling techniques also allow color samples to be decoupled from depth samples, such that more accurate depth information can be generated without increasing the size of color buffer data. However, these techniques consider even those samples which have only depth information to be shaded samples, and so invoke the pixel shader for any fragment in which any sample is covered, even when no color samples are covered and the output color will be discarded. Unfortunately, pixel shader calculations are computationally expensive and introduce wasted computational overhead anytime the fragment's output values do not contribute to the final display pixel values in the rendered image. In video games and other instances of real-time graphics processing, reducing computational requirements and improving computational efficiency for rendering tasks are critical objectives for achieving improved quality and detail in rendered graphics. Moreover, with the recent advent of ultra-high definition (“ultra HD” or “4k”) displays having horizontal resolutions on the order of 4000 pixels, there is a need for more efficient graphics processing methods that can keep up with advances in display technologies.
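The wasted work described above can be made concrete by counting, for a single pixel, how many shader invocations produce output that has nowhere to be written. The function name, the per-sample `has_color` flags, and the coverage masks are all hypothetical constructs for illustration and do not correspond to any particular hardware layout.

```python
def classify_invocations(has_color, coverage_masks):
    """Count shader invocations and those whose color output is discarded.

    has_color: per-sample flags, True where the sample stores color and
        False where it stores depth only.
    coverage_masks: one per-sample coverage list for each candidate fragment.
    Models the behavior in which the shader runs whenever any sample is covered.
    """
    invoked = 0
    wasted = 0
    for mask in coverage_masks:
        if not any(mask):
            continue  # no sample covered, so no fragment is generated
        invoked += 1
        covers_color = any(c and h for c, h in zip(mask, has_color))
        if not covers_color:
            wasted += 1  # shaded, but no color sample receives the result
    return invoked, wasted


# One color sample and three depth-only samples in the pixel.
has_color = [True, False, False, False]
masks = [
    [True, True, False, False],    # covers the color sample: useful
    [False, False, True, True],    # covers only depth samples: wasted
    [False, False, False, False],  # covers nothing: no fragment
]
invoked, wasted = classify_invocations(has_color, masks)
```

In this example two fragments are shaded and one of them contributes nothing to the color buffer.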
It is within this context that aspects of the present disclosure arise.