To hasten the generation and display of increasingly complex computer-generated imagery, conventional graphics processing techniques include recursively rendering and combining previously generated images, whereby a single, highly detailed graphic image is formed. An algorithm implementing such a technique is referred to as a multiple pass (“multipass”) algorithm. To illustrate, consider a graphical processor unit (“GPU”) executing instructions of a video game application, those instructions including a multipass algorithm. In this example, the multipass algorithm renders and then stores an image of a computer-generated scene. Next, the multipass algorithm uses the stored scene as an input to render the scene in combination with another graphical image, such as with one or more characters. Thereafter, the image of the scene with the characters is available as an input for further rendering, where each additional pass adds other like graphical images, such as weaponry, special effects (e.g., muzzle flashes), etc., to the scene.
FIG. 1 is a block diagram of a traditional system for generating graphical images whereby a render target is used as both as a repository for finally generated images as well as a source of images used as texture for further rendering, for example, in subsequent passes of multipass rendering. System 100 includes a GPU 102 containing a shader 104, which operates to alter properties (e.g., lighting, transparency, color, texture, etc.), position, and orientation for surfaces of rendered objects. Shader 104 is typically a vertex shader, a pixel shader, or the like, and comprises any number of pixel pipelines 108. As shown in FIG. 1, shader 104 includes four pixel pipelines 108 for processing pixel data. To process the pixel data, shader 104 receives one or more textures 106, such as textures 106a, 106b, and 106n, for incorporating texture data into the pixel data. Filter 110 (e.g., an anisotropic filter), if employed, filters textures 106 to improve image quality when rendering three-dimensional (“3-D”) scenes. Textures 106 are static texture maps for application onto surfaces of 3D graphical objects, examples of which include surface appearances of walls, floors, ceilings, doors, and other structures where the textures do not change or otherwise animate.
Each of pixel pipelines 108 continues from shader 104 and extends to a render target 122 residing in graphics memory 120, which can be implemented as a frame buffer. Conventionally, render target 122 is an intermediary storage that is accessible as both as a target and a source of image data. That is, it is a target to which image data is written so computer-images can be displayed, and it is a source for providing a texture as input back into shader 104. By recursively writing to render target 122 and reading a texture from that render target, multiple passes can integrate complex visual effects into images of previous rendering passes.
But there are several drawbacks to the approach of using render target 122 as texture as input to further render graphical images. For example, synchronicity of multiple writes 130 to and reads 132 from render target 122 is computationally expensive, among other things, when managing those writes 130 and reads 132 in parallel, or during any overlapping interval of time. Since render target 122 is a shared resource (i.e., memory), writes 130 and reads 132 with respect to each pixel stored in render target 122 must be managed at a fine-grained level. That is, every memory location storing pixel data from each pipeline 108 is managed to prevent overlapping write and read operations from interfering with each other and corrupting the pixel data. Without properly ordering these operations, conflicting write and read operations would result in incorrect pixel data. And if GPU 102 implements multiple threads or an increased number of shaders 104, the amount of computations and/or hardware to synchronize the increased numbers of writes 130 and reads 132 becomes expensive. Further to this approach, latency is introduced into the multipass rendering of graphical images, especially when system 100 performs synchronization at fine-grained levels, such as when memory locations are “locked-out” (i.e., blocked against programming or otherwise altering). While any access to or from the render target is prohibited or locked-out, one or more pixel pipelines stall until such access is granted. This delays graphical image generation and thus hinders performance. These delays are relatively long because pixel pipelines 108 between render target 122 and 108 include numerous intermediary graphics subprocesses, such as depth testing, compositing, blending, etc.
In view of the foregoing, it would be desirable to provide an apparatus and a method for efficiently employing a render target as a texture. Ideally, an exemplary method would minimize or eliminate at least the above-described drawbacks.