Recent advances in computer performance have enabled graphic systems to provide more realistic graphical images using personal computers and home video game computers. In such graphic systems, a number of procedures are executed to “render” or draw graphic primitives to the screen of the system. A “graphic primitive” is a basic component of a graphic picture, such as a vertex, polygon, or the like. All graphic pictures are formed with combinations of these graphic primitives. Many procedures may be utilized to perform graphic primitive rendering.
Specialized graphics processing units (GPUs) have been developed to optimize the computations required in executing the graphics rendering procedures. GPUs are configured for high-speed operation and typically incorporate one or more rendering pipelines. Generally, a GPU's rendering pipeline comprises a number of hardware-based functional units that are optimized for high-speed execution of graphics instructions/data, where instructions are fed into the front end of the pipeline and the computed results emerge at the back end of the pipeline.
Graphics processing is typically performed using graphics application program interfaces (APIs) that provide a standard software interface that can be run on multiple platforms, operating systems, and hardware. Examples of graphics APIs include the Open Graphics Library (OpenGL®) and D3D™. In general, such graphics APIs include a predetermined, standardized set of commands that are executed by associated graphics pipeline hardware. For example, in a computer system that supports the OpenGL® standard, the operating system and application software programs can make calls according to that standard without knowing any of the specifics regarding the system hardware. Application writers can use graphics APIs to design the visual aspects of their applications without concern as to how their commands will be implemented.
Graphics APIs are particularly beneficial when they are supported by dedicated graphics hardware. To improve graphics processing performance and overall graphics rendering speed, it is desirable that a large percentage of the graphics processing work be performed by the hardware of a graphics pipeline as opposed to software. For high performance, graphics processing should be executed in hardware, wherein large portions of the processing work are executed on a per clock basis. In comparison, software can take hundreds of clock cycles to perform some graphics processing operations. Modern GPUs, for instance, are designed and configured to rapidly and accurately process graphics commands with little impact on other computer system resources.
Problems exist, however, in those cases where graphics commands of the graphics API do not map efficiently to the functions and capabilities of a given GPU architecture. For example, OpenGL® includes a high-level command that instructs the graphics hardware to render a circle (e.g., draw a filled circle at some location on screen, having some color/texture, etc.) to represent an antialiased point. Conventionally, the GPU's driver (e.g., software routines which interface with the hardware functionality of the GPU) has to perform a number of time-consuming tasks in order to draw the specified circle.
In one prior art method for rendering an API requested circle, the specified circle is approximated with geometric primitives (e.g., polygons). This method is problematic because polygon approximation adds a considerable number of geometric primitives (e.g., triangles) to the graphics data stream, which causes a considerable amount of additional work. In another prior art method, as opposed to using a plurality of polygons to model a circle, a square (e.g., quadrilateral, two triangles, etc.) is defined and a texture is mapped onto the square. The texture is the image of a circle. For example, the texture mapped area within the circle is opaque and the area outside the circle is transparent. This solution is problematic because pixels within the area that is outside the circle (e.g., in the corners of the square) must still be rasterized and shaded. This causes a significant amount of wasted work and overhead. Additionally, this solution consumes a certain amount of texture memory to store the picture of the circle. With both methods, the resulting circle should be anti-aliased in order to preserve the quality of the rendered image.
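The costs of the two prior art methods described above can be illustrated with a minimal Python sketch. It is not taken from any driver or GPU implementation; the function names (`fan_triangles`, `polygon_area_ratio`, `wasted_pixels`), the triangle-fan layout, and the pixel-center coverage test are all assumptions chosen for illustration. The first two functions show how the triangle count grows as a polygon approximation is refined, and the third counts the pixels of a circle-textured square that fall outside the circle yet would still be rasterized and shaded.

```python
import math


def fan_triangles(cx, cy, radius, segments):
    """Approximate a filled circle with a fan of `segments` triangles.

    Hypothetical helper: each triangle joins the center to two
    neighboring points on the circle's rim.
    """
    if segments < 3:
        raise ValueError("need at least 3 segments")
    rim = [
        (cx + radius * math.cos(2 * math.pi * i / segments),
         cy + radius * math.sin(2 * math.pi * i / segments))
        for i in range(segments)
    ]
    center = (cx, cy)
    return [(center, rim[i], rim[(i + 1) % segments])
            for i in range(segments)]


def polygon_area_ratio(segments):
    """Area of the inscribed regular polygon divided by the circle's area."""
    return segments * math.sin(2 * math.pi / segments) / (2 * math.pi)


def wasted_pixels(diameter):
    """Count pixels of a diameter x diameter square lying outside the circle.

    A pixel is treated as outside when its center is farther from the
    square's center than the radius (an assumed, simplified coverage rule).
    Returns (pixels_outside, total_pixels).
    """
    r = diameter / 2.0
    outside = 0
    for y in range(diameter):
        for x in range(diameter):
            dx = x + 0.5 - r
            dy = y + 0.5 - r
            if dx * dx + dy * dy > r * r:
                outside += 1
    return outside, diameter * diameter


if __name__ == "__main__":
    for n in (8, 16, 64):
        tris = fan_triangles(0.0, 0.0, 1.0, n)
        print(n, "triangles cover", polygon_area_ratio(n), "of the circle")
    outside, total = wasted_pixels(64)
    print("textured square:", outside, "of", total, "pixels lie outside the circle")
```

Under these assumptions, a closer polygon approximation requires proportionally more triangles in the data stream, while the textured square leaves roughly 1 - π/4 (a little over one fifth) of its pixels outside the circle regardless of size.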
Another problem with both of the above conventional methods is that interpreting the API request for the circle and translating this request into graphics commands for the GPU causes an excessive amount of software branching. The excessive amount of software interpretation and branching tends to bog down the GPU pipeline until the constituent software can be executed. For example, in a typical application, a large number of graphics instructions execute rapidly (e.g., on a per clock basis), so a graphics data stream can be efficiently processed by the GPU. The GPU moves through the graphics data stream on a per clock basis until the OpenGL® circle request is encountered, whereupon an exception is caused and the software for handling the specified circle is invoked. There are a large number of conditions and parameters the software must set up. The setup process consumes multiple cycles and imposes an excessive amount of software execution overhead on the driver. The exception can thus bog down the GPU pipeline for hundreds of clock cycles or more. Thus, what is needed is a more efficient way to render circles requested by a graphics API.