The present invention pertains to the field of computer processing. More specifically, the present invention pertains to instructions utilized by integrated circuits for processing of data, such as three-dimensional graphics geometry processing.
Computer-generated graphics design generally consists of instructions implemented via a graphics program on a computer system. The instructions are recognized by the computer system's processor and so direct the processor to perform the specific calculations and operations needed to produce three-dimensional displays. The set of instructions recognized by the processor constitutes the instruction set of that processor.
Computer-generated graphics design can be envisioned as a pipeline through which data pass, where the data are used to define the image to be produced and displayed. At various points along the pipeline, various calculations and operations are specified by the graphics designer, and the data are modified accordingly.
In the initial stages of the pipeline, the desired image is framed using geometric shapes such as lines and polygons, referred to in the art as "primitives" or "graphics primitives." The derivation of the vertices for an image and the manipulation of the vertices to provide animation entail performing numerous geometric calculations in order to project the three-dimensional world being designed to a position in the two-dimensional world of the display screen.
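As a rough illustration of the kind of geometric calculation described above, the following C sketch projects a three-dimensional point onto a two-dimensional image plane. The type names, the function `project_point`, and the simple pinhole model with image-plane distance `d` are assumptions chosen for illustration; they are not taken from the present description.

```c
/* Hypothetical types for illustration only. */
typedef struct { float x, y, z; } Vec3;
typedef struct { float x, y; } Vec2;

/*
 * Project a point given in viewer (camera) coordinates onto an image
 * plane located at distance d in front of the viewer.
 * Assumption: p.z > 0, i.e. the point lies in front of the viewer.
 */
Vec2 project_point(Vec3 p, float d)
{
    Vec2 s;
    s.x = d * p.x / p.z;   /* similar triangles: x' / d = x / z */
    s.y = d * p.y / p.z;   /* same relation for the y coordinate */
    return s;
}
```

Each vertex costs two multiplications and two divisions in this simplified model, and the calculation is repeated for every vertex of every primitive.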
Primitives are then assembled into "fragments," and these fragments are assigned attributes such as color, perspective, and texture. In order to enhance the quality of the image, effects such as lighting, fog, and shading are added, and anti-aliasing and blending functions are used to give the image a smoother and more realistic appearance. In the final stage, the fragments and their associated attributes are combined and stored in the framebuffer as pixels. The pixel values are read from the framebuffer and used to draw images on the computer screen.
The processes pertaining to assigning colors, depth, texturing, lighting, etc. (i.e., creating images) are collectively known as rendering. The specific process of determining pixel values from input geometric primitives is known as rasterization.
The graphics design process is implemented in the prior art utilizing a computer system architecture that includes a geometry engine and a rasterization engine that are coupled in series to form the graphics pipeline through which the data pass. The geometry engine is a processor for executing the initial stages of the graphics design process described above. The rasterization engine is a separate processor for executing the processes above collectively identified as rasterization. Because the geometry engine precedes the rasterization engine in the graphics pipeline, the rate at which the rasterization engine can process data is limited by the rate at which the geometry engine can perform its calculations and forward the results to the rasterization engine. Thus, it is desirable to have a geometry engine capable of performing calculations at speeds that match the speed of the rasterization engine so that the geometry engine does not become a bottleneck in the graphics pipeline.
However, a problem with the prior art is that state-of-the-art rasterization engines are faster than comparable geometry engines, and so the geometry engine has become a limiting component in the graphics pipeline. Consequently, the speed at which the graphics process can be executed is slower than what could be achieved with an improved geometry engine, thus limiting the complexity of scenes which can be rendered.
One prior art solution to the above problem entails designing and implementing complex hardware dedicated to geometry calculations for computer-generated graphics, i.e., dedicated geometry engine hardware such as a dedicated processor. A problem with this prior art solution is that such dedicated hardware can be expensive. Another problem with this solution is that the dedicated hardware can typically only be used on those computer systems specifically designed for that hardware. Moreover, such specialized, dedicated hardware in the form of a dedicated processor typically utilizes an instruction set for which no compilers are available. Hence, all programming must often be done at the assembly or machine-language level. Such low-level languages are machine-dependent and therefore require knowledge of the specific processor. As such, dedicated processors offer somewhat narrow and cumbersome solutions to problems such as improved geometry processing.
Another problem with the dedicated geometry engine hardware is the explicit synchronization mechanisms that need to be implemented in the hardware and in the software that uses this hardware. Synchronization is needed to communicate the start and completion points of the computation being done on the dedicated hardware.
Another prior art solution is to perform geometry calculations using the instruction set of a general purpose processor (instead of the dedicated processor discussed above). A general purpose processor, as the term is used herein, has an instruction set partly or wholly supported by a compiler and is therefore programmable to some degree using high-level languages (i.e., machine-independent languages such as C and Pascal). Such languages are easier to program than the low-level languages of the dedicated processor described above. Although portions of a general purpose instruction set may be unsupported by a compiler, advantages are still achieved through the ease with which assembly code may be linked to compiled code during the programming process. Although a general purpose processor is designed for a variety of applications, its actual use can be narrow. Additionally, to the extent a general purpose processor in a given application supports other tasks in addition to geometry calculations, then synchronization between the geometry calculations and these other tasks is implicitly resolved through processor programming.
A problem with this solution, however, is that many instruction sets are not powerful enough to quickly perform the complex calculations required for computer-generated graphics. Thus, the prior art is problematic because it typically takes several instructions to specify and perform an operation or function. In general, the more instructions specified, the longer it takes to perform the operation or function. Thus, geometry calculations are slowed by the number of instructions used in the prior art. It is therefore desirable to reduce the number of instructions, thereby increasing the speed at which a geometry engine can perform geometry calculations.
Accordingly, what is desired is a system and/or method that can increase the speed at which a processor (and, preferably, a general purpose processor) is able to perform geometry calculations for the graphics design process. What is further desired is a system and/or method that can accomplish the above and can also provide a cost-effective solution that can be implemented in computer systems using various types of processors and processor cores. The present invention provides a novel solution to the foregoing.
These and other advantages of the present invention will become obvious to those of ordinary skill in the art after having read the following detailed description of the preferred embodiments which are illustrated in the various drawing figures.
In accordance with the present invention, a system and method are provided that can increase the speed at which a processor is able to perform various operations including geometry calculations for a graphics design process. This system and method can accomplish the above and can also be a cost-effective solution that can be implemented in computer systems using various types of processors and processor cores. This system and method can reduce the number of instructions needed to specify and perform a given operation (e.g., geometry) and thereby facilitate an increase in the speed at which a processor operates.
In accordance with a preferred embodiment of the present invention, an application specific extension to a general purpose instruction set architecture is provided that incorporates high performance floating point operations designed to improve the performance of three-dimensional graphics geometry processing on a general purpose processor. Instructions included in the extension can use a variety of data formats including single precision, double precision and paired-single data formats. The paired-single format provides two simultaneous operations on a pair of operands. The instructions included in the extension may also be used in situations unrelated to three-dimensional graphics processing. Additionally, in an alternative embodiment, these instructions may be defined as part of the instruction set architecture itself rather than an extension to such architecture. These instructions may be carried out in hardware, software, or a combination of hardware and software.
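The paired-single idea can be modeled in C as a value that carries two single-precision numbers, with each operation acting on both halves. This is only a software sketch: the type `PairedSingle`, the field names `lo` and `hi`, and the functions `ps_add` and `ps_mul` are hypothetical, and how the two values are actually packed in a register is not specified here.

```c
/* Hypothetical model of a paired-single operand: two single-precision
 * values that are carried and operated on together. */
typedef struct { float lo, hi; } PairedSingle;

/* Models one operation that adds both halves of two operands. */
PairedSingle ps_add(PairedSingle a, PairedSingle b)
{
    PairedSingle r;
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi;
    return r;
}

/* Models one operation that multiplies both halves of two operands. */
PairedSingle ps_mul(PairedSingle a, PairedSingle b)
{
    PairedSingle r;
    r.lo = a.lo * b.lo;
    r.hi = a.hi * b.hi;
    return r;
}
```

In this model a single call stands in for one instruction that does the work of two scalar operations, which is the source of the reduction in instruction count.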
The extension to the instruction set architecture can reduce the number of instructions needed to perform geometry calculations. As a result, a processor may be capable of performing geometry calculations at speeds approaching the speed of the rasterization engine, so that the processor is less likely to become a bottleneck in the graphics pipeline.
In one embodiment, the extension to the instruction set architecture is implemented as a set of floating point instructions that function with a MIPS-based instruction set architecture. In this embodiment, a processor comprising a floating point unit performs geometry calculations by executing the floating point instructions.
In one embodiment, a vertex in a computer graphics image is represented with coordinates. The coordinates are transformed, and the transformed coordinates are compared with a value representing edges of a specified view volume. Condition code bits are set to one or more specific states to indicate results of the comparison. A conditional branch instruction is executed based on the condition code bits.
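The sequence of transform, compare, set condition code bits, and branch can be sketched in plain C as follows. Several details are assumptions made for illustration: the vertex uses homogeneous coordinates, the transformed `w` component serves as the value representing the view volume edges, one condition code bit is assigned per coordinate, and all type and function names are hypothetical.

```c
/* Hypothetical types for illustration only. */
typedef struct { float x, y, z, w; } Vec4;
typedef struct { float m[4][4]; } Mat4;   /* row-major 4x4 matrix */

static float magnitude(float v) { return v < 0.0f ? -v : v; }

/* Transform vertex v by matrix t (matrix times column vector). */
Vec4 transform(const Mat4 *t, Vec4 v)
{
    Vec4 r;
    r.x = t->m[0][0]*v.x + t->m[0][1]*v.y + t->m[0][2]*v.z + t->m[0][3]*v.w;
    r.y = t->m[1][0]*v.x + t->m[1][1]*v.y + t->m[1][2]*v.z + t->m[1][3]*v.w;
    r.z = t->m[2][0]*v.x + t->m[2][1]*v.y + t->m[2][2]*v.z + t->m[2][3]*v.w;
    r.w = t->m[3][0]*v.x + t->m[3][1]*v.y + t->m[3][2]*v.z + t->m[3][3]*v.w;
    return r;
}

/* Set one condition code bit per coordinate that lies outside the
 * view volume: bit 0 for x, bit 1 for y, bit 2 for z.
 * Assumption: a coordinate is outside when its magnitude exceeds w. */
unsigned clip_condition_codes(Vec4 c)
{
    unsigned cc = 0;
    if (magnitude(c.x) > c.w) cc |= 1u << 0;
    if (magnitude(c.y) > c.w) cc |= 1u << 1;
    if (magnitude(c.z) > c.w) cc |= 1u << 2;
    return cc;
}

/* Conditional branch on the condition code bits: returns 1 when the
 * vertex must take the clipping path, 0 when it can be passed on. */
int vertex_needs_clipping(const Mat4 *t, Vec4 v)
{
    unsigned cc = clip_condition_codes(transform(t, v));
    if (cc != 0)
        return 1;   /* branch taken: at least one coordinate is outside */
    return 0;       /* fall through: vertex is inside the view volume */
}
```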
In one embodiment, a branch target address is computed, a jump is made to the branch target address, and an instruction at the branch target address is executed dependent upon condition code state (i.e., the state of one or more condition code bits).
In one embodiment, a general purpose processor provides a plurality of bits set to one or more states from a storage device within the general purpose processor. The plurality of bits are processed to generate a combined bit. A conditional branch instruction is performed based on the state of the combined bit.
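One way to picture the combined bit is as a logical reduction over selected condition code bits. The text above does not state which combining function is used, so the sketch below shows two plausible choices, an OR-style reduction (`cc_any`) and an AND-style reduction (`cc_all`). Both function names and the `mask` parameter are hypothetical.

```c
/* Combined bit is 1 when at least one of the selected bits is set. */
int cc_any(unsigned cc, unsigned mask)
{
    return (cc & mask) != 0;
}

/* Combined bit is 1 only when every selected bit is set. */
int cc_all(unsigned cc, unsigned mask)
{
    return (cc & mask) == mask;
}

/* A conditional branch driven by the single combined bit:
 * returns 1 for the taken path and 0 for the fall-through path. */
int branch_on_combined(unsigned cc, unsigned mask)
{
    if (cc_any(cc, mask))
        return 1;
    return 0;
}
```

Reducing several bits to one before branching lets a single branch stand in for a chain of per-bit tests.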
In one embodiment, a first instruction and a second instruction are stored in a memory coupled to a general purpose processor. The first instruction is processed in the general purpose processor. The first instruction operates on a plurality of operands to perform a plurality of magnitude compare operations in parallel. A plurality of bits are set to one or more specific states in response to the magnitude compare operations. The second instruction is processed in the processor. The second instruction responds to the plurality of bits to selectively initiate a branch operation.
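The two-instruction pattern can be modeled in C as a pair of functions, one standing in for each instruction. The comparison predicate (magnitude less than or equal), the assignment of result bits, and the names `Pair`, `cmp_abs_le_pair`, and `branch_if_any_false` are assumptions for illustration; real hardware would perform the two compares at the same time, whereas this software model evaluates them one after the other.

```c
/* Hypothetical operand holding two single-precision values. */
typedef struct { float lo, hi; } Pair;

static float magnitude(float v) { return v < 0.0f ? -v : v; }

/* Models the first instruction: two magnitude compares on a pair of
 * operands. Bit 0 is set when |a.lo| <= |b.lo|, and bit 1 is set
 * when |a.hi| <= |b.hi|. */
unsigned cmp_abs_le_pair(Pair a, Pair b)
{
    unsigned bits = 0;
    if (magnitude(a.lo) <= magnitude(b.lo)) bits |= 1u << 0;
    if (magnitude(a.hi) <= magnitude(b.hi)) bits |= 1u << 1;
    return bits;
}

/* Models the second instruction: branch when any compare failed.
 * Returns 1 for the taken path and 0 for the fall-through path. */
int branch_if_any_false(unsigned bits)
{
    if (bits != 0x3u)
        return 1;
    return 0;
}
```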
In one embodiment, a computer program product includes a computer-readable medium having a plurality of instructions stored thereon. A first instruction enables a general purpose processor to perform a plurality of magnitude compare operations in parallel and set a plurality of result bits to one or more specific states. A second instruction enables the general purpose processor to jump to a branch target address in response to the plurality of result bits.