Applications employing single instruction, multiple data (SIMD) instructions have, as the acronym suggests, revolutionized the computing industry by providing an efficient way to execute a single instruction simultaneously across a large data set. SIMD instructions can be applied to almost any computing application; modern shader programs are one example of an application employing them.
A shader program corresponds to object code data compatible with a specific graphics processor. The object code data is generated from data generally contained within an application module. In compiling the data, compilers generally create and maintain data representing control flow information for the shader program. The control flow information is an abstract representation of the program organized in blocks, each block containing one or more statements. The control flow information represents all possible alternatives of control flow (i.e., program flow) and is used to properly compile the data. Thus, the control flow information is representative of the shader program itself.
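As a rough illustration of control flow information organized in blocks, the following sketch models a single "if/then/else" structure as four blocks and enumerates every possible path through them. The Block class, the block names, the statement strings and the all_paths helper are hypothetical and are not drawn from any particular compiler.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A block of one or more statements plus the blocks control may flow to."""
    name: str
    statements: list
    successors: list = field(default_factory=list)


# Hypothetical control flow information for: if (c) {Y = a(X)} else {Y = b(X)}; Z = Y
cfg = {
    "entry": Block("entry", ["c = X > 0"], ["then", "else"]),
    "then": Block("then", ["Y = a(X)"], ["exit"]),
    "else": Block("else", ["Y = b(X)"], ["exit"]),
    "exit": Block("exit", ["Z = Y"], []),
}


def all_paths(cfg, start, end):
    """Enumerate every path of control flow from start to end (acyclic graphs only)."""
    if start == end:
        return [[end]]
    paths = []
    for nxt in cfg[start].successors:
        for tail in all_paths(cfg, nxt, end):
            paths.append([start] + tail)
    return paths


paths = all_paths(cfg, "entry", "exit")
```

In this sketch, the two paths returned correspond to the two alternatives of control flow that the control flow information represents.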
Shader programs are generally used by graphics processors to execute instruction statements across groups of pixels called grids. Conventionally, each grid may contain multiple pixels called neighbors. In the event that a grid contains four neighbors, the grid is termed a quad. Because modern shader programs are designed to operate on grids, each of the instructions in a shader program can be labeled a SIMD instruction. That is, a single SIMD instruction operates on each pixel in a grid, thus adding an important degree of efficiency to the shader program.
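The idea of a single instruction operating on each pixel in a grid can be sketched as follows. This is a software model only; the simd_apply helper and the pixel values are hypothetical, and a real graphics processor would execute the lanes in hardware rather than in a loop.

```python
def simd_apply(instruction, grid):
    """Apply one instruction to every pixel (lane) of the grid."""
    return [instruction(pixel) for pixel in grid]


# A grid with four neighbors is a quad.
quad = [1.0, 2.0, 3.0, 4.0]

# The same instruction (multiply by two) is issued once and applied to all four pixels.
scaled = simd_apply(lambda p: p * 2.0, quad)
```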
At the same time, shader programs often incorporate data dependent control flow structures, each including one or more data dependent branches. Each data dependent branch includes a different path for the shader program to take based on a conditional statement. One example of a data dependent control flow structure is an “if/then/else” statement and all statements associated therewith. If the data meets a given condition in the “if” statement, the “then” data dependent branch is selected. If, however, the data fails the given condition in the “if” statement, then the “else” data dependent branch is selected. Shader programs, and programs in general that utilize data dependent control flow structures, execute faster because they do not compute all statements for all pixels in a grid. Thus, it is generally advantageous to place many instruction statements in data dependent control flow structures.
When a shader program reaches a data dependent control flow structure, one or more pixels in the grid may be forced to take one data dependent branch while the remaining pixels may be forced to take the alternate data dependent branch. Where pixels in a grid take alternate data dependent branches during execution of data dependent control flow structures, the processor needs to idle those pixels that do not take the first branch while executing the first branch of statements with respect to the remaining pixels. Upon completion of the first branch, the processor must then idle those pixels that took the first branch of statements while executing the second branch on the remaining pixels.
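The idling behavior described above can be modeled with a per-pixel mask. The sketch below is an assumption-laden software model, not a description of any specific processor: run_if_else, the condition and both branch functions are hypothetical names chosen for illustration.

```python
def run_if_else(quad, condition, then_branch, else_branch):
    """Execute an if/else over a grid in lockstep, one branch after the other.

    A per-pixel mask records which pixels take the first branch. Pixels that do
    not take the branch currently being executed sit idle for that pass.
    """
    mask = [condition(x) for x in quad]
    result = list(quad)
    idle_slots = 0

    # Pass 1: execute the "then" branch; pixels with a False mask idle.
    for i, active in enumerate(mask):
        if active:
            result[i] = then_branch(quad[i])
        else:
            idle_slots += 1

    # Pass 2: execute the "else" branch; pixels that took the first branch idle.
    for i, active in enumerate(mask):
        if not active:
            result[i] = else_branch(quad[i])
        else:
            idle_slots += 1

    return result, idle_slots


result, idle_slots = run_if_else(
    [1.0, -2.0, 3.0, -4.0],
    condition=lambda x: x > 0,
    then_branch=lambda x: x * 2.0,
    else_branch=lambda x: -x,
)
```

With two pixels on each side of the condition, every pixel idles for exactly one of the two passes in this model.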
Shader programs also include instruction statements utilizing area operators, each acting as a function. The area operator function is defined in an area operator definition instruction statement. The area operator function is subsequently used in an area operator use instruction statement. For example, an area operator definition instruction statement may resemble: Y=f(X), where: f( ) is the area operator, X is a previously determined operand (sometimes called an index value) and Y is the resultant of the area operator definition instruction statement. One example of an area operator definition instruction statement is an area operator gradient operation typically performed in texture sampling, where f(X) may correspond to the gradient of X with respect to either the horizontal or vertical axis in screen space (x,y). The instruction statement that generates X may be labeled a source instruction statement because it defines a resultant, X, that is needed to compute the area operator definition instruction statement. An area operator use instruction statement may resemble: Z=Y, where: Y is the resultant of the area operator definition instruction statement, used here as an operand in the area operator use instruction statement, and Z is the resultant of the area operator use instruction statement.
Area operator instruction statements, like other SIMD instructions, operate on each pixel in a grid. However, unlike ordinary SIMD instructions, area operator instruction statements are dependent upon data computed during the execution of at least one other pixel in the grid. That is, for each pixel in a given grid, the resultant of an area operator definition instruction statement is based on at least one source operand (i.e., X) of at least one of its neighbors. For instance, in the texture sampling example, the area operator definition instruction statement (i.e., the gradient of X with respect to the vertical or horizontal axis for one pixel in a grid) depends upon the value of X for at least one other pixel in the grid.
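The neighbor dependence of a gradient area operator can be sketched with simple finite differences over a quad. The layout [top-left, top-right, bottom-left, bottom-right], the function names ddx and ddy, and the differencing scheme are assumptions made for illustration; actual hardware may compute gradients differently.

```python
def ddx(quad_x):
    """Horizontal gradient of X for each pixel of a quad laid out [TL, TR, BL, BR].

    Each pixel's resultant uses the X value of its horizontal neighbor.
    """
    top = quad_x[1] - quad_x[0]
    bottom = quad_x[3] - quad_x[2]
    return [top, top, bottom, bottom]


def ddy(quad_x):
    """Vertical gradient of X; each pixel's resultant uses its vertical neighbor's X."""
    left = quad_x[2] - quad_x[0]
    right = quad_x[3] - quad_x[1]
    return [left, right, left, right]


# X is the resultant of the source instruction statement for each pixel.
X = [1.0, 3.0, 2.0, 7.0]

# Y = f(X): the area operator definition instruction statement.
Y_horizontal = ddx(X)
Y_vertical = ddy(X)
```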
Because data dependent control flow structures essentially skip the execution of some instruction statements for some pixels in a grid, program developers who design and write the source code data for the shader programs cannot place area operator definition instruction statements within data dependent control flow structures. Area operator definition instruction statements are kept outside of data dependent control flow structures to ensure that the area operator is defined for all data dependent paths associated with a data dependent control flow structure. Thus, developers can ensure that every instance of an area operator use instruction statement will be executed properly (i.e., each area operator use instruction statement has known values for its operands).
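The hazard can be illustrated with a sketch in which the source instruction statement is skipped for one pixel of a quad. The quad layout [top-left, top-right, bottom-left, bottom-right], the use of None to mark a value that was never computed, and the ddx_masked helper are all hypothetical.

```python
def ddx_masked(quad_x):
    """Horizontal difference for a quad laid out [TL, TR, BL, BR].

    None marks a pixel whose X was never computed because the pixel skipped
    the branch containing the source instruction statement.
    """
    out = []
    for i in range(4):
        row_start = i - i % 2            # index of the left pixel in this row
        left, right = quad_x[row_start], quad_x[row_start + 1]
        if left is None or right is None:
            out.append(None)             # operand unknown, so the resultant is undefined
        else:
            out.append(right - left)
    return out


# Only pixels 0, 2 and 3 took the branch that computes X.
took_branch = [True, False, True, True]
x = [float(i + 1) if taken else None for i, taken in enumerate(took_branch)]

grads = ddx_masked(x)
```

In this model, pixel 0 took the branch yet still has no defined gradient, because its neighbor, pixel 1, never produced the operand it depends on.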
Consequently, prior art shader programs (more specifically, the source code thereof) are written and executed in two parts. The first part places all area operator definition instruction statements outside data dependent control flow structures thus applying each area operator definition instruction statement to each pixel. A second part of the shader program makes use of the control flow information to discard the resultant data of the area operator definition instruction statement for those pixels that, according to the control flow information, will not require a use of the area operator definition instruction statement. Consequently, the application of each area operator definition instruction statement for each pixel and the subsequent discarding of resultant data is a drain on system resources, decreases efficiency and increases processing time.
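The two-part arrangement and its cost can be sketched as follows. This is a simplified model under stated assumptions: the quad layout [top-left, top-right, bottom-left, bottom-right], the ddx helper, the needs_result flags standing in for the control flow information, and the two_part_shader function are hypothetical.

```python
def ddx(quad_x):
    """Horizontal gradient for a quad laid out [TL, TR, BL, BR]."""
    top = quad_x[1] - quad_x[0]
    bottom = quad_x[3] - quad_x[2]
    return [top, top, bottom, bottom]


def two_part_shader(quad_x, needs_result):
    """Model of a shader written in two parts.

    needs_result stands in for the control flow information: True for pixels
    that will actually use the area operator resultant.
    """
    # Part 1: the area operator definition sits outside any branch,
    # so it is applied to every pixel.
    y = ddx(quad_x)

    # Part 2: discard the resultant for pixels that never use it.
    kept = [value if need else None for value, need in zip(y, needs_result)]
    wasted = sum(1 for need in needs_result if not need)
    return kept, wasted


kept, wasted = two_part_shader([1.0, 3.0, 2.0, 7.0], [True, False, False, False])
```

Here three of the four resultants are computed and then discarded, which is the kind of overhead the paragraph above describes.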
The only known prior art alternatives to writing shader program sources in two parts require program developers to split, if possible, statements containing an area operator definition instruction component and a non-area operator component. For example, a texture fetch statement may include an implicit area operator definition instruction component such as a texture sampling gradient operation (e.g., Y=g(X, f(X)), where g( ) represents the overall texture fetch statement while f( ) represents an implicit area operator definition instruction statement). In this case, the overall statement could be split into its component parts. The first component, g( ), can be placed inside a data dependent control flow structure because the execution of g( ) does not depend upon a source instruction of one of its neighbors. The second component, f( ), however, must remain outside the data dependent control flow structure for the reasons articulated above.
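A manual split of Y=g(X, f(X)) might look like the sketch below. The bodies of f and g are placeholders: f is modeled as a horizontal finite difference over a quad laid out [top-left, top-right, bottom-left, bottom-right], and g is an arbitrary per-pixel function standing in for a texture fetch. The split_shader function and the condition are likewise hypothetical.

```python
def f(quad_x):
    """Implicit area operator component: needs the X values of neighboring pixels."""
    top = quad_x[1] - quad_x[0]
    bottom = quad_x[3] - quad_x[2]
    return [top, top, bottom, bottom]


def g(x, grad):
    """Per-pixel component (placeholder for a texture fetch); needs no neighbor data."""
    return x + 0.5 * grad


def split_shader(quad_x, condition):
    # f( ) stays outside the data dependent control flow structure,
    # so it is evaluated for every pixel.
    grads = f(quad_x)

    out = []
    for x, grad in zip(quad_x, grads):
        if condition(x):
            # g( ) is placed inside the branch and runs only for pixels that take it.
            out.append(g(x, grad))
        else:
            out.append(0.0)
    return out


Y = split_shader([1.0, 3.0, 2.0, 7.0], condition=lambda x: x > 2.5)
```

Note that f( ) is still evaluated for all four pixels even though only two of them use its resultant.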
While these solutions are more efficient than a shader program compiled with all area operator definition statements and their associated components located outside each data dependent control flow structure, each is plagued with possible or realized pitfalls. First, each alternative is only a partial solution to the identified problem. That is, while some statements are moved into the data dependent control flow structures, none of the area operator definition instruction statements themselves are incorporated therein. Thus, it is conceivable that the shader program cannot execute as efficiently as it would with area operator definition instruction statements placed within data dependent control flow structures. The second problem arises from the significant amount of time required of a program developer to split statements incorporating implicit area operator definition instruction statements into component parts and to physically write the source code data such that a first component remains outside data dependent control flow structures while a second component is placed inside data dependent control flow structures. Because the developer must also take care to avoid introducing computer bugs and errors, it is recognized that these solutions are time consuming, precarious and complex.
Thus, a need exists for an improved compiling scheme that overcomes one or more of the above drawbacks.