The invention generally relates to computer graphics processing and, more particularly, the invention relates to graphics accelerators having parallel processors.
Graphics rendering devices commonly include parallel processors for improving processing speed. In some prior art systems, each parallel processor processes data for a relatively large preselected contiguous portion of a display device. For example, in a four parallel processor graphics accelerator, each processor may produce pixel data for one quadrant of the display device. Accordingly, when an image to be drawn is substantially within one of the quadrants of the display, only one processor is processing while the other processors remain relatively dormant. This can significantly slow system speed, thus decreasing system efficiency.
Other problems commonly arise in multi-parallel processor graphics accelerators such as, for example, graphics requests being processed out of a prescribed sequential order. When this happens, the processors often produce output pixel data that is out of sequence and thus, not an accurate depiction of the image being drawn. It therefore would be desirable to provide a parallel processing graphics accelerator that divides processing more evenly among the processors, while also maintaining the order of sequential graphics requests that ultimately are transformed into pixel data.
In accordance with one aspect of the invention, an apparatus for displaying a polygon on a horizontal scan display device having a plurality of pixels includes first and second rasterizers that each process respective first and second sets of pixels. Each set of pixels includes vertical stripes that are transverse to the horizontal scan of the display. To that end, the first rasterizer has an input for receiving polygon data relating to the polygon. The first rasterizer determines a first set of pixels that are to be lit for display of the polygon, and also determines display characteristics of the first set of pixels. In a similar manner, the second rasterizer also includes an input for receiving polygon data relating to the polygon. The second rasterizer similarly determines a second set of pixels that are to be lit for display of the polygon, and also determines display characteristics of the second set of pixels. The first and second sets of pixels have no common pixels and are vertical stripes of pixels on the display device that are transverse to the direction of the horizontal scan. In alternative embodiments, the display device has an arbitrary scan direction and the stripes are transverse to the arbitrary scan direction.
In preferred embodiments, the apparatus is a graphics accelerator having a first and second frame buffers, and first and second resolvers for transferring the display characteristics for the first and second sets of pixels into the first and second frame buffers, respectively. The first and second frame buffers may be formed on the same integrated circuit, or may be formed on different integrated circuits. In preferred embodiments, the first and second resolvers each include a plurality of resolvers. More particularly, the first resolver may include a first number of sub-resolvers, and the first frame buffer may be divided into a second number of frame buffer segments. Each sub-resolver may be assigned one frame buffer segment for exclusive use and thus, cannot transfer pixel data into other frame buffer segments. Each subresolver thus writes to its assigned frame buffer segment only.
In other embodiments, the first resolver includes first and second sub-resolvers. The first sub-resolver transfers display characteristics of a first sub-set of pixels to the first frame buffer while the second sub-resolver transfers display characteristics of a second sub-set of pixels to the first frame buffer. The pixels in the first and second sub-sets are members of the first set of pixels and each have pixels in the same vertical stripe.
In preferred embodiments, each vertical stripe includes a plurality of contiguous pixels. The first set of pixels includes a plurality of non-contiguous vertical stripes. The second set of pixels may include a plurality of non-contiguous vertical stripes. In some embodiments, each vertical stripe has a width of one pixel. Among other things, the display characteristics may include intensity information, color data, depth data, and transparency data.
The polygon data may include vertex data. In some embodiments, the vertex data define a triangle.
In accordance with another aspect of the invention, an apparatus for displaying an image (comprised of a plurality of polygons) on a display device having a plurality of pixels includes first and second gradient producing units that broadcast ordered sets of data in a preselected order to a bus. This preselected order maintains the order of the ordered sets of data.
Accordingly, in preferred embodiments of the invention, the apparatus includes the first and second gradient producing units, and the bus coupled to each of the gradient producing units for receiving the data broadcast. To that end, the first gradient producing unit has an input for receiving a first ordered set of polygons, where each polygon in the first ordered set is received in a first order. In a similar manner, the second gradient producing unit has an input for receiving a second ordered set of polygons, where each polygon in the second ordered set is received in a second order. The first and second gradient producing units each having respective outputs for respectively providing gradient data for the first and second set of polygons. Each polygon in the first and second ordered sets are members of the set of polygons. The bus is coupled to both the outputs of the first and second gradient producing units, and at least one rasterizer that processes the plurality of polygons for display on the display device. The first gradient producing unit output broadcasts the gradient data for the first ordered set of polygons in the first order. In a similar manner, the second gradient producing unit output broadcasts the gradient data for the second ordered set of polygons in the second order. In preferred embodiments of the invention, the second gradient producing unit output broadcasts the gradient data for the second ordered set of polygons after the gradient data of each polygon in the first ordered set of polygons is broadcasted to the bus.
In other embodiments, the apparatus for displaying an image includes a first rasterizer having an input for receiving the first ordered set of polygons, and a second rasterizer that also has an input for receiving the first ordered set of polygons. The first rasterizer determines a first set of pixels that are to be lit for display of each polygon in the first set of ordered polygons. In a similar manner, the second rasterizer also determines a second set of pixels that are to be lit for display of each polygon in the first set of ordered polygons. The first set of pixels and second set of pixels have no common pixels, while the first set of pixels and second set of pixels each are vertical stripes of pixels on the display device. Each vertical stripe preferably includes a plurality of contiguous pixels. The first set of pixels preferably includes a plurality of non-contiguous vertical stripes, while the second set of pixels also includes a plurality of non-contiguous stripes.
The first gradient producing unit preferably produces gradient values for each polygon in the first ordered set of polygons. The polygons in the set preferably are triangles having vertices and data relating to the vertices. The apparatus for drawing an image preferably is a graphics accelerator that draws the image in three dimensions (i.e., xe2x80x9c3Dxe2x80x9d).
In accordance with yet another aspect of the invention, a graphics accelerator for processing a graphics request stream includes first and second processors that each maintain control of a bus (at different times) until a flag is received at the end of the graphics request stream. To that end, the first processor includes a first input for receiving a first portion of the graphics request stream, and a first output for delivering a first unit output. In a similar manner, the second processor includes a second input for receiving a second portion of the graphics request stream, and a second output for delivering a second unit output. The bus is coupled with the first and second outputs and is configured to be controlled at a single time by no more than one processor. The first and second processors are arranged in a peer-to-peer configuration to process the graphics request stream on a cyclical basis. After gaining control of the bus, the first processor maintains exclusive control of the bus until a flag is received at the end of the first portion of the graphics request stream. No other processor can deliver output data to the bus when the first processor controls the bus.
In preferred embodiments, after the flag is received at the end of the first portion of the graphics request stream, the first processor transmits a message to the second processor. The message enables the second processor to control the bus. In preferred embodiments, the message includes a token.
In other embodiments, the graphics accelerator includes at least one additional processor. Each additional processor preferably includes an input for receiving an additional portion of the graphics request stream, and an output coupled with the bus. The first, second, and additional processors are arranged in a peer-to-peer configuration to process the graphics request stream on a cyclical basis. After gaining control of the bus, each additional processor maintains exclusive control of the bus until a flag is received at the end of the additional portion of the graphics request stream. More particularly, when one of the additional processors gains control of the bus, it maintains control of the bus until a flag is received at the end of the additional portion of the graphics request stream that such one additional processor is processing.
In preferred embodiments of the graphics accelerator, the flag includes the well known floating point value xe2x80x9cNot a Number.xe2x80x9d In other embodiments, the flag includes a bit that, when set to a first value and read by the first processor while controlling the bus, causes the first processor to maintain control of the bus. In other embodiments, the flag includes at least one bit that, when set to a second value and read by the first processor while controlling the bus, causes the first processor to enable the second processor to control the bus. The first value and the second value may be one and zero values, respectively, in one embodiment. In another embodiment, the first value and second value are zero and one, respectively.
In yet another embodiment of the graphics accelerator, the graphics request stream includes a set of polygon strips that are arranged in a preselected order. The first portion of the graphics request stream includes a first subset of the set of polygon strips, while the second portion of the graphics request stream includes a second subset of the set of polygon strips. The first subset precedes the second subset in the preselected order.
In accordance with still another aspect of the invention, a graphics accelerator includes a plurality of processors, where each processor has an input for receiving successive graphics requests, and an output for transmitting unit output data. The processors are arranged in a peer-to-peer configuration to process each successive graphics request on a cyclical basis, where each successive graphics request is terminated by a flag. The accelerator further includes a bus coupled with the output of each process to receive unit output data, where the bus is configured to be controlled by no more than one processor at a single time. When controlling the bus, a given processor maintains control unless the given unit detects that the flag in a given graphics request (that the given processor is processing) is set to a first value.
In accordance with still another aspect of the invention, a device for managing the communication of a sequence of data records associated with successive vertices, in a graphics accelerator having a plurality of processors coupled to an output bus in a peer-to-peer configuration, utilizes a flag to control bus access by the processors. To that end, the sequence of data records are placed in a data stream for receipt by the plurality of processors. A plurality of terminator data records are placed in the data stream between selected data records. Each terminator record further includes a flag that, when set and received by a given processor controlling the bus, causes the given processor to relinquish control of the bus to a second processor. Each record may include a floating point value providing at least a first datum associated with a vertex. The terminator data records each may have a floating point value where the first datum is set to a value corresponding to Not a Number. In other embodiments, the given processor may be controlled to transmit a token to the second processor upon receipt of the flag. In other embodiments, the given processor does not relinquish control of the bus.
In accordance with still another aspect of the invention, pass-through commands are managed by processors coupled to a bus on a graphics accelerator by first enabling a master processor to transmit the command, and then subsequently causing a processor that was interrupted by the command to resume control of the bus. To that end, the processors each have inputs for receiving a sequential stream of graphics request, and outputs that are coupled to the bus. The processors are arranged in a peer-to-peer configuration to process each successive graphics request on a cyclical basis. One of the processors is designated the master processor to transmit the pass-through command. Accordingly, when a pass through command is received at the input of one of the plurality of processors (the xe2x80x9creceiving processorxe2x80x9d), it is determined if the receiving processor is the master processor. If it is determined that the receiving processor is not the master processor, then control of the bus is passed to the master processor. Upon control of the bus, the master processor is controlled to transmit the pass through command. In addition, if it is determined that the receiving processor is not the master processor, then the receiving processor is the first of the plurality of processors to control the bus after the pass through command is transmitted.
In preferred embodiments, the plurality of processors pass a control token sequentially therebetween to pass control of the bus from processor to processor. In such embodiment, the control token is transmitted from the receiving processor to the master processor to enable the master processor to control the bus. The plurality of processors may include an intermediate processor between the master processor and the receiving processor. In such case, the control token is transmitted from the receiving processor to the master processor via the intermediate processor. Since the processors are in a peer-to-peer configuration, no external processor or logic device is necessary to control processor operations. The plurality of processors thus are self-controlling via the token passing mechanism. In preferred embodiments, the processors are gradient producing units.
In accordance with other aspects of the invention, a polygon is displayed on a horizontal scan device having a plurality of pixels by dividing the polygon into a plurality of vertical stripes that are transverse to the horizontal scan of the display device, and then calculating attribute data for each of the pixels on a stripe by stripe basis. More specifically, after the polygon is divided into stripes, pixel attribute data is received for a first pixel in a first stripe of the polygon. Each of the remaining vertical stripes have an initial pixel that corresponds to the first pixel in the first stripe. For example, if the first pixel is the bottom pixel of the first stripe, then each of the other stripes have an initial pixel that is the bottom pixel of such respective stripes. Gradient data relating to the degree of change of pixel attribute data with respect to the received pixel data (relating to the first pixel) also is received. Based upon the received data, pixel attribute data then is calculated for each initial pixel in each stripe in the polygon. Once the pixel attribute data is calculated for each initial pixel, then pixel attribute data for each remaining pixel in each stripe is calculated based upon the pixel attribute data for the initial pixel in each stripe in the polygon.
In preferred embodiments, the polygon is a triangle. Pixel attribute data for each remaining pixel in the first stripe may be calculated based upon both the pixel attribute data for the first pixel in the first stripe, and the gradient data.
It should be noted that although this and other aspects of the invention relate to horizontal scan display devices, other scan devices may be utilized. In many aspects of the invention, the vertical stripes must be transverse to the scan of the display device, regardless of whether it is horizontal scan or other scan.
In accordance with still other aspects of the invention, vertical stripes are utilized for calculating pixel values for a triangle to be displayed on a display device having a plurality of pixels that each are addressable in an X direction and a Y direction. To that end, a first number of processors are provided for calculating pixel attribute data for each pixel in the triangle. The triangle is divided into a set of vertical stripes that are perpendicular to a scan direction of the display device. Each stripe is originated from a longest edge of the triangle, where each processor calculates attribute data for different sub-sets of stripes. No two processors process the same stripe. Attribute data for an initial pixel in a first stripe is received for determining other pixel data attributes.
Other pixel data attributes are calculated by traversing along the longest edge of the triangle for a first distance until a first pixel the X direction of a next contiguous stripe is detected. The first distance then is multiplied by the first number of processors to produce a processor bump value. Each processor then is controlled to calculate attribute data for pixels in each respective sub-set of stripes based upon the processor bump value and the received attribute data for the initial pixel.
In preferred embodiments, the scan direction of the display device is horizontal. In preferred embodiments, gradient data based upon the attribute data for the initial pixel is received for the triangle. The gradient data indicates the change in attributes of the pixels from the initial pixel. Accordingly, attribute data of the pixels in each respective sub-set of stripes may be calculated based upon the gradient data. In preferred embodiments, each stripe has a width in the X direction of one pixel.
In some embodiments, a first processor calculates attribute data for the first stripe, and attribute data for a first sub-set of stripes that includes the first stripe. In such case, attribute data for the pixels in the first subset of stripes may be calculated by controlling the first processor to calculate initial pixel attribute data for initial pixels in all of the stripes in the first sub-set of stripes except for the first stripe. Of course, it is not necessary to calculate the attribute data for the initial pixel in the first stripe since that data is already available. The initial pixel values are calculated based upon the processor bump value and the received attribute data for the first pixel. The first processor then calculates each of the other pixel values in each stripe based upon the initial pixel attribute data of the initial pixel in each respective stripe. In some embodiments, a second processor calculates attribute data for a second subset of stripes. In such case, the second processor may calculate initial pixel attribute data for initial pixels in all of the stripes in the second subset of stripes. The attribute data for the initial pixels are calculated based upon the processor bump value and the received attribute data for the initial pixel. Attribute data for the initial pixels also may be based upon gradient data.
In accordance with yet other aspects of the invention, a cursor may be drawn on a display device (having a plurality of addressable locations) so that it does not entirely obscure images that it is covering. More particularly, a look-up table having cursor data for displaying the cursor on the display device is stored in a memory device. Upon receipt of an input signal identifying an addressable location on the display device, the look-up table is accessed to ascertain the cursor data. The cursor data preferably is accessed based upon the addressable location received in the input signal. A transparency value is then applied to the retrieved cursor data to produce less opaque cursor data. The cursor then is drawn on the display device based upon the less opaque cursor data.
In preferred embodiments, when using an OPENGL(trademark) graphics library, the transparency value is an alpha value of less than one. The addressable location received in the input signal also may be an X-Y value of a point on the display device. In some embodiments, the memory is located on a graphics accelerator that is coupled to the computer system. In such case, the graphics accelerator accesses the look-up table and draws the cursor on the display device.