1. Technical Field
The present invention relates in general to a method and apparatus for producing a graphical image within a computer graphics system and in particular to a method and apparatus for parallel processing of graphics structure elements within a computer graphics system. Still more particularly, the present invention relates to a method and apparatus for displaying graphics structure elements processed by a parallel processor system in a computer graphics system in sequential order.
2. Description of the Related Art
Data processing systems such as personal computers and workstations are commonly utilized to run computer-aided design (CAD) applications, computer-aided manufacturing (CAM) applications, and computer-aided software engineering (CASE) tools. Engineers, scientists, technicians, and others employ these applications daily. These applications involve complex calculations, such as finite element analysis to model stress in structures. Other applications include chemical or molecular modelling applications. CAD/CAM/CASE applications are normally graphics intensive in terms of the information relayed to the user. Data processing system users may employ other graphics intensive applications such as desktop publishing applications. Generally, users of these applications require that data processing systems be able to present graphics information extremely quickly.
The processing of a graphics data stream to provide a graphical display on a video display terminal requires an extremely fast graphics system to provide a display with a rapid response. It is desirable to be able to provide the performance required utilizing presently available technology. In order to meet the performance demands of users employing graphics applications, multiple floating point processors have been utilized to provide the computational power needed for higher performance.
Such multiprocessor graphics systems process data streams that include models containing primitives. A "primitive" defines the shape of various components of an object, such as lines, points, polygons in two or three dimensions, text, polyhedra, or free-form surfaces in three dimensions. A primitive also may define attributes, such as line style, color, or surface texture. Also, a primitive may include data defining connectivity relationships and positioning information that describe how the components of an object fit together. "Output primitives" are sent to a geometry engine, which is employed to perform calculations such as transformations, clipping, lighting calculations, perspective projections, color mapping, etc. "Raster primitives" are primitives that result from the output of the geometry engine and are sent to a rasterization engine, which is employed in operations such as the transformation of data into pixels and the evaluation of pixel values.
Often, primitives must be processed in sequential order. Many applications require that the geometry engine, utilized to process primitives and other data for display, maintain the sequential order of the primitives as received. For example, the painter's algorithm, a simplified version of a depth-sort algorithm, is a hidden-line/hidden-surface algorithm that is frequently employed in computer graphics applications to paint or render closer objects over more distant objects in situations not involving intersection of objects. This algorithm produces various primitives that are sent into a frame buffer in the order of farthest from the viewpoint to closest to the viewpoint in the viewing coordinate system in order to arrive at a correct visibility solution.
Normally, primitives are sorted based on Z-axis depth (hereafter Z-depth) to ensure that the primitives are drawn from farthest to closest. The list of primitives resulting from such a sorting becomes the order in which the primitives are displayed. Other methods requiring processing of primitives to be maintained in sequential or temporal order (i.e., the time order in which the primitives are received) include binary space partitioning (BSP) algorithms, octree algorithms, and constructive solid geometry (CSG) trees.
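The Z-depth ordering described above can be illustrated with a minimal sketch. The `Primitive` class, its field names, and the convention that a larger `z_depth` means farther from the viewpoint are assumptions made for illustration only; they are not taken from any particular graphics system.

```python
from dataclasses import dataclass

@dataclass
class Primitive:
    name: str
    z_depth: float  # assumption: larger value = farther from the viewpoint

def painter_order(primitives):
    """Return the primitives sorted farthest-to-closest.

    sorted() is stable, so primitives with equal depth keep their
    original relative (received) order.
    """
    return sorted(primitives, key=lambda p: p.z_depth, reverse=True)

# Hypothetical scene: three primitives received in arbitrary order.
scene = [
    Primitive("tree", 5.0),
    Primitive("mountain", 50.0),
    Primitive("house", 12.0),
]

# The resulting list is the order in which the primitives would be drawn.
draw_list = [p.name for p in painter_order(scene)]
print(draw_list)
```

Once such a list has been produced, any later processing stage must preserve its order, since drawing a nearer primitive before a farther one would let the farther one overwrite it.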
With the introduction of multiple instruction multiple data (MIMD) pipelined geometry engines, it is possible, and often desirable, to process primitives in a non-sequential order. Normally, each primitive is processed in sequential order and each subdrawing primitive derived from a primitive is given to the raster engine in display list priority. Primitives are rasterized by the raster engine to produce an image for display. "Rasterization" is the process of determining pixel values from primitives. In parallel processing systems, however, it is sometimes desirable to process a graphics data stream in a sequence other than that received to achieve efficient use of processor resources. Maintaining the original sequence of the primitives while maintaining efficient processor utilization is necessary in some cases.
One example where it is advantageous to process primitives in a non-sequential order involves a large non-uniform rational B-spline (NURBS) surface followed by a large number of small triangle strips. It is often the case that the evaluation of NURBS surfaces requires a significantly longer time as compared to the calculations required for other primitives, such as lines, polygons, or triangle strips. If the processing of the NURBS surface is sequential, then whenever a NURBS surface is encountered, all other processors will remain idle after filling any available output buffers. Allowing the other processors to work on the lines, polygons, or other data while the NURBS surface is being processed improves the overall throughput by an amount equal to the reduced processor idle time.
One approach to this problem involves utilizing a mechanism to merge or multiplex data from a number of microprocessors and send the data to a rasterizer. Buffers are associated with each of the microprocessors for temporarily storing data not yet sent to the rasterizer. This solution is adequate in some situations. Cases exist, however, in which the limits imposed by this buffering mechanism cause processing at one node to halt until the primitive being processed at another processor or processor node has been completed. An example of one such case is a large primitive followed by one or more smaller primitives.
One method for dispatching work to a series of parallel processors in which a specific sequential order must be maintained involves employing a control processor to parse the input data stream. This control processor also dispatches work in the form of primitives to one of a group of parallel processors. The parallel processor chosen for the next primitive is the one that has the least amount of work waiting in its input buffer. As each primitive is dispatched by the control processor, a sequential tag is associated with that primitive. This tag accompanies the primitive and is utilized to merge the primitives processed by the various parallel processors back into the original sequential order prior to rasterization.
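The tagging and merging scheme described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of any particular system: the function names `dispatch` and `merge_in_order`, the per-primitive cost estimate, and the static load model (pending work never drains while dispatching) are all hypothetical.

```python
import heapq

def dispatch(primitives, num_processors):
    """Tag each (name, cost) primitive with a sequence number and assign
    it to the processor with the least pending work.

    Assumption: 'cost' is a hypothetical estimate of processing time,
    and pending work is only accumulated, never drained, in this sketch.
    """
    queues = [[] for _ in range(num_processors)]
    pending = [0] * num_processors
    for tag, (name, cost) in enumerate(primitives):
        target = min(range(num_processors), key=lambda i: pending[i])
        queues[target].append((tag, name))
        pending[target] += cost
    return queues

def merge_in_order(outputs):
    """Merge per-processor outputs back into the original tag order.

    Each processor's output is already in ascending tag order, so a
    k-way merge on the tag restores the received sequence.
    """
    return [name for _tag, name in heapq.merge(*outputs)]

# Hypothetical stream: one expensive primitive followed by cheap ones.
stream = [("nurbs", 100), ("strip0", 1), ("strip1", 1), ("strip2", 1)]
queues = dispatch(stream, num_processors=2)
restored = merge_in_order(queues)
print(queues)
print(restored)
```

In this sketch the expensive primitive occupies one processor while the inexpensive ones all go to the other, yet the merge still cannot emit any of them until the primitive with the lowest outstanding tag is available.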
Utilizing this method, if one processor receives a NURBS surface, other processors within the parallel processing system may be forced to remain idle until the NURBS surface is completed by the processor because of the NURBS surface's position in the original display list. Consequently, the degree of parallelism possible is directly linked to the size of each processor's output buffer.
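The link between output buffer size and achievable parallelism can be made concrete with a toy two-processor model. Everything in it is an assumption chosen for illustration: one processor evaluates a single NURBS surface, the other processes uniform-cost strips that follow it in the display list, and a finished strip occupies one output buffer slot until the NURBS surface has been output.

```python
def idle_time_of_second_processor(nurbs_cost, strip_cost, num_strips, buffer_slots):
    """Time the second processor is idle while the NURBS surface is evaluated.

    Toy model (all parameters hypothetical): finished strips cannot leave
    the output buffer until the earlier-tagged NURBS surface is output,
    so the second processor stops once its buffer is full.
    """
    # Strips that could be finished before the NURBS surface completes
    # if the processor were never blocked.
    could_finish = min(num_strips, nurbs_cost // strip_cost)
    # The output buffer caps how many finished strips can be held.
    actually_finished = min(could_finish, buffer_slots)
    return nurbs_cost - actually_finished * strip_cost

# Hypothetical workload: NURBS costs 100 time units, each strip costs 1.
small_buffer = idle_time_of_second_processor(100, 1, 200, buffer_slots=8)
large_buffer = idle_time_of_second_processor(100, 1, 200, buffer_slots=64)
print(small_buffer, large_buffer)
```

Under these assumptions the idle time falls only as the buffer grows, which is the trade-off between processor utilization and buffer storage at issue here.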
Although this method maintains the display list priority order by employing a tagging mechanism to maintain the order of primitives, situations involving inefficient processor use still exist. Additionally, the raster engine will have to wait for the geometry processing of one primitive to be completed before rasterization of another primitive can begin. Increasing the amount of memory for the processors may be inappropriate since an inordinate amount of storage space may be required to maintain efficient processor utilization for common data streams. Furthermore, increasing the buffer size has the undesirable effect of increasing the latency, that is, the start-up and emptying time associated with the graphics system.
Therefore, it is desirable to have a method and apparatus to improve the efficiency of parallel processing of data streams containing primitives while displaying the processed primitives in sequential order.