1. Field of the Invention
The present invention relates to a method for improving 3D computer graphics performance, and more particularly to a method which pre-sorts display lists or display primitives according to their depth values and then discards invalid pixels through a Z-buffer pre-test process.
2. Description of the Related Art
Referring to FIG. 1, a well-known 3D computer graphic display system includes a host interface 11, a display memory 12, a memory controller 13, a 3D graphic controller 14, a screen controller 15 and a screen 16. The host interface 11, such as a well-known AGP bus, receives control signals from a CPU (not shown) and serves as an interface for bidirectional transmission of all video data between the CPU and the 3D graphic controller 14. The 3D graphic controller 14 is the most important element in the graphic system, and generates all the video data which the computer graphic display system needs. The 3D graphic controller 14 utilizes the memory controller 13 to control read/write cycles of the display memory 12, and utilizes the screen controller 15 to control the display of the screen 16. The display memory 12 stores the video data displayed on the screen 16.
All the video data of a display frame shown on the screen 16 are represented by a plurality of objects, and these objects sometimes overlap each other. In a transparent rendering process, also called an "alpha-blending rendering process," all the objects overlapping at the same position are displayed simultaneously. In a non-transparent rendering process, only the object on top, that is, the object closest to the human eye, is displayed. In other words, all objects situated under the top object are hidden.
FIG. 2 is a flow chart of a well-known Z-buffer pre-test process in 3D computer graphics. The Z-buffer pre-test process is used to discard invalid pixels in the 3D computer graphic system and to keep in the Z buffer the smallest depth value Zb among all pixels input so far. The smaller the depth value of an object, the closer the object is to the human eye; that is, the object with the smaller depth value is on the upper layer of the overlapping objects. The flow of the Z-buffer pre-test process starts in step 21, when a rendering process begins to execute. In step 22, the flow reads the depth value Zi of the input pixel being processed and the depth value Zb stored in the Z buffer. In step 23, Zb and Zi are compared to determine whether the input pixel is going to be displayed or discarded. If Zb is less than Zi, the input pixel is on a lower layer of the overlapping objects, and step 24 is executed. In step 24, the input pixel is discarded, and the flow returns to step 22. If Zb is larger than Zi in step 23, the input pixel is on the top layer of the overlapping objects, and step 25 is executed. In step 25, the flow replaces the depth value in the Z buffer with the depth value of the input pixel, and then continues with other validity tests, such as a scissor test, a stencil test, and so on. If the pixel passes all validity tests, it will be displayed later, and the flow ends in step 26.
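The comparison of steps 22 to 25 can be sketched in a few lines of Python. This is only an illustrative sketch, not the circuitry of FIG. 2: the function name `z_pretest`, the list-of-lists Z buffer and the initial "infinitely far" value are assumptions, and because the flow described above does not state what happens when Zb equals Zi, the sketch arbitrarily discards the pixel in that case.

```python
FAR = float("inf")  # assumed initial Z-buffer content: nothing drawn yet


def z_pretest(z_buffer, x, y, zi):
    """Return True if the input pixel survives the pre-test, False if discarded."""
    zb = z_buffer[y][x]       # step 22: read the stored depth value Zb
    if zb > zi:               # step 23: the input pixel is closer to the viewer
        z_buffer[y][x] = zi   # step 25: keep the smallest depth seen so far
        return True           # the pixel goes on to the scissor, stencil, ... tests
    return False              # step 24: discard (assumption: also when Zb == Zi)


# Three pixels arriving at the same screen position (1, 2).
zbuf = [[FAR] * 4 for _ in range(4)]
results = [z_pretest(zbuf, 1, 2, z) for z in (0.8, 0.3, 0.5)]
print(results, zbuf[2][1])
```

In this run the first two pixels pass, because each is closer than what the Z buffer held, while the third is discarded because a closer pixel has already been stored.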
If pixels on a lower layer (invalid pixels) arrive earlier than the pixels on the upper layer, the invalid pixels are not discarded in the Z-buffer pre-test process. Only when the pixels on the upper layer arrive later are the invalid pixels, which have already been written, superseded. In this situation, the computer graphic display system executes one or more unnecessary memory access actions. As a result, system resources are wasted and execution is slowed down.
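The cost of an unfavorable arrival order can be illustrated by counting Z-buffer writes at a single screen position. The sketch below is a simplified model under stated assumptions: the helper `count_depth_writes` is hypothetical, only depth writes are counted (color writes and later validity tests are ignored), and the depth values are made up for illustration.

```python
def count_depth_writes(depths):
    """Run the pre-test on one pixel position and count Z-buffer writes."""
    zb = float("inf")   # assumed initial Z-buffer content
    writes = 0
    for zi in depths:
        if zb > zi:     # the pixel passes the pre-test
            zb = zi     # and costs one write to the Z buffer
            writes += 1
    return writes


back_to_front = [0.9, 0.7, 0.5, 0.2]    # lower layers arrive first
front_to_back = sorted(back_to_front)   # upper layer arrives first

print(count_depth_writes(back_to_front))  # every pixel passes and is written
print(count_depth_writes(front_to_back))  # only the first pixel is written
```

With the lower layers arriving first, all four pixels are written even though only the last one remains visible; with the upper layer arriving first, a single write suffices and the other three pixels are discarded by the pre-test.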
The object of the present invention is to eliminate the prior-art drawback of wasting system resources on accessing invalid pixels. To this end, the present invention provides a method for improving 3D computer graphics performance by pre-sorting. The method separates a plurality of display lists forming a display frame into static regions and reordered regions according to whether or not a transparent rendering process is executed. The display lists in a reordered region are sorted from smallest to largest according to their indicators, each indicator representing the depth values of all display primitives of one display list. The display list with the smaller indicator is placed toward the front of the reordered region, and the display list with the larger indicator is placed toward the rear of the reordered region. The display lists in a static region are not sorted. After the sorting is finished, the flow enters a Z-buffer pre-test process. Because the display lists have been sorted from smallest to largest according to their indicators, the probability of accessing invalid pixels is reduced, the waste of system resources is largely eliminated, and the display of video data is sped up.
The present invention mainly comprises steps (a) to (f). In step (a), the depth values of all display primitives of a display list are read, and an indicator representing said depth values is computed. In step (b), it is checked whether the display list is the last one in the display frame or whether the region after the display list is a static region. In step (c), if the answer of step (b) is no, the indicator is stored in the reordered region, and step (a) is executed again. In step (d), if the answer of step (b) is yes, the plurality of display lists are sorted according to their indicators and then stored. In step (e), it is checked whether the display list is the last one in the display frame. In step (f), if the answer of step (e) is no, step (a) is executed again; otherwise, the method of the present invention is finished.
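Steps (a) to (f) can be sketched as a single pass over the display lists of one frame. The following Python sketch rests on several assumptions that the description above does not fix: the `DisplayList` class and its `reorder` flag (False marking a display list in a static region) are hypothetical, the indicator is taken to be the mean depth of the list's primitives, and display lists in a static region are simply passed through in their original order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DisplayList:
    name: str
    depths: List[float]  # depth values of the list's display primitives
    reorder: bool        # hypothetical flag: False marks a static region


def indicator(dl: DisplayList) -> float:
    # Assumption: the mean depth stands in for the whole display list.
    return sum(dl.depths) / len(dl.depths)


def presort(frame: List[DisplayList]) -> List[DisplayList]:
    out, pending = [], []
    for i, dl in enumerate(frame):
        if not dl.reorder:
            out.append(dl)                       # static lists keep their position
            continue
        pending.append((indicator(dl), dl))      # steps (a) and (c)
        is_last = i == len(frame) - 1            # steps (b) and (e)
        if is_last or not frame[i + 1].reorder:  # step (b)
            pending.sort(key=lambda p: p[0])     # step (d): smallest indicator first
            out.extend(d for _, d in pending)
            pending = []
    return out


frame = [
    DisplayList("A", [0.9, 0.8], True),
    DisplayList("B", [0.2, 0.4], True),
    DisplayList("C", [0.5], True),
    DisplayList("T", [0.6], False),  # static region, e.g. alpha-blended
    DisplayList("D", [0.7], True),
    DisplayList("E", [0.1], True),
]
order = [dl.name for dl in presort(frame)]
print(order)
```

Each reordered region is sorted independently, so display lists never move across the static list "T", and the relative position of the static region within the frame is preserved.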
Another method of the present invention represents the video data in the display frame directly with the plurality of display primitives. Because the depth values of the display primitives of a display list are not always the same, the indicator of the display list can only approximate those depth values, and therefore introduces errors. If the video data in the display frame is represented directly with the plurality of display primitives, no indicator needs to be computed, and these errors are avoided. The disadvantage of representing the display frame directly with the plurality of display primitives is a higher hardware cost.
If a plurality of display primitives is used to form the video data of a display frame, the present invention mainly comprises steps (a) to (f). In step (a), a depth value of a display primitive is read. In step (b), it is checked whether the display primitive is the last one in the display frame or whether the region after the display primitive is a static region. In step (c), if the answer of step (b) is no, the depth value is stored in the reordered region, and step (a) is executed again. In step (d), if the answer of step (b) is yes, the plurality of display primitives are sorted according to their depth values and then stored. In step (e), it is checked whether the display primitive is the last one in the display frame. In step (f), if the answer of step (e) is no, step (a) is executed again; otherwise, the method of the present invention is finished.
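The primitive-level variant can be sketched more compactly, because the depth value itself is the sort key and no indicator is needed. The sketch below is a batch formulation rather than a literal transcription of the step-by-step loop: it groups consecutive primitives into runs and sorts each reordered run. The tuple layout `(name, depth, reorder)` and the function name `presort_primitives` are assumptions made for illustration, as is the choice to pass static primitives through unchanged.

```python
from itertools import groupby


def presort_primitives(primitives):
    """primitives: list of (name, depth, reorder) tuples in submission order."""
    out = []
    # Consecutive primitives sharing the same flag form one region.
    for reorder, run in groupby(primitives, key=lambda p: p[2]):
        run = list(run)
        if reorder:
            run.sort(key=lambda p: p[1])  # smallest depth value first
        out.extend(run)                   # static runs keep their order
    return out


prims = [
    ("p0", 0.9, True),
    ("p1", 0.1, True),
    ("p2", 0.5, True),
    ("t0", 0.3, False),  # static region
    ("p3", 0.4, True),
    ("p4", 0.2, True),
]
names = [p[0] for p in presort_primitives(prims)]
print(names)
```

Within each reordered region the primitives come out exactly in order of increasing depth, which is the property that a per-list indicator can only approximate.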
The present invention can be implemented in either software or hardware, without limitation. Because the present invention has a simple structure and requires few operations, either kind of implementation retains the advantages mentioned above.