1. Field of the Invention
The present invention generally relates to computer systems, and more particularly to the generation of computer graphics employing an improved method of balancing graphics workloads using multiple concurrent rendering processes to refresh the computer display, the method utilizing a novel tiling strategy.
2. Description of Related Art
The basic structure of a conventional computer system 10 is shown in FIG. 1. Computer system 10 has at least one central processing unit (CPU) or processor 12 which is connected to several peripheral devices, including input/output devices 14 (such as a display monitor, keyboard, and graphical pointing device) for the user interface, a permanent memory device 16 (such as a hard disk) for storing the computer's operating system and user programs, and a temporary memory device 18 (such as random access memory or RAM) that is used by processor 12 to carry out program instructions. Processor 12 communicates with the peripheral devices by various means, including a bus 20 or a direct channel 22. Computer system 10 may have many additional components which are not shown, such as serial and parallel ports for connection to, e.g., modems or printers. Those skilled in the art will further appreciate that there are other components that might be used in conjunction with those shown in the block diagram of FIG. 1; for example, a display adapter connected to processor 12 might be used to control a video display monitor, and a memory controller may be used as an interface between temporary memory device 18 and processor 12. Computer system 10 also includes firmware 24 whose primary purpose is to seek out and load an operating system from one of the peripherals (usually permanent memory device 16) whenever the computer is first turned on.
With further reference to FIG. 2, conventional computer systems often employ a graphical user interface (GUI) to present information in a graphical form to the user. In the example of FIG. 2, a generic application program entitled "Document Manager" is presented by the GUI as a primary application window (parent window) 26 on a display device 28 (i.e., video monitor). In this example, the application window has several secondary, enclosed windows (child windows) 30, 32 and 34 which depict the contents of various files that are handled by the program. The file depicted in window 32 is a video animation (moving images). A menu bar 36 with a standard set of commands, a toolbar 38, and a status bar 40 may be provided as part of the GUI, to simplify manipulation and control of the objects (e.g., text, charts and graphics) within the child windows, using a graphical pointer 42 which is controlled by a pointing device (mouse).
As computer systems have become more powerful, applications have been developed which present much more graphic-intensive data, including video animation. The presentation of such multimedia information can often strain the abilities of the computer system, since it may take a significant amount of time for the operating system and hardware to repaint (refresh) the display screen each time a change is to occur in the video output. The present invention is directed to a method of handling graphics workloads which are used by the computer system to repaint the screen.
One strategy for speeding up the generation of computer graphics consists of utilizing multiple concurrent rendering processes to perform animation. A computer program (including an operating system which is responsible for displaying a GUI) can be broken down into a collection of processes which are executed by the processor(s). The smallest unit of operation to be performed within a process is referred to as a thread. The use of threads in modern operating systems is well known. Threads allow multiple execution paths within a single address space (the process context) to run concurrently on a processor. This "multithreading" increases throughput in a multi-processor system, and provides modularity in a uni-processor system.
In this context, a "rendering process" refers to either a thread of an application or a distinct application. Additionally, "concurrent rendering processes" refers to two or more threads of an application, two or more applications on the same system, two or more applications on different systems, or a combination of these, all rendering portions of an image (concurrent rendering processes can be distributed among several processes in a cluster). Thus, by dividing up the task of rendering an image between multiple rendering processes, substantial performance gains may be obtained. The difficulty lies in dividing up the workload to be performed in an efficient manner.
One possible way in which the work may be distributed among the available rendering processes is to subdivide the display into multiple rectangular regions, or tiles, and then assign one or more tiles to a process. While this simple technique does distribute the work to the available processes, it may not be optimal, in the sense that the processes may not each perform an equal share of the work. Consider the simple problem in which a window is divided vertically into right and left sides (only two tiles). If an object (image) is rendered in the center of the window, then the workload will be approximately equal between the two tiles, but if the object is located in the right tile, then the process associated with this tile will do nearly all the work, leaving the process associated with the left tile with very little work to perform.
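The two-tile imbalance described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the rendering cost of a tile is proportional to the pixel area of the object falling inside that tile; the window dimensions, the rectangle representation (x0, y0, x1, y1), and the function names are hypothetical and are not taken from any particular graphics system.

```python
def overlap_area(a, b):
    """Area of the intersection of two rectangles given as (x0, y0, x1, y1)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)


def tile_workloads(tiles, obj):
    """Estimated work per tile, ASSUMING cost is proportional to covered area."""
    return [overlap_area(tile, obj) for tile in tiles]


# Hypothetical 800 x 600 window divided vertically into two tiles.
left_tile = (0, 0, 400, 600)
right_tile = (400, 0, 800, 600)

# A 200 x 200 object in the center of the window, then entirely in the right tile.
centered_object = (300, 200, 500, 400)
right_object = (500, 200, 700, 400)

print(tile_workloads([left_tile, right_tile], centered_object))  # [20000, 20000]
print(tile_workloads([left_tile, right_tile], right_object))     # [0, 40000]
```

Under this assumed cost model, the centered object splits the work evenly, while the same object moved into the right tile leaves the process assigned to the left tile with nothing to render.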
A common approach to solving this workload distribution problem is to simply subdivide the window into a large number of fixed size tiles, and then assign to each rendering process a set of several tiles which are distributed over the entire window. The set of tiles is determined in a static manner (i.e., according to a predetermined function). The expectation with this approach is that, no matter where an object is placed, the rendering workload will be distributed relatively equally among the rendering processes. Certainly, as the number of tiles increases, the workload will become more evenly distributed with this approach. However, while this strategy can be effective, it can also lead to a great deal of complexity in designing a graphics subsystem in which a single process can efficiently render multiple tiles, due to the additional burden of managing multiple tiles. This overhead burden can adversely affect graphics generation, wiping out any performance gains that might otherwise be achieved using tiling.
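The static, fixed-size tiling approach can be sketched as follows. This is only an illustration under stated assumptions: the predetermined function used here, (column + row) modulo the number of processes, is one hypothetical choice among many, and rendering cost is again assumed to be proportional to the object area covered by a tile. The names static_owner and static_workloads are invented for this example.

```python
def overlap_area(a, b):
    """Area of the intersection of two rectangles given as (x0, y0, x1, y1)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)


def static_owner(col, row, num_processes):
    """Hypothetical predetermined function mapping a tile to a process."""
    return (col + row) % num_processes


def static_workloads(width, height, tile_size, num_processes, obj):
    """Return (work per process, tile count) for a fixed-size tiling.

    Work is ASSUMED proportional to the object area inside each tile.
    """
    loads = [0] * num_processes
    tile_count = 0
    for row, y in enumerate(range(0, height, tile_size)):
        for col, x in enumerate(range(0, width, tile_size)):
            tile = (x, y, min(x + tile_size, width), min(y + tile_size, height))
            loads[static_owner(col, row, num_processes)] += overlap_area(tile, obj)
            tile_count += 1
    return loads, tile_count


obj = (500, 200, 700, 400)  # 200 x 200 object in the right half of an 800 x 600 window

print(static_workloads(800, 600, 400, 2, obj))  # ([0, 40000], 4)
print(static_workloads(800, 600, 100, 2, obj))  # ([20000, 20000], 48)
```

In this sketch, shrinking the tiles from 400 to 100 pixels balances the two processes, but the number of tiles to be managed grows from 4 to 48, which reflects the management overhead noted above.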
None of the prior art techniques for handling graphics workloads address the fundamental problem of how to optimally distribute the rendering workload among several processes. It would, therefore, be desirable to devise an improved method of balancing graphics workloads when a graphics animation sequence is rendered by multiple processes, which addresses this problem. It would be further advantageous if the method were easily scalable to any number of processes.