In three-dimensional graphics rendering, a graphics processing unit (GPU) may transform a three-dimensional virtual object into a two-dimensional image that may be displayed on a screen. The GPU may use one or more graphics pipelines for processing information initially provided to the GPU, such as graphics primitives. Graphics primitives are the basic geometric elements used to describe a three-dimensional object that is being rendered. By way of example, graphics primitives may be lines, triangles, or vertices that form a three-dimensional object when combined. Each of the graphics primitives may contain additional information to further define the three-dimensional object such as, but not limited to, X-Y-Z coordinates, red-green-blue (RGB) values, translucency, texture, and reflectivity.
A critical step in a graphics pipeline is the rasterization step. Rasterization is the process by which the graphics primitives describing the three-dimensional object are transformed into a two-dimensional image representation of the scene. The two-dimensional image is composed of individual pixels, each of which may contain unique RGB values. Typically, the GPU will rasterize a three-dimensional image by stepping across the entire three-dimensional object in a raster pattern along a two-dimensional plane. Each step along a raster line represents one pixel. At each step, the GPU must determine whether the pixel should be rendered and delivered to the frame buffer. If the pixel has not changed from a previous rendering, then there is no need to deliver an updated pixel to the frame buffer. Therefore, each raster line may have a variable number of pixels that must be processed. In order to quickly process the three-dimensional object, a plurality of rasterization threads may each be assigned one or more of the raster lines to process, and the rasterization threads may be executed in parallel.
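The per-line behavior described above can be sketched in simplified form. The following Python fragment is purely illustrative and does not reflect any particular implementation; the names `render_pixel` and `frame_buffer` are hypothetical. It shows a single raster line being stepped pixel by pixel, with only changed pixels delivered to the frame buffer, so that the amount of work per line is not fixed.

```python
def rasterize_line(y, width, render_pixel, frame_buffer):
    """Step across one raster line; deliver only changed pixels.

    render_pixel(x, y) is a hypothetical callback that computes the
    pixel's RGB value; frame_buffer holds the previously rendered image.
    Returns the number of pixels actually updated, which may differ
    from line to line.
    """
    updated = 0
    for x in range(width):                 # one step per pixel along the line
        rgb = render_pixel(x, y)
        if frame_buffer[y][x] != rgb:      # unchanged pixels are skipped
            frame_buffer[y][x] = rgb
            updated += 1
    return updated
```

Because `updated` depends on how much of the scene changed along each line, the processing load per raster line is variable, which is the property that complicates load balancing in the discussion that follows.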
When a GPU is being emulated through software, the processing capabilities may not be as efficient or as highly optimized as they would be in the original hardware-based GPU. Therefore, if the processing load on each rasterization thread is not properly balanced, a delay or latency in the execution of the rasterization may develop. Further, it is difficult to predict the number of pixels that will be rendered along each raster line before it is processed. Without knowing a priori the processing load each rasterization thread is assigned, it is difficult to ensure that the load can be evenly balanced.
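The imbalance problem can be illustrated with a small sketch. The fragment below is not the method of the present disclosure; it merely shows why a naive static assignment of raster lines to threads can go wrong when per-line pixel counts are unknown ahead of time. The function name and its inputs are hypothetical.

```python
def static_assignment_load(pixels_per_line, num_threads):
    """Assign raster lines to threads in contiguous blocks and return
    the total pixel workload each thread receives.

    pixels_per_line[i] is the number of pixels that turn out to need
    processing on line i -- a quantity not known before the line is
    actually rasterized.
    """
    loads = [0] * num_threads
    for line, pixels in enumerate(pixels_per_line):
        # Contiguous block assignment: early lines to early threads.
        thread = line * num_threads // len(pixels_per_line)
        loads[thread] += pixels
    return loads
```

If, for example, all of the changed pixels happen to fall in the first half of the image, one thread receives the entire workload while the other sits idle, and the slowest thread determines the overall rasterization latency.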
In order to prevent latencies, the emulation software may dedicate an increased number of available rasterization threads to the rasterization process. This increases the demand on the processor running the emulation software. Also, in the case of cloud-based services, the number of instances of the emulation software that will be running at a given time is not known beforehand. If the emulation software requires extensive processing power, then scaling the system for increased users becomes prohibitively expensive. By way of example, during peak usage hours, there may be many instances of the emulator being executed on the network. This requires that resources such as processing power be used as efficiently as possible.
Further, processing efficiency cannot be gained simply by decreasing the frame rate that the emulator is capable of producing. The frame rate should ideally remain above 24 frames per second in order to ensure smooth animation. In order to provide a scalable software emulator of a GPU that is implemented over a cloud-based network, a rasterization method that allows for efficient load balancing is needed.
It is within this context that aspects of the present disclosure arise.