GPUs have long been used in stationary computers and are today also becoming an important technical feature of handheld devices such as mobile telephones. While originally intended for the acceleration of 3D graphics, GPUs are nowadays employed for a plethora of additional processing-intensive graphics tasks such as 2D graphics rendering, composition of multiple graphics layers into a single image, image and video processing as well as user interface acceleration.
The inherent architectural parallelism makes GPUs particularly well suited for graphics tasks. In the field of general purpose computation, too, performing a task on a GPU rather than on a Central Processing Unit (CPU) in many cases yields speed and power benefits. Especially in heterogeneous embedded devices the CPU is often a critical resource, whereas the GPU is typically under-utilized. General Purpose GPUs (GPGPUs) are thus becoming increasingly widespread, and a corresponding standard (OpenCL) has recently been defined by the Khronos Group.
WO 2009/111045 A1 describes a typical environmental architecture for a GPU. Graphics commands generated by client applications are asynchronously written to command buffers. A window server is configured to detect the generation of graphics commands by the client applications. The window server analyzes an individual image to determine if compositing processing is to be initiated for this image. During compositing processing the image is combined with one or more other graphics or video layers of other client applications, and corresponding compositing graphics commands are then stored in the command buffers.
A GPU driver reads sets of graphics commands from the command buffers in the order in which they were written by the client applications and the window server. The GPU driver has a batch generator module which prepares a batch of graphics commands from the graphics commands retrieved from the command buffers. Once prepared, the batch of graphics commands (corresponding to one frame of image data) is sent in a single transaction via a hardware command queue to the GPU.
The transmission of graphics command batches from the batch generator module to the GPU is controlled by a notification handler module of the GPU driver. The notification handler module receives notification messages from the GPU which indicate that the GPU is ready to receive additional commands. The notification messages are based on interrupts sent from the GPU to a CPU hosting the GPU driver.
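The interplay of command buffers, batch generator and notification handler described above can be illustrated by the following minimal sketch. It is not taken from WO 2009/111045 A1; the class name `GpuDriverSketch`, its methods and the simulated hardware queue are hypothetical, and for simplicity all pending commands are treated as one batch rather than being grouped per frame.

```python
from collections import deque


class GpuDriverSketch:
    """Hypothetical model of a driver with a batch generator and a notification handler."""

    def __init__(self):
        self.command_buffer = deque()  # commands in the order they were written
        self.hardware_queue = []       # batches handed to the (simulated) GPU
        self.gpu_ready = True          # assumption: the GPU starts out idle

    def write(self, client, command):
        # Client applications and the window server append commands asynchronously.
        self.command_buffer.append((client, command))

    def submit_batch(self):
        # Batch generator: send all pending commands in a single transaction,
        # but only if the GPU has signalled that it can accept more work.
        if not self.gpu_ready or not self.command_buffer:
            return False
        batch = list(self.command_buffer)
        self.command_buffer.clear()
        self.hardware_queue.append(batch)
        self.gpu_ready = False
        return True

    def on_gpu_ready(self):
        # Notification handler: stands in for the interrupt-based notification message.
        self.gpu_ready = True
        return self.submit_batch()


driver = GpuDriverSketch()
driver.write("app_a", "draw")
driver.write("window_server", "composite")
first = driver.submit_batch()    # GPU idle, batch is sent
driver.write("app_b", "draw")
second = driver.submit_batch()   # GPU busy, batch is held back
third = driver.on_gpu_ready()    # notification releases the held batch
```

In this sketch the second submission is deferred until the simulated notification arrives, which mirrors the role of the notification handler module in gating transmissions to the GPU.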
One drawback of conventional GPU architectures such as the one described in WO 2009/111045 A1 is the fact that they do not prevent an individual application from monopolizing or even blocking the GPU. For this reason, GPU command schedulers have been proposed.
Mikhail Bautin, Ashok Dwarakinath and Tzi-Cker Chiueh: “Graphics Engine Resource Management”, Proceedings of 15th Multimedia Computing and Networking Conference, 2008, SPIE 28 Jan. 2008, proposes a GPU command scheduler that controls a GPU command production rate of an application through its CPU scheduling priority. Specifically, GPU commands are scheduled in such a way that GPU scheduling matches resource allocation decisions of a CPU scheduler. As a result, an equal share of GPU time can be allocated to each application regardless of the application-specific demand.
A software implementation of the GPU command scheduler suggested by M. Bautin et al. comprises a dedicated command queue for each application requesting GPU resources. GPU command groups are scheduled from these “per-application” command queues using a weighted round robin scheduling policy.
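A weighted round robin policy over per-application command queues may be sketched as follows. This is an illustrative assumption and not the implementation of M. Bautin et al.; the function name `weighted_round_robin`, the application names and the integer weights are hypothetical.

```python
from collections import deque


def weighted_round_robin(queues, weights):
    """Yield (app, command_group) pairs in weighted round robin order.

    queues  -- dict mapping an application name to a deque of command groups
    weights -- dict mapping an application name to a positive integer weight
    """
    while any(queues.values()):
        for app, queue in queues.items():
            # In each round an application may dispatch up to `weight` groups.
            for _ in range(weights[app]):
                if not queue:
                    break
                yield app, queue.popleft()


# Hypothetical workload: "video" receives twice the share of "ui".
queues = {
    "video": deque(["v0", "v1", "v2", "v3"]),
    "ui": deque(["u0", "u1"]),
}
weights = {"video": 2, "ui": 1}
order = list(weighted_round_robin(queues, weights))
```

With these weights, two command groups of the "video" queue are dispatched for every one of the "ui" queue in each round, so neither application can monopolize the GPU.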
It has been found that conventional GPU command scheduling approaches still suffer from certain disadvantages. For example, the scheduling is typically application-centred, which means that the specific needs and possibilities of individual GPUs are not taken into account during the command scheduling procedure.
US 2008/303833 A1 discloses a method and an apparatus for notifying a sharing display driver to update a display with a graphics frame including multiple graphics data rendered separately by multiple graphics processing units (GPUs).
Chia-Ming Chang et al.: “Energy-saving techniques for low-power graphics processing unit”, International SoC Design Conference, 2008, IEEE, Piscataway, N.J., USA, discloses a GPU with energy-saving techniques at the algorithm, architecture and circuit levels, which make it possible to achieve high performance with low power consumption.
US 2005/125701 A1 discloses a method and system for providing energy management within a processing system, which can reduce energy consumption by managing processes through intelligent scheduling of processes and in conformity with a measured level of energy use by each process.