The present invention relates generally to computer graphics subsystems, and, more particularly, to the synchronization of various parallel engines inside a graphics processing unit.
A graphics processing unit, or GPU, is a dedicated graphics processing device in a computer system or game console. It is common practice for a GPU to contain several parallel processing structures, or engines, each carrying out a dedicated function in order to improve the GPU's performance. For instance, a 3D engine provides only real-time 3D rendering. Other engines include a 2D engine and a master-image-transfer (MIT) engine.
Even though these engines can run independently, in traditional computer systems they often lack adequate mechanisms for synchronizing with one another, i.e., after an engine finishes a task, it has no mechanism to provide a notification of such an event. To facilitate a switch from one engine to another, the central processing unit, or CPU, has to insert a wait-engine-idle command, which blocks commands for the other engines and hence prevents the engines from running fully in parallel.
Such issues become a performance bottleneck, especially in multi-GPU and multi-buffer applications. For instance, assume that there is a master GPU and one or more slave GPUs associated therewith, and that in a slave GPU, after the 3D engine finishes rendering a frame, the MIT engine of that GPU begins to bit-block-transfer (BLT) the frame to a master GPU buffer. Ideally, the 3D engine should be able to render a next frame as soon as the current rendering finishes, but without a proper synchronization mechanism, the 3D engine has to wait for the MIT engine to complete its BLT before proceeding to render the next frame. Here, the term “master GPU” refers to a GPU having a direct connection to a display driver. The term “slave GPU” refers to a GPU that has no direct connection with the display driver and has to transfer its rendered image to the master GPU for display.
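By way of illustration only, the cost of the serialization described above may be sketched with a simple timing model. The function names, the time units, and the assumption that enough frame buffers exist for the 3D engine never to stall on a buffer are all hypothetical and are not taken from any particular GPU or driver.

```python
def serialized_time(frames, render, blt):
    """Total time when the 3D engine must wait for each BLT to finish
    (the wait-engine-idle behavior) before rendering the next frame."""
    return frames * (render + blt)


def overlapped_time(frames, render, blt):
    """Total time when the 3D engine may start the next frame as soon as
    it finishes the current one, while the MIT engine transfers in parallel.
    Assumes (hypothetically) that a free buffer is always available."""
    render_done = 0  # time at which the 3D engine finishes its latest frame
    blt_done = 0     # time at which the MIT engine finishes its latest BLT
    for _ in range(frames):
        render_done += render
        # The MIT engine starts once the frame is rendered and its
        # previous transfer has completed, whichever is later.
        blt_done = max(blt_done, render_done) + blt
    return blt_done


if __name__ == "__main__":
    # Example figures in arbitrary time units, chosen only for illustration.
    print(serialized_time(4, 10, 4))
    print(overlapped_time(4, 10, 4))
```

In this model, the overlapped schedule is limited by the slower of the two engines rather than by the sum of both, which is the gain the engines forgo when they cannot signal task completion to one another.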
It is therefore desirable for a computer system to have synchronization means that allow the various engines inside a GPU to run in parallel, thereby improving efficiency.