1. Field of the Invention
The present invention relates generally to the field of external memory access and, more specifically, to a system and method for accessing a frame buffer via a storage driver.
2. Description of the Related Art
A graphics processing unit (GPU) within a computing device is configured to efficiently process complex graphics and numerical computations. To take advantage of the processing capabilities of the GPU, an application, such as a CUDA application, executing on a central processing unit (CPU) within the computing device typically off-loads computationally expensive tasks to the GPU. In turn, the GPU processes the computationally expensive tasks and returns the results to the application.
Traditionally, the entire data set reflecting the processed results is returned to the application, even when the application may need to access only a subset of the data set. For example, suppose that the entire data set is 500 kilobytes (KB), but the application needs to access only a 4 KB portion of the data set. In operation, the entire 500 KB data set would be transmitted by the GPU to the application so that the 4 KB portion can be accessed. Further, if the application were to modify the 4 KB portion of the data set, then the application would have to re-transmit the entire data set, including the modified 4 KB portion, to the GPU so that the GPU can perform additional processing operations on the data set or a portion of the data set. As the foregoing illustrates, in many situations, the entire data set being processed or accessed by the application is transmitted back and forth several times between the application and the GPU during operation.
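By way of illustration only, the transfer cost in the foregoing example can be tallied with a simplified model. The sketch below is not part of any GPU interface. The function names are hypothetical, 1 KB is assumed to equal 1,024 bytes, and a single read-modify-write round trip is assumed. The second figure describes a hypothetical transfer that moves only the 4 KB portion, which the traditional approach described above does not provide.

```python
KB = 1024  # assumption: 1 KB = 1,024 bytes


def full_round_trip_bytes(data_set_bytes: int) -> int:
    """Bytes moved when the whole data set goes GPU -> application -> GPU."""
    return 2 * data_set_bytes


def partial_round_trip_bytes(portion_bytes: int) -> int:
    """Bytes moved if only the needed portion made the same round trip (hypothetical)."""
    return 2 * portion_bytes


data_set = 500 * KB  # size of the entire processed data set
portion = 4 * KB     # size of the subset the application actually needs

full = full_round_trip_bytes(data_set)
partial = partial_round_trip_bytes(portion)
overhead_ratio = full // partial  # how many times more data the full transfer moves

print(f"full round trip:    {full} bytes")
print(f"partial round trip: {partial} bytes")
print(f"overhead ratio:     {overhead_ratio}x")
```

Under these assumptions, one round trip of the entire data set moves 1,024,000 bytes, whereas a round trip of only the 4 KB portion would move 8,192 bytes, a factor of 125.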
One clear drawback to this architecture is that data transfer bandwidth between the CPU and the GPU is wasted because the entire data set is transmitted back and forth between the two processors even when only a subset of the data set is needed. Further, architectures that require the transmission and retrieval of large quantities of data can substantially reduce the overall performance of the computer system.
As the foregoing illustrates, what is needed in the art is an efficient mechanism for providing more granular control when transmitting data between an application and a GPU.