1. Field of the Invention
The present invention relates to remote parallel processing, and particularly to a remote Graphics Processing Unit (GPU) programming and execution method that sends user-specified GPU kernel functions and input datasets over a Web service to a remote computer equipped with a programmable GPU for execution.
2. Description of the Related Art
Advances in GPU hardware over the last several years have made it possible to use GPUs for what has become known as general purpose GPU (GPGPU) computing. GPGPU computing is the use of graphics hardware for general computation in software for financial analysis, physics simulation, alternative graphics rendering, bioinformatics, and other applications that can benefit from a large-scale parallel computing platform.
In the early days of GPGPU computing, there were no advanced frameworks that could be applied to general computational problems. The only APIs available to programmers were those used for graphics programming. Due to this limitation, programmers would recast their computational problems so that they resembled graphics problems. Once the problem had been recast in this form, the programmer could then use a GPU to reduce the runtime of the application. This technique added complexity and development time for the programmer.
The next logical evolution in GPGPU computing is a framework that allows the programmer to directly represent a problem as one that is solvable using parallel hardware. This level of abstraction is shown in such frameworks as OpenCL, CUDA™, and C++ AMP.
These frameworks allow the programmer to avoid the additional complexity of representing problems as graphics problems and, instead, to represent them in their native form. An added benefit is the generalized representation of the graphics hardware that the framework presents to the programmer.
While GPGPU frameworks have developed to a point that makes GPGPUs easier for the programmer to use, the existing frameworks still have some limitations. For example, each framework is limited to a few programming languages. Additionally, each framework requires that the hardware it will run on be directly connected to a GPGPU over a high-speed bus. This second limitation is particularly constricting, as the following two scenarios show.
The first scenario that shows limitations is one in which the programmer has executed the application on a machine with a GPGPU and now wishes to execute it remotely from another machine. That is, the CPU portion of the program will be executed on the local machine, but the GPU portion of the program should be executed on a remote machine that contains a GPGPU. This scenario might occur when, within a small group of machines, one machine contains a GPGPU and many more do not. Provided that the GPGPU on the single machine is not being fully utilized, it may be beneficial to allow the other machines access to the single GPGPU to increase its utilization.
The second scenario is one in which a programmer has an operation that could be parallelized on a GPGPU device, but where the runtime improvement does not justify the purchase of a GPGPU. In this event it would be beneficial for the programmer to be able to dispatch this parallelizable operation to a remote service or device that contains a GPGPU that can perform the operation.
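By way of illustration only, dispatching a parallelizable operation to a remote service might involve bundling the kernel source and its input dataset into a single request body, as in the following minimal sketch. The JSON field names, the `vec_add` kernel, and the helper functions `pack_floats`, `unpack_floats`, and `build_request` are hypothetical and are not drawn from any existing framework or service; no network transport is shown.

```python
import base64
import json
import struct

# Hypothetical CUDA-style kernel source, used only as an example payload.
KERNEL_SOURCE = """
extern "C" __global__ void vec_add(const float *a, const float *b,
                                   float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}
"""


def pack_floats(values):
    """Encode a list of floats as base64 text of little-endian 32-bit floats."""
    raw = struct.pack("<%df" % len(values), *values)
    return base64.b64encode(raw).decode("ascii")


def unpack_floats(text):
    """Decode base64 text produced by pack_floats back into a list of floats."""
    raw = base64.b64decode(text)
    return list(struct.unpack("<%df" % (len(raw) // 4), raw))


def build_request(kernel_name, kernel_source, inputs):
    """Bundle a kernel and its named input arrays into a JSON request body.

    The schema here is an assumption made for illustration.
    """
    element_count = len(next(iter(inputs.values())))
    return json.dumps({
        "kernel_name": kernel_name,
        "kernel_source": kernel_source,
        "element_count": element_count,
        "inputs": {name: pack_floats(vals) for name, vals in inputs.items()},
    })


# Build a request for adding two three-element vectors.
request_body = build_request(
    "vec_add",
    KERNEL_SOURCE,
    {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]},
)
```

In such a sketch, the local machine needs no GPGPU hardware or GPGPU framework of its own; it only needs to be able to serialize text and binary data and send the result to the remote machine.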
NVIDIA, a producer of GPUs, offers architectures and frameworks that allow programmers to use its GPUs for general computation. Exemplary NVIDIA GPGPU configurations include single GPUs connected to programming hardware and, additionally, clusters of multiple host machines, each containing one or more GPGPU cards. These machines are visible to each other over a local network and provide a method for programmers to expand from one GPGPU to multiple GPGPUs within a data center setting. There remains, however, the problem of remotely accessing one or more GPGPUs from outside a data center setting.
Thus, a remote GPU programming and execution method solving the aforementioned problems is desired.