Massive computing capability has traditionally been provided by highly specialized and very expensive supercomputers. As technology has advanced, however, inexpensive desktop and server hardware has steadily supplanted expensive high-end systems. More recently, inexpensive hardware has been gathered together to form compute clusters. The individual computers in a compute cluster are typically less expensive and less reliable than their supercomputer and mainframe forebears, but the cluster overcomes the limitations of its individual computers through sheer numbers.
The drawback of compute clusters is that they are difficult to maintain and to program. In order to harness the power of a compute cluster, a program must be split into a great number of pieces, and the many partial results must later be reconciled and reassembled. Furthermore, the program itself must be fault tolerant because, among so many inexpensive computers, there is a risk that individual machines will fail.
Desktop and gaming computers often conserve central processing unit (CPU) resources by employing a graphics subsystem dedicated to driving one or more computer displays. A graphics processing unit (GPU) is at the heart of the graphics subsystem. The CPU is a general-purpose processor designed to efficiently run a great variety of algorithms. Graphics processing, however, consists of a limited and well-known set of algorithms. GPUs are specialized processors that are very good at graphics processing but not necessarily good at other tasks.
Another recent development is the identification of algorithms, other than graphics algorithms, that are well suited for GPUs. These algorithms currently require expert programming in order to put them into a form that a GPU can run. Further optimization is required for a GPU to run the algorithm well. The effort is often worthwhile because the speedup can be orders of magnitude. Unfortunately, properly configured computing systems having the software tools required for developing algorithms to run on GPUs are rare. As a result, expertise in the required programming techniques is rare and difficult to develop.
Systems and methods are therefore needed for providing GPU-powered compute clusters and for deploying non-graphics applications to run efficiently on those GPU-powered compute clusters.