Conventional computer architectures include central processing units (CPUs), which are optimized to process single tasks serially and quickly, and accelerated parallel processors, such as graphics processing units (GPUs), which have a large number of cores optimized to process many tasks in parallel.
Message passing is utilized as a form of parallel communication between different sequences of programmed instructions (e.g., threads) of a program or application. In some applications, such as graph and network applications, data structures for these communications may be represented by messages.
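As a minimal illustration of message passing between threads, the following Python sketch has one thread send values through a shared queue while a second thread receives them. The function names (`producer`, `consumer`, `run_exchange`), the squaring step, and the use of `None` as an end-of-stream marker are hypothetical choices made for this example only; they are not taken from any particular programming model discussed here.

```python
import queue
import threading


def producer(outbox, values):
    """Send each value as a message, then a sentinel marking the end."""
    for value in values:
        outbox.put(value)
    outbox.put(None)  # hypothetical end-of-stream marker


def consumer(inbox, results):
    """Receive messages until the sentinel arrives; store each value squared."""
    while True:
        message = inbox.get()  # blocks until a message is available
        if message is None:
            break
        results.append(message * message)


def run_exchange(values):
    """Run one sender thread and one receiver thread over a shared queue."""
    channel = queue.Queue()  # thread-safe FIFO used as the message channel
    results = []
    sender = threading.Thread(target=producer, args=(channel, values))
    receiver = threading.Thread(target=consumer, args=(channel, results))
    sender.start()
    receiver.start()
    sender.join()
    receiver.join()
    return results
```

In this sketch the two threads share no state other than the queue, so all coordination between them happens through the messages themselves. Calling `run_exchange([1, 2, 3])` would pass three messages from the sender to the receiver.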
Some computer architectures include CPU clusters (multiple CPUs) and programming models that pass messages between the CPUs. Conventional GPU programming models include Open Computing Language (OpenCL) and Compute Unified Device Architecture (CUDA). For multi-node applications using GPUs, some conventional models use hybrid techniques, such as combining a Message Passing Interface (MPI) with an accelerator model. For example, hybrid models include combining MPI with CUDA and combining MPI with OpenCL (e.g., where each machine node includes a CPU and an accelerator, such as a GPU or an accelerated processing unit (APU)).