cudaMPI and glMPI: Message Passing for GPGPU Clusters

[Figure: software layer diagram — the application sits on cudaMPI, which sits on MPI and CUDA.]

The open-source library is available for download.

People are now using graphics processors (GPUs) for general-purpose computing (GPGPU), and getting quite good performance. The natural next step is a GPGPU cluster: a bunch of GPUs connected by a network, which means the GPUs need a way to talk to one another.

That calls for a message-passing interface like MPI, but one that works directly with GPU data. These libraries provide exactly that.

For applications written using NVIDIA's CUDA, cudaMPI lets you call cudaMPI_Send and cudaMPI_Recv with a pointer to device memory. For applications written in OpenGL, glMPI lets you call glMPI_Send with a rectangle of framebuffer pixels and glMPI_Recv with a rectangle of texture pixels. A usage sketch follows below.
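For example, here is a minimal two-rank exchange of a device buffer with cudaMPI. This is only a sketch: the cudaMPI_Send/cudaMPI_Recv signatures are assumed to mirror MPI_Send/MPI_Recv (but taking device pointers), and the header name cudaMPI.h is an assumption — check the downloaded source for the real declarations.

    /* Sketch: rank 0 sends a GPU buffer to rank 1 via cudaMPI.
       Signatures assumed to mirror MPI_Send/MPI_Recv with device pointers. */
    #include <mpi.h>
    #include <cuda_runtime.h>
    #include "cudaMPI.h"   /* assumed header name */

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int n = 1024*1024;          /* one million floats */
        float *devBuf = 0;
        cudaMalloc((void **)&devBuf, n*sizeof(float));

        if (rank == 0) {
            /* ... fill devBuf with a kernel ... */
            cudaMPI_Send(devBuf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Status status;
            cudaMPI_Recv(devBuf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, &status);
            /* ... use devBuf on the GPU ... */
        }

        cudaFree(devBuf);
        MPI_Finalize();
        return 0;
    }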

Both libraries are simple, single-file implementations. cudaMPI provides a few more features, including nonblocking communication (sketched below). Both are designed for good performance, and deliver about a gigabyte per second of transfer bandwidth, assuming your network can keep up.
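For instance, nonblocking calls can be used to overlap communication with GPU work. The cudaMPI_Isend/cudaMPI_Irecv/cudaMPI_Wait names and signatures below are assumptions modeled on their MPI counterparts; consult the actual header for what cudaMPI really provides.

    /* Sketch: overlap a GPU-buffer exchange with unrelated kernel work.
       The nonblocking call names here are assumptions, not confirmed API. */
    #include <mpi.h>
    #include <cuda_runtime.h>
    #include "cudaMPI.h"   /* assumed header name */

    void exchange(float *devSend, float *devRecv, int n, int partner) {
        MPI_Request sendReq, recvReq;
        MPI_Status status;

        /* start both transfers on device buffers */
        cudaMPI_Irecv(devRecv, n, MPI_FLOAT, partner, 0, MPI_COMM_WORLD, &recvReq);
        cudaMPI_Isend(devSend, n, MPI_FLOAT, partner, 0, MPI_COMM_WORLD, &sendReq);

        /* ... launch kernels that do not touch devSend/devRecv ... */

        /* block until both transfers finish */
        cudaMPI_Wait(&recvReq, &status);
        cudaMPI_Wait(&sendReq, &status);
    }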

Please email Dr. Lawlor with bugs, fixes, or suggestions about glMPI or cudaMPI!


Dr. Lawlor
CS, UAF