GLSL and GPU Programming

CS 441 Lecture, Dr. Lawlor

This code renders the Mandelbrot Set on the CPU:
#include <iostream>
#include <fstream>

const int wid=320, ht=256;
char pixels[wid*ht]; // one grey byte per pixel

int render_mbrot(void)
{
	for (int y=0;y<ht;y++)
	for (int x=0;x<wid;x++)
	{
		// Map pixel (x,y) to a point c in the complex plane
		float cx=x*(3.0/(float)wid)-1.5;
		float cy=y*(2.0/(float)ht)-1.0;
		float zr=cx, zi=cy; // complex number z, starting at c
		int count=0;
		for (count=0;count<255;count++) {
			// z = z^2 + c = (zr+i*zi)^2 + c
			float nzr=zr*zr-zi*zi + cx;
			float nzi=2.0*zr*zi + cy;
			if (nzr*nzr+nzi*nzi > 4.0) break; // diverged--bail out
			zr=nzr; zi=nzi;
		}
		pixels[y*wid+x]=((unsigned char)(zr*255.0/4.0))&0xff;
		// (count&0xff); //<- classic iteration count render
	}
	return 0;
}

int foo(void) {
	double t=time_function(render_mbrot); // time_function is NetRun's built-in timing helper (seconds per call)
	double nspix=(t*1.0e9)/(wid*ht);
	std::cout<<"render time: "<<nspix<<" ns/pixel\n";

	// Write a greyscale PPM output file (so we can see what we've rendered)
	std::ofstream os("out.ppm",std::ios_base::out|std::ios_base::binary);
	os<<"P5\n";
	os<<wid<<" "<<ht<<" 255\n";
	os.write(pixels,wid*ht);
	os.close();
	return 0;
}

(Try this in NetRun now!)

On one core of my Intel Q6600 quad-core machine, this takes 700 ns/pixel.  Using SSE or multiple cores, I could probably get a 3-10x speedup from this.
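For example, here's a minimal multicore sketch (my addition, not part of the timing above), assuming a compiler with OpenMP support such as g++ -fopenmp.  Each row of the image is independent, so the outer loop can be split across cores with a single pragma; the function name render_mbrot_omp is hypothetical, and the globals wid, ht, and pixels are reused from the listing above:

// Hypothetical multicore variant of render_mbrot: rows are independent,
// so OpenMP can hand them out to the available cores.
// (If OpenMP isn't enabled, the pragma is simply ignored and the code runs serially.)
int render_mbrot_omp(void)
{
	#pragma omp parallel for schedule(dynamic)
	for (int y=0;y<ht;y++)
	for (int x=0;x<wid;x++)
	{
		float cx=x*(3.0/(float)wid)-1.5;
		float cy=y*(2.0/(float)ht)-1.0;
		float zr=cx, zi=cy;
		int count=0;
		for (count=0;count<255;count++) {
			float nzr=zr*zr-zi*zi + cx;
			float nzi=2.0*zr*zi + cy;
			if (nzr*nzr+nzi*nzi > 4.0) break;
			zr=nzr; zi=nzi;
		}
		pixels[y*wid+x]=((unsigned char)(zr*255.0/4.0))&0xff;
	}
	return 0;
}

schedule(dynamic) is a judgment call: rows near the set take far more iterations than rows that escape quickly, so fixed-size row chunks would leave some cores idle.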

But compare this with the corresponding GLSL fragment shader code, which the GPU runs once per pixel:
float cx=texcoords.x*(3.0)-1.5;
float cy=texcoords.y*(2.0)-1.0;
float zr=cx, zi=cy; // complex number z, starting at c
int count=0;
for (count=0;count<255;count++) {
	// z = z^2 + c = (zr+i*zi)^2 + c
	float nzr=zr*zr-zi*zi + cx;
	float nzi=2.0*zr*zi + cy;
	if (nzr*nzr+nzi*nzi > 4.0) break; // diverged--bail out
	zr=nzr; zi=nzi;
}
gl_FragColor = fract(vec4(zr,zi,count,1.0)); // dump z and the iteration count into the pixel's color channels

(Try this in NetRun now!)

This runs at 3.2 ns/pixel on my NVIDIA GeForce GTX 280, which has 240 cores.  That's a speedup of about 218x over the single CPU core!
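NetRun wraps that shader in all the OpenGL boilerplate for you, but it helps to see roughly what the host side is doing.  Below is a minimal sketch of my own (assuming GLEW and GLUT are installed; this is not NetRun's actual wrapper): compile the fragment shader, link it into a program object, and draw one quad covering the window.  The GPU then executes the fragment shader once for every covered pixel, spread across all its cores, which is where the parallelism comes from.

#include <GL/glew.h>  // assumed available: GLEW loads the shader entry points
#include <GL/glut.h>  // assumed available: GLUT creates the window and GL context
#include <cstdio>
#include <cstdlib>

// Vertex shader: just pass the quad's texture coordinates through.
static const char *vtx_src =
"varying vec2 texcoords;\n"
"void main(void) {\n"
"	texcoords = gl_MultiTexCoord0.xy;\n"
"	gl_Position = gl_Vertex;\n"
"}\n";

// Fragment shader: the Mandelbrot loop above, run once per pixel.
static const char *frag_src =
"varying vec2 texcoords;\n"
"void main(void) {\n"
"	float cx=texcoords.x*(3.0)-1.5;\n"
"	float cy=texcoords.y*(2.0)-1.0;\n"
"	float zr=cx, zi=cy;\n"
"	int count=0;\n"
"	for (count=0;count<255;count++) {\n"
"		float nzr=zr*zr-zi*zi + cx;\n"
"		float nzi=2.0*zr*zi + cy;\n"
"		if (nzr*nzr+nzi*nzi > 4.0) break;\n"
"		zr=nzr; zi=nzi;\n"
"	}\n"
"	gl_FragColor = fract(vec4(zr,zi,float(count),1.0));\n"
"}\n";

static GLuint make_shader(GLenum type,const char *src) {
	GLuint s=glCreateShader(type);
	glShaderSource(s,1,&src,0);
	glCompileShader(s);
	GLint ok=0; glGetShaderiv(s,GL_COMPILE_STATUS,&ok);
	if (!ok) {
		char log[4096]; glGetShaderInfoLog(s,sizeof(log),0,log);
		std::printf("shader error: %s\n",log); std::exit(1);
	}
	return s;
}

static void display(void) {
	glClear(GL_COLOR_BUFFER_BIT);
	glBegin(GL_QUADS); // one quad covering the whole window, in clip coordinates
	glTexCoord2f(0,0); glVertex2f(-1,-1);
	glTexCoord2f(1,0); glVertex2f(+1,-1);
	glTexCoord2f(1,1); glVertex2f(+1,+1);
	glTexCoord2f(0,1); glVertex2f(-1,+1);
	glEnd();
	glutSwapBuffers();
}

int main(int argc,char *argv[]) {
	glutInit(&argc,argv);
	glutInitDisplayMode(GLUT_DOUBLE|GLUT_RGBA);
	glutInitWindowSize(320,256);
	glutCreateWindow("GLSL Mandelbrot");
	glewInit(); // must come after the GL context exists
	GLuint prog=glCreateProgram();
	glAttachShader(prog,make_shader(GL_VERTEX_SHADER,vtx_src));
	glAttachShader(prog,make_shader(GL_FRAGMENT_SHADER,frag_src));
	glLinkProgram(prog);
	glUseProgram(prog);
	glutDisplayFunc(display);
	glutMainLoop();
	return 0;
}

Build with something like "g++ mbrot_gl.cpp -lGLEW -lglut -lGL" (exact library flags vary by platform).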