Simulations on the GPU
CS 482 Lecture,
Dr. Lawlor
Modern OpenGL talks to the graphics card using a variety of
specialized software objects:
- A texture
stores an image, as a regular grid of pixels. Mostly these
are 2D, although 3D textures also exist. Textures are
stored on the graphics card, in graphics card memory, so they're
incredibly fast to read and write. Texture pixels can
be read from a file, uploaded from JavaScript (using gl.texImage2D),
or computed on the graphics card directly. To run on the
graphics card, most simulations will need to store their
simulation data in textures, probably using the four RGBA fields
stored in gl.FLOAT format. Caution: if you're used to the
CPU, textures have a number of serious limitations; for example,
the size of a texture is limited to
gl.getParameter(gl.MAX_TEXTURE_SIZE), which can be as low as
8192 pixels per side (8192 x 8192 pixels is 67 million pixels,
268 million RGBA floats, or 1 gigabyte).
- A framebuffer
object lets you render into a texture, the same way you'd
render pixels onto the screen. If texture pixels are
computed at runtime, this is how you do it.
- A GLSL
shader (go read it!) consists of code that looks a whole
lot like C++, but it gets compiled at runtime so it can run on
your graphics card.
- Vertex shaders compute the onscreen position (gl_Position)
of the geometry vertex you're drawing. They can also
compute "varying" values that are interpolated to the pixels,
for use in the fragment shader.
- Fragment shaders compute the onscreen color (gl_FragColor)
of the pixel you're drawing. Today, all the fun stuff
usually happens in the fragment shader.
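To make the texture-plus-framebuffer setup above concrete, here's a minimal sketch of allocating a float texture and aiming a framebuffer object at it. It assumes a WebGL 1 context named "gl"; variable names like simTex are mine, and on some machines actually rendering into a float texture also needs the WEBGL_color_buffer_float extension.

```javascript
gl.getExtension("OES_texture_float"); // needed for gl.FLOAT texture pixels

var size = 256; // simulation grid resolution (well under MAX_TEXTURE_SIZE)
var simTex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, simTex);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
// null data: allocate the pixels on the card without uploading anything
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, size, size, 0,
              gl.RGBA, gl.FLOAT, null);

// A framebuffer object lets us render into that texture:
var fbo = gl.createFramebuffer();
gl.bindFramebuffer(gl.FRAMEBUFFER, fbo);
gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                        gl.TEXTURE_2D, simTex, 0);
// ...now ordinary draw calls write their pixels into simTex, not the screen.
```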
The big limitation with a fragment shader is that it can only
write to the pixel you're currently rendering. People deal with
this in several ways:
- Cram everything you need to write into your single RGBA color
pixel. Any simulation handling up to four floats is
trivial. I've seen some really crazy packing of multiple
values into a single color; this is easiest with low-bit values
in integer pixel colors, but it can be done with floats as long
as you're careful about the exponent.
- Take multiple passes to compute what you need in several
stages. For example, you can't write FEM strain and node
forces in a single pass, so you do one pass to compute strains,
then do a second pass to compute forces from the strains.
- Use the WEBGL_draw_buffers
extension to attach several different textures to the
framebuffer, and write to gl_FragData[i]--it's still just one
pixel, but it can now have more than 4 floats. Some
machines require
all the textures to have the same pixel format, but this
is usually just 4 floats per pixel anyway.
MAX_DRAW_BUFFERS_WEBGL gives the number of textures supported,
which can be as low as four textures per pass. Plus the
extension is not supported everywhere yet.
- Use the Image
load/store extension to read and write arbitrary pixels
(support for the extension is quite rare for now).
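The packing trick in the first bullet above is easiest to see off the GPU. Here's a sketch in plain JavaScript (the helper names are mine) of spreading one [0,1) float's base-256 digits across the four 8-bit channels of an integer RGBA pixel--the same idea a shader would implement with floor() and fract():

```javascript
// Pack one float in [0,1) into four 8-bit channels, one base-256 digit each.
function packFloatToRGBA8(f) {
  var bytes = [];
  for (var i = 0; i < 4; i++) {
    f *= 256.0;
    var b = Math.floor(f); // next base-256 digit
    bytes.push(b);
    f -= b;                // keep only the remaining fraction
  }
  return bytes; // [r,g,b,a], each 0..255
}

// Reverse the packing: sum the digits back with decreasing place value.
function unpackRGBA8ToFloat(bytes) {
  var f = 0.0, scale = 1.0 / 256.0;
  for (var i = 0; i < 4; i++) {
    f += bytes[i] * scale;
    scale /= 256.0;
  }
  return f; // accurate to about 2^-32
}
```

On the GPU, the decode step is a single dot product of the pixel with a vec4 of the place values 1/256, 1/65536, and so on.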
Programmable Shaders are Very Simple in Practice
GLSL is one of the
standard languages today for describing the code used to draw
pixels to the screen on modern graphics cards.
There are many opportunities to use GLSL in simulations.
First, we can get much higher performance by running our
simulation code in a pixel shader, which allows all the pixels
to execute in parallel. The downside is we need to make
all the inputs and outputs shader friendly, which can be
trivial, or very difficult, depending on the simulation
requirements.
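As a concrete (hypothetical) example of simulation code in a fragment shader, here's a sketch of one timestep of simple diffusion: each pixel blends itself toward the average of its four neighbors, read from the previous timestep's texture. The uniform and varying names are mine. Since a texture can't be read and written in the same pass, you render this into a second texture and swap the two each timestep ("ping-pong" rendering).

```glsl
precision highp float;
uniform sampler2D srcTex;  // last timestep's state, one texel per cell
uniform float texSize;     // texture is texSize x texSize pixels
varying vec2 texCoord;     // this pixel's location, from the vertex shader

void main(void) {
  float h = 1.0 / texSize; // distance to the neighboring texel
  vec4 C = texture2D(srcTex, texCoord);
  vec4 L = texture2D(srcTex, texCoord + vec2(-h, 0.0));
  vec4 R = texture2D(srcTex, texCoord + vec2(+h, 0.0));
  vec4 T = texture2D(srcTex, texCoord + vec2(0.0, +h));
  vec4 B = texture2D(srcTex, texCoord + vec2(0.0, -h));
  // blend this cell halfway toward the average of its four neighbors
  gl_FragColor = mix(C, 0.25 * (L + R + T + B), 0.5);
}
```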
Second, we can provide better display of our simulation's
outputs, by including per-pixel information such as the camera
direction, lighting, contrast, etc.
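For the display side, a minimal per-pixel diffuse lighting sketch might look like this (the variable names are mine):

```glsl
precision mediump float;
varying vec3 normal;    // surface normal, interpolated from the vertex shader
uniform vec3 lightDir;  // unit vector pointing toward the light
uniform vec4 surfaceColor;

void main(void) {
  // brightness falls off with the angle between the normal and the light
  float diffuse = max(0.0, dot(normalize(normal), lightDir));
  gl_FragColor = vec4(surfaceColor.rgb * diffuse, surfaceColor.a);
}
```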
Data types in GLSL work much like in
C/C++/Java/C#. There are some beautiful builtin datatypes:
- float. Works exactly like C/C++/Java/C#.
- vec4. A class with four floats in it, which you can
think of as the XYZW components of a vector, or the RGBA
components of a color. vec4 supports + - * / exactly like
you'd expect. vec4 is the native datatype of the graphics
hardware, so all of these operations are
single-clock-cycle.
- You can get to the first component of a vec4 named "v" as
follows:
- "v.x", treating the vec4 as a spatial position or vector.
- "v.r", treating the vec4 as a color. This is the
same data, the same speed, the same everything as ".x"; it's
basically just a comment or a hint to the human reader that
you're dealing with a color.
- "v[0]", treating the vec4 as an array. Again, it's
the same underlying data.
- You can initialize a vec4 as follows:
- "vec4 v=vec4(0.0);" sets all four components to zero.
- "vec4 v=vec4(0.1,0.2,0.3,0.4);" sets all four components
independently.
- "vec3 d=vec3(0.1,0.2,0.3);"
"vec4 v=vec4(d,0.4);"
You can make a 3-vector into a 4-vector by just adding the
missing components.
- The "w" component is used for homogeneous
coordinates. It's 1.0 for ordinary position
vectors, and 0.0 for direction or offset vectors. You
care about this when you're deriving a new
projection matrix, but otherwise you usually ignore it.
- A vec4 makes a perfectly good quaternion, although you need
to write all the math since nothing is built in.
- vec3. A class with three floats in it. Doesn't
have a ".w" or ".a" component. Useful for representing
directions (surface normals, light directions, etc) when you
don't want the "w" component messing up your dot products.
- vec2. A class with just two floats. Missing ".z"
or ".b" and ".w" or ".a". Useful for representing 2D
texture coordinates, or complex numbers.
- mat4, mat3, mat2. Matrices that operate on vec4's,
vec3's, and vec2's. See my
caveats on
how to load up the matrix values (the constructor takes
column-major order), or just pass them in as a uniform (desktop
GL had builtins like gl_ModelViewMatrix, but WebGL doesn't).
- "int" is fairly rare for computation (the graphics hardware
usually doesn't have integer math!). Some drivers are very
picky about distinguishing between "2" the integer and "2.0" the
float.
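A few of the datatypes above in action, as the body of an illustrative fragment shader (every line here is legal GLSL ES 1.0):

```glsl
precision mediump float;

void main(void) {
  vec3 d = vec3(0.1, 0.2, 0.3);
  vec4 v = vec4(d, 0.4);        // promote to vec4 by adding the w component
  float f = v.x;                // identical to v.r and v[0]
  vec2 uv = v.xy;               // "swizzle" out any subset of components
  mat4 M = mat4(1.0);           // the identity matrix
  gl_FragColor = M * (2.0 * v); // note "2.0", not the integer "2"
}
```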
- A variable declared as "varying" gets transmitted from the
vertex shader to the fragment shader. This is the only way to
communicate between your vertex and fragment shaders!
- A variable declared as "uniform" gets passed in from
outside. If the GLSL code says "uniform float foo;",
you:
- In THREE.js, you set foo using code like
"myshader.uniforms.foo.value=3;".
- In Igloo, you call program.uniform("foo",3);
- In BABYLON, a ShaderMaterial
will accept a call like material.setFloat("foo",3);
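Putting the "uniform" plumbing together in THREE.js might look like the sketch below; the uniform name "foo" follows the text above, and note that older THREE releases also wanted a "type" field in each uniform record. THREE supplies projectionMatrix, modelViewMatrix, and position to ShaderMaterial shaders automatically.

```javascript
var myshader = new THREE.ShaderMaterial({
  uniforms: { foo: { value: 3.0 } },
  vertexShader:
    "void main(void) {\n" +
    "  gl_Position = projectionMatrix * modelViewMatrix * vec4(position,1.0);\n" +
    "}\n",
  fragmentShader:
    "uniform float foo;\n" +
    "void main(void) { gl_FragColor = vec4(vec3(0.1 * foo), 1.0); }\n"
});
// later, e.g. once per frame:
myshader.uniforms.foo.value = 4.0; // the shader sees this at the next draw
```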
Bottom line: programmable shaders really are pretty
easy to use. I personally find them easier to write than
JavaScript, especially for vector arithmetic.
Further Info
See also the GLSL cheat
sheet (especially
for builtin variables).
The official GLSL Language Specification isn't too
bad--chapter 7 lists the builtin variables, chapter 8 the builtin
functions. OpenGL
ES / GL 3.0 is similar, but they deprecated a bunch of the
builtin variables from fixed-function GL.