Simulations on the GPU
CS 482 Lecture,
Dr. Lawlor
Modern OpenGL talks to the graphics card using a variety of
specialized software objects:
- A texture
stores an image, as a regular grid of pixels. Mostly these
are 2D, although 3D textures also exist. Textures are
stored on the graphics card, in graphics card memory, so they're
incredibly fast to read and write. Texture pixels can
be read from a file, uploaded from JavaScript (using gl.texImage2D),
or computed on the graphics card directly. To run on the
graphics card, most simulations will need to store their
simulation data in textures, probably using the four RGBA fields
stored in gl.FLOAT format. Caution: if you're used to the
CPU, textures have a number of serious limitations; for example,
the size of a texture is limited to
gl.getParameter(gl.MAX_TEXTURE_SIZE), which can be as low as
8192 pixels per side (8192 x 8192 pixels is 67 million pixels,
268 million RGBA floats, or 1 gigabyte).
- A framebuffer
object lets you render into a texture, the same way you'd
render pixels onto the screen. If texture pixels are
computed at runtime, this is how you do it.
- A GLSL
shader (go read it!) consists of code that looks a whole
lot like C++, but it gets compiled at runtime so it can run on
your graphics card.
- Vertex shaders compute the onscreen position (gl_Position)
of the geometry vertex you're drawing. They can also
compute "varying" values that are interpolated to the pixels,
for use in the fragment shader.
- Fragment shaders compute the onscreen color (gl_FragColor)
of the pixel you're drawing. Today, all the fun stuff
usually happens in the fragment shader.
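To make the texture-plus-framebuffer setup above concrete, here's a minimal sketch of allocating a float texture and aiming a framebuffer object at it. It assumes a WebGL 1 context named "gl"; variable names like simTex are mine, and on some machines actually rendering into a float texture also needs the WEBGL_color_buffer_float extension.

```javascript
gl.getExtension("OES_texture_float"); // needed for gl.FLOAT texture pixels

var size = 256; // simulation grid resolution (well under MAX_TEXTURE_SIZE)
var simTex = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, simTex);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
// null data: allocate the pixels on the card without uploading anything
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, size, size, 0,
              gl.RGBA, gl.FLOAT, null);

// A framebuffer object lets us render into that texture:
var fbo = gl.createFramebuffer();
gl.bindFramebuffer(gl.FRAMEBUFFER, fbo);
gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                        gl.TEXTURE_2D, simTex, 0);
// ...now ordinary draw calls write their pixels into simTex, not the screen.
```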
The big limitation with a fragment shader is that it can only
write to the pixel you're currently rendering. People deal with
this in several ways:
- Cram everything you need to write into your single RGBA color
pixel. Any simulation handling up to four floats is
trivial. I've seen some really crazy packing of multiple
values into a single color; this is easiest with low-bit values
in integer pixel colors, but it can be done with floats as long
as you're careful about the exponent.
- Take multiple passes to compute what you need in several
stages. For example, you can't write FEM strain and node
forces in a single pass, so you do one pass to compute strains,
then do a second pass to compute forces from the strains.
- Use the WEBGL_draw_buffers
extension to attach several different textures to the
framebuffer, and write to gl_FragData[i]--it's still just one
pixel, but it can now have more than 4 floats. Some
machines require
all the textures to have the same pixel format, but this
is usually just 4 floats per pixel anyway.
MAX_DRAW_BUFFERS_WEBGL gives the number of textures supported,
which can be as low as four textures per pass. Plus the
extension is not supported everywhere yet.
- Use the Image
load/store extension to read and write arbitrary pixels
(support for the extension is quite rare for now).
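The packing trick in the first bullet above is easiest to see off the GPU. Here's a sketch in plain JavaScript (the helper names are mine) of spreading one [0,1) float's base-256 digits across the four 8-bit channels of an integer RGBA pixel--the same idea a shader would implement with floor() and fract():

```javascript
// Pack one float in [0,1) into four 8-bit channels, one base-256 digit each.
function packFloatToRGBA8(f) {
  var bytes = [];
  for (var i = 0; i < 4; i++) {
    f *= 256.0;
    var b = Math.floor(f); // next base-256 digit
    bytes.push(b);
    f -= b;                // keep only the remaining fraction
  }
  return bytes; // [r,g,b,a], each 0..255
}

// Reverse the packing: sum the digits back with decreasing place value.
function unpackRGBA8ToFloat(bytes) {
  var f = 0.0, scale = 1.0 / 256.0;
  for (var i = 0; i < 4; i++) {
    f += bytes[i] * scale;
    scale /= 256.0;
  }
  return f; // accurate to about 2^-32
}
```

On the GPU, the decode step is a single dot product of the pixel with a vec4 of the place values 1/256, 1/65536, and so on.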
Programmable Shaders are Very Simple in Practice
GLSL is one of the
standard languages today for describing the code used to draw
pixels to the screen on modern graphics cards.
There are many opportunities to use GLSL in simulations.
First, we can get much higher performance by running our
simulation code in a pixel shader, which allows all the pixels
to execute in parallel. The downside is we need to make
all the inputs and outputs shader friendly, which can be
trivial, or very difficult, depending on the simulation
requirements.
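As a concrete (hypothetical) example of simulation code in a fragment shader, here's a sketch of one timestep of simple diffusion: each pixel blends itself toward the average of its four neighbors, read from the previous timestep's texture. The uniform and varying names are mine. Since a texture can't be read and written in the same pass, you render this into a second texture and swap the two each timestep ("ping-pong" rendering).

```glsl
precision highp float;
uniform sampler2D srcTex;  // last timestep's state, one texel per cell
uniform float texSize;     // texture is texSize x texSize pixels
varying vec2 texCoord;     // this pixel's location, from the vertex shader

void main(void) {
  float h = 1.0 / texSize; // distance to the neighboring texel
  vec4 C = texture2D(srcTex, texCoord);
  vec4 L = texture2D(srcTex, texCoord + vec2(-h, 0.0));
  vec4 R = texture2D(srcTex, texCoord + vec2(+h, 0.0));
  vec4 T = texture2D(srcTex, texCoord + vec2(0.0, +h));
  vec4 B = texture2D(srcTex, texCoord + vec2(0.0, -h));
  // blend this cell halfway toward the average of its four neighbors
  gl_FragColor = mix(C, 0.25 * (L + R + T + B), 0.5);
}
```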
Second, we can provide better display of our simulation's
outputs, by including per-pixel information such as the camera
direction, lighting, contrast, etc.
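For the display side, a minimal per-pixel diffuse lighting sketch might look like this (the variable names are mine):

```glsl
precision mediump float;
varying vec3 normal;    // surface normal, interpolated from the vertex shader
uniform vec3 lightDir;  // unit vector pointing toward the light
uniform vec4 surfaceColor;

void main(void) {
  // brightness falls off with the angle between the normal and the light
  float diffuse = max(0.0, dot(normalize(normal), lightDir));
  gl_FragColor = vec4(surfaceColor.rgb * diffuse, surfaceColor.a);
}
```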
Data types in GLSL work much like in
C/C++/Java/C#. There are some beautiful builtin datatypes:
- float. Works exactly like C/C++/Java/C#.
- vec4. A class with four floats in it, which you can
think of as the XYZW components of a vector, or the RGBA
components of a color. vec4 supports + - * / exactly like
you'd expect. vec4 is the native datatype of the graphics
hardware, so all of these operations are
single-clock-cycle.
- You can get to the first component of a vec4 named "v" as
follows:
- "v.x", treating the vec4 as a spatial position or vector.
- "v.r", treating the vec4 as a color. This is the
same data, the same speed, the same everything as ".x"; it's
basically just a comment or a hint to the human reader that
you're dealing with a color.
- "v[0]", treating the vec4 as an array. Again, it's
the same underlying data.
- You can initialize a vec4 as follows:
- "vec4 v=vec4(0.0);" sets all four components to zero.
- "vec4 v=vec4(0.1,0.2,0.3,0.4);" sets all four components
independently.
- "vec3 d=vec3(0.1,0.2,0.3);"
"vec4 v=vec4(d,0.4);"
You can make a 3-vector into a 4-vector by just adding the
missing components.
- The "w" component is used for homogeneous
coordinates. It's 1.0 for ordinary position
vectors, and 0.0 for direction or offset vectors. You
care about this when you're deriving a new
projection matrix, but otherwise you usually ignore it.
- A vec4 makes a perfectly good quaternion, although you need
to write all the math since nothing is built in.
- vec3. A class with three floats in it. Doesn't
have a ".w" or ".a" component. Useful for representing
directions (surface normals, light directions, etc) when you
don't want the "w" component messing up your dot products.
- vec2. A class with just two floats. Missing ".z"
or ".b" and ".w" or ".a". Useful for representing 2D
texture coordinates, or complex numbers.
- mat4, mat3, mat2. Matrices that operate on vec4's,
vec3's, and vec2's. See my
caveats on
how to load up the matrix values (the constructor takes
column-major order), or just pass them in as a uniform (desktop
GL had builtins like gl_ModelViewMatrix, but WebGL doesn't).
- "int" is fairly rare for computation (the graphics hardware
usually doesn't have integer math!). Some drivers are very
picky about distinguishing between "2" the integer and "2.0" the
float.
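A few of the datatypes above in action, as the body of an illustrative fragment shader (every line here is legal GLSL ES 1.0):

```glsl
precision mediump float;

void main(void) {
  vec3 d = vec3(0.1, 0.2, 0.3);
  vec4 v = vec4(d, 0.4);        // promote to vec4 by adding the w component
  float f = v.x;                // identical to v.r and v[0]
  vec2 uv = v.xy;               // "swizzle" out any subset of components
  mat4 M = mat4(1.0);           // the identity matrix
  gl_FragColor = M * (2.0 * v); // note "2.0", not the integer "2"
}
```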
- A variable declared as "varying" gets transmitted from the
vertex shader to the fragment shader. This is the only way to
communicate between your vertex and fragment shaders!
- A variable declared as "uniform" gets passed in from
outside. If the GLSL code says "uniform float foo;",
you:
- In THREE.js, you set foo using code like
"myshader.uniforms.foo.value=3;".
- In Igloo, you call program.uniform("foo",3);
- In BABYLON, a ShaderMaterial
will accept a call like material.setFloat("foo",3);
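Putting the "uniform" plumbing together in THREE.js might look like the sketch below; the uniform name "foo" follows the text above, and note that older THREE releases also wanted a "type" field in each uniform record. THREE supplies projectionMatrix, modelViewMatrix, and position to ShaderMaterial shaders automatically.

```javascript
var myshader = new THREE.ShaderMaterial({
  uniforms: { foo: { value: 3.0 } },
  vertexShader:
    "void main(void) {\n" +
    "  gl_Position = projectionMatrix * modelViewMatrix * vec4(position,1.0);\n" +
    "}\n",
  fragmentShader:
    "uniform float foo;\n" +
    "void main(void) { gl_FragColor = vec4(vec3(0.1 * foo), 1.0); }\n"
});
// later, e.g. once per frame:
myshader.uniforms.foo.value = 4.0; // the shader sees this at the next draw
```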
Bottom line: programmable shaders really are pretty
easy to use. I personally find them easier to write than
JavaScript, especially for vector arithmetic.
Further Info
See also the GLSL cheat
sheet (especially
for builtin variables).
The official GLSL Language Specification isn't too
bad--chapter 7 lists the builtin variables, chapter 8 the builtin
functions. OpenGL
ES / GL 3.0 is similar, but they deprecated a bunch of the
builtin variables from fixed-function GL.