Newtonian Particles on the GPU: P and V textures
CS 482
Lecture, Dr. Lawlor
The basic idea for simulating
Newtonian particles on the GPU is to keep the position P and
velocity V in textures, and compute their changes in pixel
shaders.
The big challenge is that in a 3D simulation, P is XYZ and V is XYZ,
so we have a total of 6 floats, but a single shader output (one RGBA
pixel) holds only 4 floats. Possible solutions include:
- Use separate textures for P and V. This means writing
separate shaders that output P and V separately. Luckily,
P+=dt*V and V+=dt*A read and write mostly separate
data. In general, P=fnP(oldP,oldV), and V=fnV(oldP,oldV),
so this theoretically scales to any number of outputs, although
it might not be very efficient if you need to recalculate stuff
(e.g., collision detection) in both shaders.
- Use the same texture for P and V, but interleave the pixels
somehow, for example by making even X coordinates represent P,
while odd X coordinates represent V. This means your
shader actually runs twice for each particle, once to output the
P pixel, and a second time to output the V pixel, and because
these probably don't share much functionality, the shader
probably starts with one big branch and is effectively two
shaders welded together. Because of GPU branch
granularity, it's probably best if P and V are separated by at
least a few pixels, the more the better (e.g., left half of
texture is P, right half is V). This has fewer passes than
fully separated P and V, but has little else to recommend it.
- Use a single pixel to merge P and V. This is trivial in
2D: vec2 P=tex.xy; vec2 V=tex.zw;. In 3D in general this
won't work, but often particles are trapped in a 2D+height
surface anyway, so you can recompute Z where you need it.
I've seen people do very strange things, like pack P.x and P.y
into a single float using scaling and mod commands, but to do
this the high coordinate needs to be OK with truncated
precision.
- There's a special OpenGL state command, glDrawBuffers,
that enables a single shader to output pixels to several
textures at once. For example, writing to gl_FragData[2] will
write to the framebuffer object's GL_COLOR_ATTACHMENT2 texture
(assuming draw buffer 2 is mapped to that attachment), but you
do need to set up quite a bit of OpenGL state beforehand.
There is still a hardware limit on how many textures you can
write from a single shader: GL_MAX_DRAW_BUFFERS for shader
outputs, and GL_MAX_COLOR_ATTACHMENTS for textures attached to
the framebuffer object, typically 4 or 8 depending on your
OpenGL hardware and drivers. OpenGL ES 2.0 implementations
such as WebGL 1.0 only support one color attachment unless a
draw-buffers extension is available; OpenGL ES 3.0 and WebGL 2
support several.
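As an illustration of the float-packing trick mentioned in the single-pixel option above, here is a minimal CPU-side sketch in Python, not shader code. The names pack, unpack, to_float32, and SCALE are hypothetical, and the scheme (the integer part carries one coordinate, the fraction carries the other, both assumed to lie in [0,1)) is only one plausible way to do it; these notes do not say which scheme was actually used.

```python
import math
import struct

SCALE = 4096.0  # assumption: the high coordinate is quantized to 4096 steps

def to_float32(x):
    """Round to the nearest 32-bit float, as a float texture channel would store it."""
    return struct.unpack("f", struct.pack("f", x))[0]

def pack(hi, lo):
    """Pack hi and lo (both in [0,1)) into one float: integer part = hi, fraction = lo."""
    return to_float32(math.floor(hi * SCALE) + lo)

def unpack(packed):
    """Recover (hi, lo); hi comes back truncated to a multiple of 1/SCALE."""
    lo = math.fmod(packed, 1.0)       # shader equivalent: mod(packed, 1.0)
    hi = math.floor(packed) / SCALE   # shader equivalent: floor(packed) / SCALE
    return hi, lo

hi, lo = unpack(pack(0.3, 0.7))
```

Both coordinates come back only approximately: the high one is truncated to steps of 1/SCALE, and the low one keeps whatever mantissa bits are left over. A low coordinate within rounding distance of 1.0 can also round up and spill into the high coordinate, so a real version would need to clamp it.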
OK, now we can compute positions and velocities. For Newtonian
mechanics, we can either have a separate force texture, or compute
the force as a local variable inside the velocity shader.
Typically it's more efficient, and less code and hassle, to
recompute things rather than store and load them later.
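A minimal CPU-side sketch of the two-pass update, with the force recomputed as a local variable rather than stored in its own texture, might look like the following. This is Python standing in for the two pixel shaders, with a list of XYZ tuples standing in for each texture. DT, GRAVITY, accel, shade_P, shade_V, and step are all hypothetical names, and constant gravity on unit-mass particles is an assumed force model.

```python
DT = 0.01                    # assumed timestep
GRAVITY = (0.0, 0.0, -9.8)   # assumed constant acceleration (unit mass)

def accel(p, v):
    """Hypothetical force model: gravity only. p and v are unused here,
    but a real model (springs, drag, collisions) would read them."""
    return GRAVITY

def shade_P(old_p, old_v):
    """Position pass, one 'pixel': P = fnP(oldP, oldV)."""
    return tuple(p + DT * v for p, v in zip(old_p, old_v))

def shade_V(old_p, old_v):
    """Velocity pass, one 'pixel': V = fnV(oldP, oldV).
    The acceleration is a local variable and is never written to a texture."""
    a = accel(old_p, old_v)
    return tuple(v + DT * ai for v, ai in zip(old_v, a))

def step(tex_P, tex_V):
    """One timestep: both passes read only the old textures and write new ones."""
    new_P = [shade_P(p, v) for p, v in zip(tex_P, tex_V)]
    new_V = [shade_V(p, v) for p, v in zip(tex_P, tex_V)]
    return new_P, new_V

# One particle at height 10, moving along +X.
tex_P, tex_V = step([(0.0, 0.0, 10.0)], [(1.0, 0.0, 0.0)])
```

Note that both passes take the old P and old V as inputs, so on the GPU each texture would need a second copy to write into while the first is being read.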