| CS 381 Fall 2012 > Lecture Notes for Thursday, September 20, 2012 |
OpenGL stores a transformation as a \(4\times 4\) matrix. In the Vertex Processing section of the pipeline, a vertex is stored as a 4-D vector.
Note: OpenGL does not specify implementation details, only an interface. Thus, when we say that OpenGL “stores” data in a certain way, what we really mean is that it acts like the data are stored in that way. Internal data formats might be different, in order to increase efficiency.
A transformation is applied to a vector by multiplying the matrix by the vector (plus a division operation, which we will discuss later). To make a single matrix representing one transformation performed after another (e.g., a rotation followed by a translation), multiply the two matrices representing the original transformations.
For each of OpenGL’s internal transformations
(model/view, projection, etc.),
there is a stack of matrices.
The top of the stack represents the current transformation.
We have already seen this in operation:
recall the
glPushMatrix and glPopMatrix
commands.
It is probably not clear yet why we would want to represent a 3-D point by a 4-D vector. So, for the moment, we will pretend that OpenGL represents a vertex using a 3-D vector, and transformations are stored as \(3\times 3\) matrices.
Here is the matrix that would be produced by
glScaled(a, b, c) in our pretend world:
\[ \begin{pmatrix} a&0&0\\ 0&b&0\\ 0&0&c \end{pmatrix}. \]
We can check that this does the right thing:
\[ \begin{pmatrix} a&0&0\\ 0&b&0\\ 0&0&c \end{pmatrix} \, \begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} ax\\ by\\ cz \end{pmatrix}. \]
Similarly, we can do rotation with a \(3\times 3\) matrix.
We will not cover the formulas in full generality,
but here is how to do a rotation about the \(z\)-axis.
The matrix that would be produced by
glRotated(t, 0.,0.,1.),
in our pretend world,
is the following (with \(t\) being in degrees, of course):
\[ \begin{pmatrix} \cos\,t&-\sin\,t&0\\ \sin\,t&\cos\,t&0\\ 0&0&1 \end{pmatrix}. \]
Again, we can check that this does the right thing.
If we try to do a translation with a \(3\times 3\) matrix,
we find that it cannot be done.
In particular,
glTranslated(a, b, c)
should send the point \((0,0,0)\)
to \((a,b,c)\).
However, multiplying a \(3\times 3\)
matrix by the zero vector,
always results in the zero vector,
not the vector we want.
We solve this problem by moving to 4-D. We represent the point \((x,y,z)\) by the vector \(\langle x,y,z,1 \rangle\). And we represent a transformation by a \(4\times 4\) matrix.
Here is the (actual!) matrix produced by
glTranslated(a, b, c):
\[ \begin{pmatrix} 1&0&0&a\\ 0&1&0&b\\ 0&0&1&c\\ 0&0&0&1 \end{pmatrix}. \]
It works, too:
\[ \begin{pmatrix} 1&0&0&a\\ 0&1&0&b\\ 0&0&1&c\\ 0&0&0&1 \end{pmatrix} \, \begin{pmatrix} x\\ y\\ z\\ 1 \end{pmatrix} = \begin{pmatrix} x+a\\ y+b\\ z+c\\ 1 \end{pmatrix}. \]
Any transformation previously represented by a \(3\times 3\)
matrix can now be represented by a \(4\times 4\):
start with the old matrix,
add a fourth column of three zeroes,
then add a fourth row containing \(0\), \(0\), \(0\), \(1\).
For example, here is the actual matrix generated
when we do glScaled(a, b, c):
\[ \begin{pmatrix} a&0&0&0\\ 0&b&0&0\\ 0&0&c&0\\ 0&0&0&1 \end{pmatrix}. \]
See
showmat.cpp
for code that displays the current model/view matrix.
We do a perspective projection based on the synthetic-camera model. Place the camera into the world at the origin, facing in the \(-z\) direction. Put the viewport in front of the camera at distance \(c\). Then the viewport lies at \(z=-c\).
Consider a point \((x,y,z)\) in the scene.
We want to find its projection:
a point \((a,b,-c)\)
that lies on the line from \((x,y,z)\)
to the origin.
(To simplify our computations,
suppose that \(x, y \gt 0\) and \(z \lt 0\).)
Because the two right triangles are similar, we know that \(b/c = y/(-z)\), and so \(b = -cy/z\). Computations for \(a\) are along the same lines; we obtain \(a = -cx/z\).
We cannot divide using matrix multiplication; the only operations involved are addition and multiplication. Thus, we cannot use matrix multiplication alone to find \(-cy/z\) and \(-cx/z\). We need to include a division operation in the computations performed in the pipeline.
Here is how it is done.
When we do glVertex3d(x, y, z),
this point actually goes through the pipeline
as a 4-D vector:
\(\langle kx, ky, kz, k \rangle\),
for some nonzero number \(k\).
This is called the homogeneous form
of the 3-D vector
\(\langle x, y, z \rangle\).
When the vertex enters the pipeline,
we (that is, the people who implement the pipeline)
get to pick any nonzero value for \(k\)
that we want.
If we have any sense, then we let \(k=1\).
This means that
\(\langle x, y, z \rangle\)
turns into
\(\langle x, y, z, 1 \rangle\),
just as we said before.
(However, if we pick a different value for \(k\)
then all of the transformations we discussed, still work.)
When the vertex leaves the Vertex Processing section of the pipeline, it needs to be convered back into a 3-D vector. But now we do not get to choose the 4th coordinate. So we must divide by whatever the 4th coordinate is. The 4-D vector \(\langle x,y,z,w \rangle\) becomes \(\langle x/w,y/w,z/w \rangle\). This is the 4th-coordinate division. It allows us to do the division necessary for a perspective projection, while not messing up our earlier transformations (rotation, scaling, translation).
Now we consider how to do a perspective projection using matrix-vector multiplication, followed by a 4th-coordinate division. Consider the following matrix:
\[ \begin{pmatrix} -c&0&0&0\\ 0&-c&0&0\\ 0&0&1&-1\\ 0&0&1&0 \end{pmatrix}. \]
If we multiply the above matrix by the vector \(\langle x,y,z,1 \rangle\), then the result is \(\langle -cx,-cy,z-1,z \rangle\). Performing the 4th-coordinate division, we obtain \(\langle -cx/z,-cy/z,(z-1)/z \rangle\).
We see that the \(x\) and \(y\) values in our result are exactly what we want them to be. But the \(z\)-coordinate is a little strange. However, we typically only use the computed \(z\)-coordinate for hidden-surface removal. So all that matters is that our final \(z\)-coordinate gets larger when our original \(z\) gets larger, and smaller when it gets smaller. And, in fact, if \(z<0\), then \((z-1)/z\) does get larger and smaller as \(z\) does.
Note: When the actual matrix for a perspective projection
is generated, a few other things need to be taken into account,
such as the field-of-view angle,
the aspect ratio of the viewport,
and the maximum and minimum \(z\) values
of the objects that we want to appear in our scene.
So, while
the above matrix will do a perspective projection,
it might not be the actual matrix generated by any particular
OpenGL call.
See the documentation for glFrustum
and gluPerspective
for the actual matrices.
Now we can put together a complete description of how OpenGL handles transformations.
glLoadIdentity sets the matrix for
the current transformation to be the \(4\times 4\)
identity matrix.
glTranslate*,
glRotate*,
glScale*,
glOrtho,
etc.)
generates a matrix,
and then multiplies this by the current transformation matrix,
on the right.
That is,
if \(A\) is the current transformation matrix,
and \(N\) is the matrix generated by the command,
then the new transformation matrix is \(A\,N\).
Hidden Surface Removal (HSR) means ensuring that objects hidden behind other objects are not drawn. An object that is fully or partially hidden behind another, is said to be occluded.
A generalization involves translucent objects, in which one can see one object through another.
We categorize HSR methods based on where they fit into the rendering pipeline—or whether they fit into the pipeline at all. Thus, we have three kinds of HSR methods.
We will look briefly at the first two categories. Then we will talk about OpenGL’s built-in HSR. For non-pipeline methods, see CS 481.
Choice of method is affected by the following.
The rendering pipeline, being a pipeline, can have multiple pieces of data in it at once. For example, when a primitive moves from Vertex Processing to Rasterization, then another primitive can be moved into Vertex Processing. Since the sections of the pipeline are operating in parallel, the slowest section is what determines how fast our rendering goes. We say the rendering speed is bound by the performance of the slow section. Thus, a running rendering program can generally be placed in one of two categories.
So, for example, if a program is fill-bound, then speeding up the Vertex Processing will not increase our frame rate at all. Rather, the Vertex Processing section of the pipeline will just spend more time waiting for Fragment Processing. On the other hand, moving some work from Fragment Processing to Vertex Processing could be a good idea; one way to do this might be to switch from a pixel-based HSR method to a geometry-based method.
Geometry-based HSR methods do HSR at the drawing-primitive level. Thus, generally, they handle polygons.
If we have a list of the polygons in a scene, sorted by depth, then we can render the scene in two different ways.
Now, how do we sort polygons? The most straightforward method is to find the barycenter of each polygon, and sort by the depth of the barycenter. The barycenter of a triangle is computed by finding the vector sum of the vertices and dividing by 3.
This sorting method works well much of the time. It generally gives good results on surfaces composed of fine polygonal meshes. However, a simple Depth Sort does not work all the time (to emphasize: a simple Depth Sort can give incorrect results).
If fact, doing HSR correctly simply by sorting polygons,
is impossible for some scenes.
For example, we can have three polygons
\(P\), \(Q\), \(R\), with
\(P\) occluding \(Q\),
\(Q\) occluding \(R\), and
\(R\) occluding \(P\).
There is no way to sort such a list properly.
Solutions to this involve splitting polygons. Here is a method that does always result in correctly rendered scenes.
Newell’s Algorithm is based on a simple Depth Sort; however, it correct results for all scenes composed of planar polygons. Since a polygon is planar, we can talk about the “plane of the polygon”, meaning the plane that the polygon lies in.
Note that the \(x\)-coordinates used by points in a polygon, form an interval. We call this the polygon’s \(x\)-interval, and similarly \(y\)-interval and \(z\)-interval.
Newell’s Algorithm adds extra tests to a Depth Sort. As part of a sort, we compare two polygons \(P\), \(Q\). We ask a series of questions about them, in order of computational difficulty. if the answer to any question is “yes”, then we know what to do; otherwise, we proceed to the next question.
If we make it through all of the above questions with “no”
answers, then we split
one polygon along the plane of the other.
We remove the split polygon from the list of primitives,
and add the two pieces to the list.
Once we have a depth-sorted list of polygons, we render using the Painter’s Algorithm or the Reverse Painter’s Algorithm, as before.
Advantages of Newell’s Algorithm:
Disadvantages:
The last disadvantage above can be dealt with by using special data structures to hold depth relationships in a scene (look up “BSP Tree”).
Pixel-based HSR methods do HSR at the pixel level. If such a method is used with the rendering pipeline, then it will be in the Fragment Processing section.
A \(z\)-buffer is a 2-D array that holds \(z\) values, one for each pixel in the framebuffer.
To use a \(z\)-buffer for HSR, we begin by set all \(z\) values to the maximum possible distance. As each fragment exits the pipeline, compare its \(z\) value with the one in the buffer, and discard the fragment if it is not closer. If a fragment is not discarded, then write its color to the framebuffer and its \(z\) value to the \(z\)-buffer.
Recall that the \(z\)-coordinate of vertices exiting the Vertex Processing section of the pipeline is generally not an actual \(z\)-coordinate in our world. Rather, it is a value that gets larger as the actual \(z\)-coordinate of the object gets larger, and smaller as it gets smaller. This is all we need, for \(z\)-buffering to work. However, it is true that, for very far away \(z\) values, the \(z\)-coordinate of vertices exiting the pipeline, changes very little as the actual \(z\)-coordinate of the object changes. Thus, \(z\)-buffering can give incorrect results for very distant objects that occlude other very distant objects.
Advantages of \(z\)-buffering:
Disadvantages:
OpenGL includes easy-to-use facilities for \(z\)-buffering. In GL-speak, the \(z\)-buffer is called a depth buffer, since it holds depth values. The comparison of depth values is called the depth test; it occurs in the Fragment Processing section of the pipeline. When the depth test is enabled, each fragment’s depth is compared with the depth stored at the corresponding pixel location. The depth test fails if the fragment depth is greater than or equal to the stored depth; in that case, the fragment is discarded. The depth test passes if the fragment depth is less than the stored depth; in that case, the fragment’s color is stored in the framebuffer (color buffer in GL-speak) and the fragment’s depth is stored in the depth buffer.
Here is how to use OpenGL’s depth buffer in a GLUT program.
GLUT_DEPTH
into your glutInitDisplayMode call.
glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH);
glEnable.
You can also disable the depth test withglEnable(GL_DEPTH_TEST);
glDisable(GL_DEPTH_TEST).
For example, you might want to disable the test
before rendering documentation text in the graphics window.
You can reenable the test as above.GL_DEPTH_BUFFER_BIT to your glClear call.
The various constants in your glClear call
should be separated by bitwise-or operators
(“|”),
just as with glutInitDisplayMode.
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
ggchappell@alaska.edu