CS 381 Fall 2012  >  Lecture Notes for Thursday, September 20, 2012

CS 381 Fall 2012
Lecture Notes for Thursday, September 20, 2012

The Details of Transformations

Overview

OpenGL stores a transformation as a \(4\times 4\) matrix. In the Vertex Processing section of the pipeline, a vertex is stored as a 4-D vector.

Note: OpenGL does not specify implementation details, only an interface. Thus, when we say that OpenGL “stores” data in a certain way, what we really mean is that it acts like the data are stored in that way. Internal data formats might be different, in order to increase efficiency.

A transformation is applied to a vector by multiplying the matrix by the vector (plus a division operation, which we will discuss later). To make a single matrix representing one transformation performed after another (e.g., a rotation followed by a translation), multiply the two matrices representing the original transformations.

For each of OpenGL’s internal transformations (model/view, projection, etc.), there is a stack of matrices. The top of the stack represents the current transformation. We have already seen this in operation: recall the glPushMatrix and glPopMatrix commands.

Scaling & Rotation

It is probably not clear yet why we would want to represent a 3-D point by a 4-D vector. So, for the moment, we will pretend that OpenGL represents a vertex using a 3-D vector, and transformations are stored as \(3\times 3\) matrices.

Here is the matrix that would be produced by glScaled(a, b, c) in our pretend world:

\[ \begin{pmatrix} a&0&0\\ 0&b&0\\ 0&0&c \end{pmatrix}. \]

We can check that this does the right thing:

\[ \begin{pmatrix} a&0&0\\ 0&b&0\\ 0&0&c \end{pmatrix} \, \begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} ax\\ by\\ cz \end{pmatrix}. \]

Similarly, we can do rotation with a \(3\times 3\) matrix. We will not cover the formulas in full generality, but here is how to do a rotation about the \(z\)-axis. The matrix that would be produced by glRotated(t, 0.,0.,1.), in our pretend world, is the following (with \(t\) being in degrees, of course):

\[ \begin{pmatrix} \cos\,t&-\sin\,t&0\\ \sin\,t&\cos\,t&0\\ 0&0&1 \end{pmatrix}. \]

Again, we can check that this does the right thing.

Translations & 4-D

If we try to do a translation with a \(3\times 3\) matrix, we find that it cannot be done. In particular, glTranslated(a, b, c) should send the point \((0,0,0)\) to \((a,b,c)\). However, multiplying a \(3\times 3\) matrix by the zero vector, always results in the zero vector, not the vector we want.

We solve this problem by moving to 4-D. We represent the point \((x,y,z)\) by the vector \(\langle x,y,z,1 \rangle\). And we represent a transformation by a \(4\times 4\) matrix.

Here is the (actual!) matrix produced by glTranslated(a, b, c):

\[ \begin{pmatrix} 1&0&0&a\\ 0&1&0&b\\ 0&0&1&c\\ 0&0&0&1 \end{pmatrix}. \]

It works, too:

\[ \begin{pmatrix} 1&0&0&a\\ 0&1&0&b\\ 0&0&1&c\\ 0&0&0&1 \end{pmatrix} \, \begin{pmatrix} x\\ y\\ z\\ 1 \end{pmatrix} = \begin{pmatrix} x+a\\ y+b\\ z+c\\ 1 \end{pmatrix}. \]

Any transformation previously represented by a \(3\times 3\) matrix can now be represented by a \(4\times 4\): start with the old matrix, add a fourth column of three zeroes, then add a fourth row containing \(0\), \(0\), \(0\), \(1\). For example, here is the actual matrix generated when we do glScaled(a, b, c):

\[ \begin{pmatrix} a&0&0&0\\ 0&b&0&0\\ 0&0&c&0\\ 0&0&0&1 \end{pmatrix}. \]

See showmat.cpp for code that displays the current model/view matrix.

Perspective & Homogeneous Form

We do a perspective projection based on the synthetic-camera model. Place the camera into the world at the origin, facing in the \(-z\) direction. Put the viewport in front of the camera at distance \(c\). Then the viewport lies at \(z=-c\).

Consider a point \((x,y,z)\) in the scene. We want to find its projection: a point \((a,b,-c)\) that lies on the line from \((x,y,z)\) to the origin. (To simplify our computations, suppose that \(x, y \gt 0\) and \(z \lt 0\).)
Image not displayed: Computing coordinates for a perspective projection

Because the two right triangles are similar, we know that \(b/c = y/(-z)\), and so \(b = -cy/z\). Computations for \(a\) are along the same lines; we obtain \(a = -cx/z\).

We cannot divide using matrix multiplication; the only operations involved are addition and multiplication. Thus, we cannot use matrix multiplication alone to find \(-cy/z\) and \(-cx/z\). We need to include a division operation in the computations performed in the pipeline.

Here is how it is done. When we do glVertex3d(x, y, z), this point actually goes through the pipeline as a 4-D vector: \(\langle kx, ky, kz, k \rangle\), for some nonzero number \(k\). This is called the homogeneous form of the 3-D vector \(\langle x, y, z \rangle\). When the vertex enters the pipeline, we (that is, the people who implement the pipeline) get to pick any nonzero value for \(k\) that we want. If we have any sense, then we let \(k=1\). This means that \(\langle x, y, z \rangle\) turns into \(\langle x, y, z, 1 \rangle\), just as we said before. (However, if we pick a different value for \(k\) then all of the transformations we discussed, still work.)

When the vertex leaves the Vertex Processing section of the pipeline, it needs to be convered back into a 3-D vector. But now we do not get to choose the 4th coordinate. So we must divide by whatever the 4th coordinate is. The 4-D vector \(\langle x,y,z,w \rangle\) becomes \(\langle x/w,y/w,z/w \rangle\). This is the 4th-coordinate division. It allows us to do the division necessary for a perspective projection, while not messing up our earlier transformations (rotation, scaling, translation).

Now we consider how to do a perspective projection using matrix-vector multiplication, followed by a 4th-coordinate division. Consider the following matrix:

\[ \begin{pmatrix} -c&0&0&0\\ 0&-c&0&0\\ 0&0&1&-1\\ 0&0&1&0 \end{pmatrix}. \]

If we multiply the above matrix by the vector \(\langle x,y,z,1 \rangle\), then the result is \(\langle -cx,-cy,z-1,z \rangle\). Performing the 4th-coordinate division, we obtain \(\langle -cx/z,-cy/z,(z-1)/z \rangle\).

We see that the \(x\) and \(y\) values in our result are exactly what we want them to be. But the \(z\)-coordinate is a little strange. However, we typically only use the computed \(z\)-coordinate for hidden-surface removal. So all that matters is that our final \(z\)-coordinate gets larger when our original \(z\) gets larger, and smaller when it gets smaller. And, in fact, if \(z<0\), then \((z-1)/z\) does get larger and smaller as \(z\) does.

Note: When the actual matrix for a perspective projection is generated, a few other things need to be taken into account, such as the field-of-view angle, the aspect ratio of the viewport, and the maximum and minimum \(z\) values of the objects that we want to appear in our scene. So, while the above matrix will do a perspective projection, it might not be the actual matrix generated by any particular OpenGL call. See the documentation for glFrustum and gluPerspective for the actual matrices.

The Whole Story

Now we can put together a complete description of how OpenGL handles transformations.

Hidden-Surface Removal

Introduction

Hidden Surface Removal (HSR) means ensuring that objects hidden behind other objects are not drawn. An object that is fully or partially hidden behind another, is said to be occluded.

A generalization involves translucent objects, in which one can see one object through another.

We categorize HSR methods based on where they fit into the rendering pipeline—or whether they fit into the pipeline at all. Thus, we have three kinds of HSR methods.

We will look briefly at the first two categories. Then we will talk about OpenGL’s built-in HSR. For non-pipeline methods, see CS 481.

Choice of method is affected by the following.

Performance Issues

The rendering pipeline, being a pipeline, can have multiple pieces of data in it at once. For example, when a primitive moves from Vertex Processing to Rasterization, then another primitive can be moved into Vertex Processing. Since the sections of the pipeline are operating in parallel, the slowest section is what determines how fast our rendering goes. We say the rendering speed is bound by the performance of the slow section. Thus, a running rendering program can generally be placed in one of two categories.

Geometry-bound.
A geometry-bound program is one in which operations on vertices and geometric primitives (including those done before data enter the pipeline) are what limits the rendering speed.
Fill-bound.
A fill-bound program is one in which operations on fragments and pixels are what limits the rendering speed.

So, for example, if a program is fill-bound, then speeding up the Vertex Processing will not increase our frame rate at all. Rather, the Vertex Processing section of the pipeline will just spend more time waiting for Fragment Processing. On the other hand, moving some work from Fragment Processing to Vertex Processing could be a good idea; one way to do this might be to switch from a pixel-based HSR method to a geometry-based method.

Geometry-Based Methods

Geometry-based HSR methods do HSR at the drawing-primitive level. Thus, generally, they handle polygons.

Depth Sort

If we have a list of the polygons in a scene, sorted by depth, then we can render the scene in two different ways.

Painter’s Algorithm
Draw polygons back-to-front, with new colors overwriting existing ones. Advantages: Generalizes nicely to handling translucency.
Reverse Painter’s Algorithm
Draw polygons front-to-back, with colored pixels never being changed. Advantages: Never writes a pixel more than once. May result in a speed-up in some architectures.

Now, how do we sort polygons? The most straightforward method is to find the barycenter of each polygon, and sort by the depth of the barycenter. The barycenter of a triangle is computed by finding the vector sum of the vertices and dividing by 3.

This sorting method works well much of the time. It generally gives good results on surfaces composed of fine polygonal meshes. However, a simple Depth Sort does not work all the time (to emphasize: a simple Depth Sort can give incorrect results).

If fact, doing HSR correctly simply by sorting polygons, is impossible for some scenes. For example, we can have three polygons \(P\), \(Q\), \(R\), with \(P\) occluding \(Q\), \(Q\) occluding \(R\), and \(R\) occluding \(P\). There is no way to sort such a list properly.
Image not displayed: Scene that cannot be rendered correctly with a simple depth sort

Solutions to this involve splitting polygons. Here is a method that does always result in correctly rendered scenes.

Newell’s Algorithm

Newell’s Algorithm is based on a simple Depth Sort; however, it correct results for all scenes composed of planar polygons. Since a polygon is planar, we can talk about the “plane of the polygon”, meaning the plane that the polygon lies in.

Note that the \(x\)-coordinates used by points in a polygon, form an interval. We call this the polygon’s \(x\)-interval, and similarly \(y\)-interval and \(z\)-interval.

Newell’s Algorithm adds extra tests to a Depth Sort. As part of a sort, we compare two polygons \(P\), \(Q\). We ask a series of questions about them, in order of computational difficulty. if the answer to any question is “yes”, then we know what to do; otherwise, we proceed to the next question.

If we make it through all of the above questions with “no” answers, then we split one polygon along the plane of the other. We remove the split polygon from the list of primitives, and add the two pieces to the list.
Image not displayed: Scene fixed by cutting polygon

Once we have a depth-sorted list of polygons, we render using the Painter’s Algorithm or the Reverse Painter’s Algorithm, as before.

Advantages of Newell’s Algorithm:

Disadvantages:

The last disadvantage above can be dealt with by using special data structures to hold depth relationships in a scene (look up “BSP Tree”).

Pixel-Based Methods

Pixel-based HSR methods do HSR at the pixel level. If such a method is used with the rendering pipeline, then it will be in the Fragment Processing section.

\(z\)-Buffering

A \(z\)-buffer is a 2-D array that holds \(z\) values, one for each pixel in the framebuffer.

To use a \(z\)-buffer for HSR, we begin by set all \(z\) values to the maximum possible distance. As each fragment exits the pipeline, compare its \(z\) value with the one in the buffer, and discard the fragment if it is not closer. If a fragment is not discarded, then write its color to the framebuffer and its \(z\) value to the \(z\)-buffer.

Recall that the \(z\)-coordinate of vertices exiting the Vertex Processing section of the pipeline is generally not an actual \(z\)-coordinate in our world. Rather, it is a value that gets larger as the actual \(z\)-coordinate of the object gets larger, and smaller as it gets smaller. This is all we need, for \(z\)-buffering to work. However, it is true that, for very far away \(z\) values, the \(z\)-coordinate of vertices exiting the pipeline, changes very little as the actual \(z\)-coordinate of the object changes. Thus, \(z\)-buffering can give incorrect results for very distant objects that occlude other very distant objects.

Advantages of \(z\)-buffering:

Disadvantages:

Built-In HSR in OpenGL

OpenGL includes easy-to-use facilities for \(z\)-buffering. In GL-speak, the \(z\)-buffer is called a depth buffer, since it holds depth values. The comparison of depth values is called the depth test; it occurs in the Fragment Processing section of the pipeline. When the depth test is enabled, each fragment’s depth is compared with the depth stored at the corresponding pixel location. The depth test fails if the fragment depth is greater than or equal to the stored depth; in that case, the fragment is discarded. The depth test passes if the fragment depth is less than the stored depth; in that case, the fragment’s color is stored in the framebuffer (color buffer in GL-speak) and the fragment’s depth is stored in the depth buffer.

Here is how to use OpenGL’s depth buffer in a GLUT program.


CS 381 Fall 2012: Lecture Notes for Thursday, September 20, 2012 / Updated: 20 Sep 2012 / Glenn G. Chappell / ggchappell@alaska.edu