CS 301 - Homework 8
- Change the "bar" subroutine (which evaluates a quadratic
polynomial on the array "f") so it takes less than 1.0ns per
float. Don't change anything else. Don't worry about
roundoff, or about array lengths that aren't a multiple of 4.
You'll probably have to use x86 SSE instructions.
(Executable NetRun Link)
enum {n=1000};
float f[n+1];
float a=0.2,b=0.3,c=0.4;
int bar(void) { // YOU MAY ONLY CHANGE THE BAR ROUTINE!
    for (int i=0;i<n;i++)
        f[i]=(a*f[i]+b)*f[i]+c;
    return 0;
}
int foo(void) {
    printf("bar: %.2f ns/float\n",time_function(bar)/n*1.0e9);
    farray_fill(f,n,1.0); bar();
    return farray_checksum(f,n,1.0);
}
- Your buddy wrote this x86 SSE code, but it's now segfaulting. Change "bar" to do the same thing, but without segfaulting.
(Executable NetRun Link)
#include <xmmintrin.h>
enum {n=1000};
float f[n+1];
int bar(void) { // YOU MAY ONLY CHANGE THE BAR ROUTINE!
    __m128 v=_mm_load_ps(&f[22]);
    v=_mm_mul_ps(v,v);
    float ret;
    _mm_store_ss(&ret,v);
    return (int)ret;
}
int foo(void) {
    farray_fill(f,n,1.0); bar();
    return bar();
}
- You're working for the huge game company Ego(tm). Ego, close to
bankruptcy, just released Earth XXVI (the latest version of their most
popular game) on a very tight production schedule. The schedule
was so tight that due to various miscommunications most of the 90GB of
game art is horribly screwed up. Rather than recall and reprint
all 18 game DVDs, Ego(tm) wants to release a small patch to correct the
game art on-the-fly at display time. Your job is to write
ARB_fragment_program code to correct the art. For example,
texture[3] is almost right, except the colors are all backwards--each
color component has values from 1 to 0, instead of 0 to 1. You
can see the texture with this small OpenGL program: (Executable NetRun Link)
TEX out,in,texture[3],2D;
Change this fragment program so the art is displayed properly, with
colors from 0 to 1 (black text on a white background)--i.e., where the
texture contains color c, you should display color 1.0-c. The photo of the demon, er, professor should then look like an ordinary photo.
- texture[4] was defaced by vandals, who broke into Ego's servers
just before the release. Luckily, they only destroyed the red and
blue channels--the green channel is still OK. Copy the texture's
green channel out to all the other channels--the result will be
greyscale, but it won't be as embarrassing as the vandalized version.
- texture[5] is the result of a malfunction in Ego's "rot-20" Content
Protection System, used to prevent crackers, pirates, and terrorists
from stealing Ego's Intellectual Property(tm). The system works
by cyclically rotating each line by a fixed amount. The company
cryptographers say this malfunction can be compensated for by adjusting
the input texture coordinates, changing coordinates (x,y) to
coordinates (x+20.0*y,y)--OpenGL's texture repeat will automatically
take care of the rest. If this works, you should see the actual
end-game image instead of weird noise.
For problems 1 and 2, read the SSE and SIMD lecture notes. You shouldn't need anything other than loads and stores and basic arithmetic.
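As a reminder of what "loads and stores and basic arithmetic" look like with SSE intrinsics, here is a minimal sketch that adds two float arrays four elements at a time. It is not a solution to either problem: the helper name add_arrays_sse and its parameters are made up for illustration, and it assumes the element count is a multiple of 4. It uses the unaligned forms _mm_loadu_ps and _mm_storeu_ps, which accept any address; the aligned forms _mm_load_ps and _mm_store_ps require a 16-byte-aligned address.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, _mm_add_ps, ... */

/* Hypothetical helper (not part of the homework): out[i] = x[i] + y[i].
   Assumes count is a multiple of 4. */
void add_arrays_sse(const float *x, const float *y, float *out, int count)
{
    for (int i = 0; i < count; i += 4) {
        __m128 vx = _mm_loadu_ps(&x[i]);   /* load 4 floats from any address */
        __m128 vy = _mm_loadu_ps(&y[i]);   /* load 4 more floats */
        __m128 sum = _mm_add_ps(vx, vy);   /* 4 additions in one instruction */
        _mm_storeu_ps(&out[i], sum);       /* store the 4 results */
    }
}
```

The same load, compute, store pattern applies to multiplies (_mm_mul_ps), and a constant can be broadcast into all four lanes with _mm_set1_ps.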
For problems 3-5, read the graphics card lecture notes and see the ARB_fragment_program cheat sheet.
You shouldn't need anything other than the TEX call above, ADD, MUL,
MOV, swizzling, and writemasks. You'll know you're done when you
have a white background, black text listing the problem number, and a
reasonable-looking photo.
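To see why texture repeat "takes care of the rest" in problem 5, it can help to model the arithmetic on the CPU. The sketch below is plain C, not fragment program code, and the function names wrap_repeat and shifted_x are invented for illustration. It assumes normalized texture coordinates and a repeat wrap mode that keeps only the fractional part of each coordinate.

```c
/* Hypothetical model of repeat wrapping: keep only the fractional part.
   Assumes the coordinate fits in an int. */
float wrap_repeat(float coord)
{
    float whole = (float)(int)coord;    /* truncates toward zero */
    if (whole > coord) whole -= 1.0f;   /* adjust negatives so this is a floor */
    return coord - whole;
}

/* The coordinate change from problem 5, (x,y) -> (x+20.0*y, y),
   followed by the wrap the texture unit would apply to x. */
float shifted_x(float x, float y)
{
    return wrap_repeat(x + 20.0f * y);
}
```

For instance, at (x,y) = (0.25, 0.125) the adjusted x is 2.75, which wraps back to 0.75, so the lookup still lands inside the texture.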
As usual, you'll turn these problems in by just naming them HW8_1, HW8_2, HW8_3, etc. in NetRun.
Problems are due Thursday, December 8, at NOON--I'm tired of NetRun crashing when I'm not there to fix it.
O. Lawlor, ffosl@uaf.edu