Optimization: Speed of Operations

CS 301 Lecture, Dr. Lawlor, 2005/10/31

The simplest possible performance model is the constant model:
    t = C
That is, the time t taken to do something is always a constant, C.

The next-simplest model is the linear model:
    t(n) = A + B n
For example, the time t(n) taken to execute most loops is just the loop startup time, A, plus some fixed time B for each of the n iterations of the loop.  To find A, we can just run a loop with no iterations.  To find B, we can run the loop many times (big n), and then B = (t(n)-A)/n.

Here's an example where we use the netrun builtin timing routine "time_function" to experimentally determine A and B:
(Executable NetRun Link)
int rep=0; /* Number of repetitions for do_it to execute*/
int do_it(void) {
unsigned int i, max=rep, sum=0;
for (i=0;i<max;i++) sum=sum*(i+1);
return (int)sum;
}

int foo(void) {
rep=0;
double A=time_function(do_it); /* time to do no iterations */

int n=100000;
rep=n;
double B=(time_function(do_it)-A)/n; /* time per iteration */

printf("Per call: %.2f ns. Per iteration: %.2f ns\n",
A*1.0e9, B*1.0e9);
return 0;
}
We can now do anything we want inside do_it, and see how fast it is.  Here are the results for the default NetRun x86 machine, a 2.8GHz (0.36ns/clock) Pentium 4.  The loop startup time A is always about 5 ns.
Time per iteration B
Operations:
0.8ns
Add, Subtract, AND, OR, XOR, NOT, bitshift (by a constant)
1.0ns
bitshift by a non-constant: sum=sum>>i;
3.6ns
Multiply: sum=sum*(i+1);
24ns
Divide: sum=sum/(i+1);
about 100 ns
pow, log, exp, sqrt, sin, cos, malloc/free
about 200 ns
fprintf (to a file; printing to the screen is >10x slower)
thousands of ns
fopen/fclose, network connections,