One of the biggest reasons to even think about computer architecture or assembly language is performance: if you understand what the machine is doing, it's a lot easier to do it faster.
Speed is in the eye of the beholder--in this case, the user. But there's a big disconnect between computer speed and human speed:

Time          Seconds           Examples
31 years      Gs (10^9)         Complete life cycle of a successful multi-person software project.  North America drifts a few feet.
2 weeks       Ms (10^6)         Write, debug, and test a simple single-person program.  Set up billing for a commercial job.
15 minutes    ks (10^3)         Sysadmin arrives to reboot server.  Write a trivial script or one-off program.  Shadow of a 30-foot-tall building travels one foot.
second        s (1)             Humans can respond to input.
millisecond   ms (10^-3)        Send data across a fast network.  Start a seek on a fast hard disk.  Bullet travels about 1 foot.
microsecond   us or μs (10^-6)  Print to the screen (printf or cout).  Call the operating system.  Blink an LED.
nanosecond    ns (10^-9)        Execute a few instructions.  Call a function.  1 clock cycle, at a 1 GHz clock rate.  Light travels about 1 foot.
Because computers are so fast, and humans are so slow, you can't find the slow parts of a program just by watching it run--you have to measure. The simplest tool for figuring out what's slow is a timer: a function that returns the current real time ("wall-clock" time). There are much more sophisticated tools, like a performance profiler, which samples the execution of the program to see what's running often.
It's a really bad idea to try to optimize code without being able to measure its performance. You really need to be scientific when running performance experiments: at least half of the "optimizations" I try don't help measurably. Some actually slow the program down!
There are TONS of ways to get some notion of time in your programs:
#include <chrono>

/* Return the current wall-clock time in seconds, via C++11 std::chrono. */
double get_time() {
	return 1.0e-9*std::chrono::duration_cast<std::chrono::nanoseconds>(
		std::chrono::high_resolution_clock::now().time_since_epoch()
	).count();
}

double foo() {
	double t1=get_time();
	double t2=get_time();
	return t2-t1; /* two back-to-back calls: this measures the timer's own overhead */
}
NetRun has a built-in function called "time_in_seconds", which returns a double giving the current time in seconds. Here's the implementation (from project/lib/inc.c):
/** Return the current time in seconds (since something or other). */
#if defined(WIN32)
#  include <sys/timeb.h>
double time_in_seconds(void) { /* This seems to give terrible resolution (60ms!) */
	struct _timeb t;
	_ftime(&t);
	return t.millitm*1.0e-3+t.time*1.0;
}
#else /* UNIX or other system */
#  include <sys/time.h> /* for gettimeofday */
double time_in_seconds(void) { /* On Linux, this is microsecond-accurate. */
	struct timeval tv;
	gettimeofday(&tv,NULL);
	return tv.tv_usec*1.0e-6+tv.tv_sec*1.0;
}
#endif
As usual, there's one version for Windows, and a different version for everything else.
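On a C++11 or later compiler, you could sidestep the #ifdef entirely. Here's a sketch of a portable alternative (not the NetRun implementation) using std::chrono::steady_clock, a monotonic clock that can't jump backward if the system time is adjusted mid-measurement:

#include <chrono>

/* Portable C++11 timer: a sketch, not NetRun's actual code. */
double time_in_seconds(void) {
	std::chrono::duration<double> d=
		std::chrono::steady_clock::now().time_since_epoch();
	return d.count(); /* seconds since the clock's (unspecified) epoch */
}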
#include <iostream>
#include <fstream>

int foo(void) {
	double t_before=time_in_seconds();
	std::ofstream f("foo.dat"); /* code to be timed goes here */
	double t_after=time_in_seconds();
	double elapsed=t_after - t_before;
	std::cout<<"That took "<<elapsed<<" seconds\n";
	return 0;
}
The problems are that:
1. The timer call itself takes time, so very short operations are swamped by the timer's own overhead.
2. The timer is quantized--it only ticks every so often--so anything faster than one tick barely registers.
3. The timer is noisy: the operating system can interrupt your program partway through, making some runs anomalously slow.
Problems 1 and 2 can be cured by running many iterations--"scaling up" many copies of the problem until it registers on your (noisy, quantized) timer. So instead of doing this, which incorrectly claims "x+=y" takes 700ns:
int foo(void) {
	int x=3, y=2;
	double t1=time_in_seconds();
	x+=y; /* the operation we're trying to time */
	double t2=time_in_seconds();
	std::cout<<(t2-t1)*1.0e9<<" ns\n";
	return x;
}
You'd do this, which shows the real time, 0.5ns!
int foo(void) {
	int x=3, y=2, n=1000000;
	double t1=time_in_seconds();
	for (int i=0;i<n;i++) x+=y; /* run n copies of the operation */
	double t2=time_in_seconds();
	std::cout<<(t2-t1)/n*1.0e9<<" ns\n";
	return x;
}
(If you get 0.0 nanoseconds, the compiler has converted the loop into x+=n*y. Add "volatile" to the variable declarations to scare it away.)
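For example, here's the same loop with volatile added (a sketch; expect a somewhat higher number, since the compiler must now perform a real load and store on every iteration):

int foo(void) {
	volatile int x=3, y=2; /* volatile: every read and write must really happen */
	int n=1000000;
	double t1=time_in_seconds();
	for (int i=0;i<n;i++) x+=y; /* can no longer be collapsed into x+=n*y */
	double t2=time_in_seconds();
	std::cout<<(t2-t1)/n*1.0e9<<" ns\n";
	return x;
}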
Problem 3 can be cured by running the above several times, and throwing out the anomalously high values (which were probably interrupted).
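Here's a sketch of that fix: repeat the whole measurement several times and report the minimum, on the theory that the slow runs were the interrupted ones. The trial count of 10 is an arbitrary choice:

int foo(void) {
	int x=3, y=2, n=1000000;
	double best=1.0e30; /* lowest per-iteration time seen so far */
	for (int trial=0;trial<10;trial++) {
		double t1=time_in_seconds();
		for (int i=0;i<n;i++) x+=y;
		double t2=time_in_seconds();
		double per=(t2-t1)/n;
		if (per<best) best=per; /* keep the minimum: high values were probably interrupted */
	}
	std::cout<<best*1.0e9<<" ns\n";
	return x;
}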
Your other main tool for performance analysis is a "profiler". This is a library that keeps track of which function is running, and totals up the number of function calls and time spent in each one. The "Profile" checkbox in NetRun will run a profiler, and show you the top functions it found in your code.
Both timing and profiling have some problems, though: a timer tells you only the total elapsed time, not where it went, while a profiler's bookkeeping adds overhead and its sampling can misattribute time in very short functions.
NetRun has a nice little built-in function called print_time that takes a string and a function. It times the execution of (many iterations of) the function, then divides by the number of iterations to print out exactly how long each call to the function takes. The "Time" checkbox in NetRun just calls print_time on the default foo function.
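The real print_time lives in project/lib/inc.c; here's a rough sketch of the idea (the exact signature and the fixed iteration count are my assumptions, not NetRun's actual code):

#include <iostream>

/* Hypothetical print_time-style helper: run fn many times,
   then report the average cost of one call. */
void print_time(const char *name, int (*fn)(void)) {
	int n=10000; /* arbitrary iteration count for this sketch */
	double t1=time_in_seconds();
	for (int i=0;i<n;i++) fn();
	double t2=time_in_seconds();
	std::cout<<name<<": "<<(t2-t1)/n*1.0e9<<" ns/call\n";
}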
To use the NetRun timer functionality on your own code, hit "Download this file as a .tar archive". My timer stuff is in project/lib/inc.c.