Here are the "best of" HW#1 solutions and a guide to what you need to do for HW#4.

Some general comments

Specific example: on a 300 MHz MIPS R12000, averaging 4 runs gives 58.78 ns (wall clock) and 35.44 ns (user time) per square root calculated with sqrt().

Notice the jump from 4 to 5 digits. Is 5 "just a little on the high side", or would 6 fall back onto the slope set by 1-4? This tells you to try 6, 7, maybe even 8 digits, until you know what happens. Multiple runs at each precision would also give us a better idea of the "curve": is there really a 4x difference between wall-clock and user time between 2 and 3 digits of precision?

With a sample size of 1 at each point, it looks like the LUT is 2x as fast as sqrt() for 1-4 digits of precision. We see a crossing near 5 digits and a dramatic change (as we should have predicted) at 6 digits of precision. To state this with more certainty, we need to take several samples at each data point and compute a confidence interval.

A different CPU (AMD), and I would say this one doesn't have hardware support for sqrt(). Because of this, the LUT is 3-4 times faster than sqrt() for 1-4 digits of precision. I'd say it is still 2-3 times faster at 5 digits, but without an estimate of measurement error, that could be random error. Again, confidence bands (plot points c1 and c2 for each data point and draw lines connecting the c1s and the c2s) would give us a better idea.