: Here is a very fast integer square root algorithm written by Lawrence Kirby:

: static unsigned fred_sqrt(unsigned long x)

: {

: static const unsigned char sqq_table[] = {

[ snipped for space, see references or email me ]

I was doubtful that an algorithm with a table lookup

would work all that well on modern memory-bound CPUs, so

I wrote a "test harness" to compare it against the "vanilla"

algorithm:

#define N_BITS 32

#define MAX_BIT ((N_BITS+1)/2-1)

unsigned int isqrt(unsigned long x)

{

register unsigned long xroot, m2, x2;

xroot = 0;

m2 = 1L<<(MAX_BIT * 2); /* Start looking at MSB of input register */

do {

x2 = xroot + m2; /* Form a "guess" from current root and next bit */

xroot >>= 1; /* Pre-shift to make add below neater */

if(x2 <= x) { /* If our guess is still low,... */

x -= x2; /* Reduce input by root-so-far */

xroot += m2; /* Add the bit that succeded to root */

}

} while(m2 >>= 2); /* Shift "mask" down until it disappears */

if(xroot < x) ++xroot;

return(xroot);

Quote:}

To my surprise (and temporary delight), fred_sqrt is about

50% faster on a PPro (gcc -O2), and 3X faster on an SGI Indy

(MIPS CC -O2). Then that "little voice" said: "But does it get

the right answer?" And the answer is "Only if you consider

truncating to be the right answer". On a test of 100000

random (from rand(), seeded by srand(42)) numbers, fred_sqrt

returned a lower answer than isqrt almost half the time, and

in every case, squaring isqrt's answer produced a value closer

to the original number than squaring fred_sqrt's answer.

FYI. "Trust, but verify" :-)

Mike