Ported fast Cray libm routines now available !

Post by Joachim Wesner » Wed, 03 Jun 1998 04:00:00

Hello all,

a first version 0.90 of my port of the (scalar) benchlib routines by
Cray is now available at


It contains instructions to create a static library libfm.a that can be
linked with your executables (put it in /usr/lib). Temporary variables
are now stored on the stack instead of alongside the *constants*, so the
software could also be turned into a shared object in the future.

I made many other small fixes to the appearance of the source code, such
as eliminating commented-out code and unused variables, besides the
differences in assembler source format. In particular, trying to
automatically convert the block data subprograms into appropriate
assembler data statements while interchanging several lines etc. by a
simple patch seemed too complicated, or would have ended in a patch saying:
"Delete all original lines, insert the following new lines ..."

So in the end I decided to publish my complete final version together
with the original sources. Remember, the software is still copyrighted by
Cray; see the file reply.cray for a first promising statement by one of
the developers of the code about its use.

libfm 0.90 contains the following functions:

double sin(double x);
double cos(double x);
double tan(double x);   /* derived using cossin() below */
struct complex {double re, im;} cossin(double x);
        /* simultaneous cos()/sin() at a running time only slightly
           longer than cos() ! */

double exp(double x);
double log(double x);
double log10(double x); /* derived from log() */
double sqrt(double x);
double sqrti(double x); /* Somewhat faster version of 1./sqrt(x) */

double powr(double x, double y);
        /* Less accurate, but MUCH faster 'quick and dirty' variant of
           pow() that also does no special handling of integer
           If you like it to completely replace pow(), change all
           to powr in file powr.S and recompile */

/* Float precision variants that even squeeze out a bit more running
   However, these routines turn out to be only marginally faster on my
   LX 533, see the file 'timings' */

double exp32(double x);
double log32(double x);
double sqrt32(double x);

I hope you will like the code; please send any bug reports etc. to my
email address.

HOWEVER, better, fully GPL math routines will probably also become
available in the future on "another track":

Kazushige Goto was able to port the old sincos routine I had mentioned
before in this group to assembler and to improve it further; it now runs
faster than the above Cray routines, even for large arguments (sin() in


I will be at a conference until Saturday. I hope downloading will work OK
(this is the first time I am using my homepage). DON'T crash the server of
my ISP !!

Best wishes,

Joachim Wesner


1. Fast cray libm routines available !?


As I wrote before, somebody pointed out that Cray Research makes
available on some ftp-server
pretty cool Alpha math routines that would be a nice and much faster (up
to > 3 times) replacement for the routines in our libm, that somehow

I downloaded that stuff and, during the last weekend, was able to do a
preliminary port for use on alpha/linux and to prove that they do indeed
run great (see my previous posting regarding running times), so there is
definitely hope for better math routines under axp

Clearly, several people have now asked me for that patched version.
Indeed, the original routines at are freely downloadable, but they are
still copyright Cray Research, and I don't want to be sued by them in the
end if that stuff appears worldwide with Linux and they are not pleased.
I have not yet had time to ask them what they think.

If enough people say that it's OK to nevertheless release that stuff
("collective guilt" !?), I think I could post my preliminary, yet fully
working, version here.

Or would some kind of a patch file circumvent the problem ???
What about first releasing only the libfm.a file (fast math) until
the problem is resolved; would that be better ???

What do you all think ???

Joachim Wesner
