Hi,

Kazushige Goto and I are now releasing version 0.21 of our free fast
math routines, which are meant to eventually replace libm. These
routines are based on work I did years ago for another RISC CPU, but
they now use different (*better*!) approximations and some other ideas
that grew in the meantime. Kazushige Goto again did a really great job
optimizing the assembler code, "vectorizing" the polynomial evaluation
code and improving instruction scheduling to make the code as fast as
it is now!
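
To illustrate what "vectorizing" a polynomial evaluation can mean, here
is a minimal C sketch. It is not taken from the libffm assembler
sources; it only contrasts Horner's rule (one long dependency chain)
with an Estrin-style split, which is one common way to expose
independent multiply-adds to a pipelined FPU. The function names
poly_horner and poly_estrin8 are hypothetical.

```c
#include <stddef.h>

/* Horner's rule: every step depends on the previous one.
   c[0] is the constant term, c[n-1] the leading coefficient. */
double poly_horner(const double *c, size_t n, double x)
{
    double r = 0.0;
    for (size_t i = n; i-- > 0; )
        r = r * x + c[i];
    return r;
}

/* Estrin-style evaluation of a degree-7 polynomial (8 coefficients).
   The four pair terms do not depend on each other, so their
   multiply-adds can overlap in the pipeline. */
double poly_estrin8(const double c[8], double x)
{
    double x2 = x * x;
    double x4 = x2 * x2;
    double p01 = c[0] + c[1] * x;
    double p23 = c[2] + c[3] * x;
    double p45 = c[4] + c[5] * x;
    double p67 = c[6] + c[7] * x;
    double lo = p01 + p23 * x2;   /* c0 + c1 x + c2 x^2 + c3 x^3 */
    double hi = p45 + p67 * x2;   /* c4 + c5 x + c6 x^2 + c7 x^3 */
    return lo + hi * x4;
}
```

Both functions compute the same polynomial, but they round in a
different order, so the last bits of the results can differ for general
coefficients.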

Some of the routines seem to run nearly twice as fast as the already
fast "Cray" routines I ported some weeks ago, at comparable accuracy.

This version 0.21 of libffm does not yet check for invalid arguments.
That is planned for the next release, along with sinh, cosh, tanh, and
a full-precision pow function, plus other small utility functions such
as fmod, ldexp, frexp and the like (time permitting!). Support for
ECOFF and profiling has already been added.
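
As a rough idea of what argument checking around a fast routine could
look like, here is a minimal C sketch. It is an assumption about the
general shape of such a check, not the planned libffm code. The names
checked_asin and fast_asin_unchecked are hypothetical, and the kernel
is only a crude placeholder (a few Taylor terms) so that the example is
self-contained.

```c
#include <errno.h>
#include <math.h>

/* Hypothetical placeholder for an unchecked fast kernel: a few Taylor
   terms of asin, accurate only for small |x|. A real kernel would go here. */
static double fast_asin_unchecked(double x)
{
    double x2 = x * x;
    return x * (1.0 + x2 * (1.0 / 6.0 + x2 * (3.0 / 40.0)));
}

/* Wrapper that rejects invalid arguments before calling the kernel. */
double checked_asin(double x)
{
    if (isnan(x))
        return x;                 /* propagate NaN unchanged */
    if (x < -1.0 || x > 1.0) {
        errno = EDOM;             /* report a domain error */
        return NAN;
    }
    return fast_asin_unchecked(x);
}
```

The check costs two comparisons on the fast path, which is the
trade-off between speed and safety that such a wrapper has to weigh.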

Further modifications could include (besides the argument checking)
fine-tuning the last bits of the constants used and the order of
evaluation, to minimize or compensate for the effect of rounding
errors.
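
The effect of evaluation order on rounding can be seen in a very small
C sketch. It assumes IEEE 754 double arithmetic with round-to-nearest
and no excess intermediate precision; the function names are
hypothetical and the example is unrelated to the actual libffm code.

```c
#include <float.h>

/* Sum left to right: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;   /* volatile: round to double here */
    return t + c;
}

/* Add the two small terms first: a + (b + c). */
double sum_small_first(double a, double b, double c)
{
    volatile double t = b + c;   /* volatile: round to double here */
    return a + t;
}
```

With a = 1.0 and b = c = DBL_EPSILON / 2, sum_left loses both small
terms and returns 1.0, while sum_small_first returns 1.0 + DBL_EPSILON.
Adding the small terms of a polynomial first works on the same
principle.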

The copyright has now been changed to the GNU Library GPL, as
requested.

The routines can be downloaded at

http://people.frankfurt.netsurf.de/Joachim.Wesner/libffm.0.21.tar.gz

See the README file for further instructions and details.

Approximate running times in microseconds per call, measured in a
tight loop with random arguments in 0..10 (0..1 for asin/acos) on a
533 MHz LX 21164 (n = 63):

            libffm   libfm   libm
sin          0.14    0.21    0.43
cos          0.15    0.21    0.43
tan          0.15    0.27    0.54
cotan        0.15    ----    ----
asin         0.21    ----    1.33
acos         0.21    ----    1.27
atan         0.19    ----    0.58
atan2        0.20    ----    0.72
log2         0.17    ----    ----
log          0.17    0.16    0.44
log10        0.17    0.22    0.53
exp2         0.10    ----    ----
exp          0.14    0.17    0.50
exp10        0.14    ----    ----
powr(x,n)    0.10    0.35    1.67 (pow)
powr(x,y)    0.35    0.35    1.67 (pow)
sqrt         0.13    0.13    0.19

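
For anyone who wants to take similar measurements, here is a minimal C
sketch of a tight timing loop over random arguments. It is not the
program that produced the numbers above. The helper names are
hypothetical, clock() resolution varies between systems, and square()
is only a stand-in for the routine under test.

```c
#include <stdlib.h>
#include <time.h>

/* Stand-in for the routine being timed (hypothetical). */
double square(double x)
{
    return x * x;
}

/* Fill buf with n pseudo-random doubles in [lo, hi]. */
void fill_random(double *buf, size_t n, double lo, double hi)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = lo + (hi - lo) * ((double)rand() / (double)RAND_MAX);
}

/* Average time per call of f in microseconds, over reps passes
   through the n prepared arguments. */
double time_per_call_us(double (*f)(double), const double *args,
                        size_t n, int reps)
{
    volatile double sink = 0.0;  /* keeps the calls from being optimized away */
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++)
        for (size_t i = 0; i < n; i++)
            sink += f(args[i]);
    clock_t t1 = clock();
    (void)sink;
    return 1e6 * (double)(t1 - t0) / (double)CLOCKS_PER_SEC
               / ((double)reps * (double)n);
}
```

To time a real routine, pass it in place of square, for example
time_per_call_us(sin, args, n, reps) after including math.h. The
measured time includes loop and call overhead, so it is only an upper
bound on the cost of the routine itself.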
Joachim