Hi, this is an announcement of a BLAS library and new LibFFM routines for Alpha.
This is an optimized BLAS library for Alpha, including ...
1. Level 1, Level 2, Level 3
2. Some extended Level 1 routines
3. Compaq's extended routines (GEMA, GEMS, GEMT)
4. Some LAPACK routines (LASWP, GETF2, GETRF, GETRS)
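
For readers who have not used the LAPACK routines in item 4, here is a minimal
sketch of what they compute: LU factorization with partial pivoting, then a
solve. It is plain Python for illustration only, not the library's code. The
function names getf2_sketch and getrs_sketch are hypothetical, the matrix is
assumed square and stored as a list of rows, and pivots are 0-based (real
LAPACK is column-major with 1-based pivots).

```python
def getf2_sketch(a):
    """Unblocked LU with partial pivoting, in place (the job of GETF2).

    On return, a holds L below the diagonal (unit diagonal implied) and U
    on and above it. Returns the list of 0-based pivot rows.
    """
    n = len(a)
    ipiv = []
    for k in range(n):
        # Pick the row with the largest magnitude entry in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        ipiv.append(p)
        # Row interchange: this is what LASWP applies to a block of columns.
        a[k], a[p] = a[p], a[k]
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]                  # multiplier, stored in L
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]    # rank-1 update (GER-like)
    return ipiv


def getrs_sketch(lu, ipiv, b):
    """Solve A x = b from the factors above (the job of GETRS)."""
    n = len(lu)
    x = list(b)
    # Apply the recorded row interchanges to the right-hand side.
    for k, p in enumerate(ipiv):
        x[k], x[p] = x[p], x[k]
    # Forward substitution with unit lower-triangular L.
    for i in range(n):
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    # Back substitution with upper-triangular U.
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


if __name__ == "__main__":
    a = [[2.0, 1.0, 1.0], [4.0, -6.0, 0.0], [-2.0, 7.0, 2.0]]
    b = [5.0, -2.0, 9.0]
    piv = getf2_sketch(a)
    print(getrs_sketch(a, piv, b))
```

In reference LAPACK, GETRF is the blocked version of the same factorization:
it runs GETF2 on a narrow panel and updates the rest of the matrix with
Level 3 calls, so a fast GEMM speeds it up directly.
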
The Level 1, GEMV, GER, and GEMM (Level 3) routines are written in assembler.
In particular, the Level 3 GEMM routine performs near theoretical peak
performance (SGEMM: 94.5% of peak, DGEMM: 92.5% of peak, on a 667MHz
21264 with DDR cache). Small matrix performance is also much better
than before (faster than ATLAS and a plain unrolled assembler routine).
Other features ...
1. SMP systems are supported. The Linpack peak performance on
4 CPUs is 4430 MFlops (83% of peak).
2. Works on Linux/Alpha and Tru64 UNIX. I have not verified
whether it works on *BSD.
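
The "percent of peak" figures above can be checked with a little arithmetic.
The sketch below assumes a peak of 2 floating-point operations per cycle per
CPU (one add plus one multiply); that rate is an assumption not stated in this
announcement, though it is consistent with the 83% Linpack figure. The helper
names are hypothetical.

```python
CLOCK_MHZ = 667.0        # from the announcement: 667MHz 21264
FLOPS_PER_CYCLE = 2.0    # ASSUMPTION: one FP add + one FP multiply per cycle


def peak_mflops(n_cpus=1):
    """Theoretical peak in MFlops for n_cpus processors."""
    return CLOCK_MHZ * FLOPS_PER_CYCLE * n_cpus


def fraction_of_peak(measured_mflops, n_cpus=1):
    """Measured rate as a fraction of theoretical peak."""
    return measured_mflops / peak_mflops(n_cpus)


def achieved_mflops(fraction, n_cpus=1):
    """MFlops implied by a given fraction of peak."""
    return fraction * peak_mflops(n_cpus)


if __name__ == "__main__":
    print("1 CPU peak  : %.0f MFlops" % peak_mflops(1))
    print("4 CPU peak  : %.0f MFlops" % peak_mflops(4))
    print("Linpack 4CPU: %.1f%% of peak" % (100 * fraction_of_peak(4430.0, 4)))
    print("SGEMM       : %.0f MFlops" % achieved_mflops(0.945))
    print("DGEMM       : %.0f MFlops" % achieved_mflops(0.925))
```

Under that assumption, 4 CPUs peak at 5336 MFlops, so 4430 MFlops is about
83%, matching the number quoted above.
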
I've just restarted development of the Free Fast Math library. The goal
is "faster and more accurate". I think it's really difficult, but
I'll try. Anyway, I've made new SIN/COS/TAN routines. The SIN/COS
routines are much more accurate and faster than before (I will use
polynomial functions, not table algorithms). TAN routine is slightly
# I don't know of anyone who wants a new libFFM. I will also make
# vectorized routines compatible with Compaq's VLIB (CXML).
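
To illustrate the polynomial approach to SIN mentioned above (as opposed to a
table lookup), here is a minimal sketch. It is not the libFFM code: it uses
plain Taylor coefficients and a naive range reduction, both assumptions made
for illustration, where a tuned library would more likely use minimax
coefficients and a more careful reduction. The name poly_sin is hypothetical.

```python
import math

# Coefficients of sin(r)/r as a polynomial in r*r, highest degree first.
# ASSUMPTION: Taylor coefficients through r**15; not libFFM's coefficients.
_SIN_COEFFS = [(-1) ** n / math.factorial(2 * n + 1) for n in range(7, -1, -1)]


def poly_sin(x):
    """Approximate sin(x) with a polynomial, no lookup table."""
    # Range reduction: x = k*pi + r with |r| <= pi/2, and
    # sin(x) = (-1)**k * sin(r). This naive reduction loses accuracy
    # for very large |x|.
    k = round(x / math.pi)
    r = x - k * math.pi
    # Evaluate the polynomial in r*r with Horner's rule.
    r2 = r * r
    acc = 0.0
    for c in _SIN_COEFFS:
        acc = acc * r2 + c
    result = r * acc
    return -result if k % 2 else result


if __name__ == "__main__":
    for v in (0.5, 1.0, 2.0, -3.0, 10.0):
        print(v, poly_sin(v), math.sin(v))
```

Horner's rule maps well onto a pipelined FPU because each step is one
multiply and one add, with no memory lookups beyond the coefficients.
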
The sources are available at