Hi, libffm and BLAS users,

I have made a libffm patch that adds handling of exceptional values, and also some optimized BLAS routines.

1. Exceptional-value handling patch for libffm-0.21.

The current version of libffm is 0.21, but its routines cannot handle exceptional values (NaN, +-Inf, subnormal). This patch lets them handle such values. For example, the sqrt() routine now computes subnormal values exactly (it is not emulation, so it is only a little slower than for normal values).

Most routines are as fast as before, but the exp() routine is a little slower.

## The atan/asin/acos routines are not finished yet. ##

Also, I added some (maybe?) useful routines:

sqrti : calculates 1.0/sqrt(); it's fast.

sqrtv/sqrtiv : vectorized sqrt/sqrti routines. They take only 17 clocks per element.

For now, I have made only the double-precision version for C. If you want a single-precision version or a FORTRAN version, please let me know.

ATTENTION!

This patch is for TEST ONLY, so please do not include it in your packages. Please wait until the next public release.

See:

ftp://ftp.eni.co.jp/.2/Linux-Alpha-JP/ftp.statabo.rim.or.jp/libffm

2. Optimized BLAS routines.

Some optimized BLAS routines are available (the figures below were measured on a 600MHz 21164 LX machine with 2MB L3 cache).

sgemm/dgemm : a steady 960/820 MFlops.

Can you hear "Alpha resonance"? I do not know why,

but I can hear a kind of resonance from 21164.

sgemv/dgemv : if the data is in cache, they run at about 700 MFlops.

sdot, ddot, dsdot, zdotu, zdotc, cdotu, cdotc :

pretty fast, but I do not know the exact figures (maybe 650 to 700 MFlops).

saxpy/daxpy : joke :-)
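For readers who want to reproduce MFlops figures like the ones above, the usual convention is to count 2*m*n*k floating-point operations for a matrix multiply and divide by the elapsed time. The sketch below shows that arithmetic together with a naive reference dgemm. Both function names are hypothetical, the reference handles only the non-transposed, column-major case with tightly packed matrices, and it is not the optimized routine. By this count, a 1000x1000x1000 dgemm running at 820 MFlops would take roughly 2.4 seconds.

```c
#include <stddef.h>

/*
 * Naive reference: C = alpha*A*B + beta*C, column-major storage,
 * A is m x k, B is k x n, C is m x n, no transposition, with the
 * leading dimensions assumed equal to the row counts.
 */
void ref_dgemm(size_t m, size_t n, size_t k, double alpha,
               const double *a, const double *b,
               double beta, double *c)
{
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < m; i++) {
            double sum = 0.0;
            for (size_t p = 0; p < k; p++)
                sum += a[i + p * m] * b[p + j * k];
            c[i + j * m] = alpha * sum + beta * c[i + j * m];
        }
    }
}

/* MFlops for one m x n x k multiply that took `seconds` to run. */
double gemm_mflops(size_t m, size_t n, size_t k, double seconds)
{
    return 2.0 * (double)m * (double)n * (double)k / seconds / 1.0e6;
}
```

Time the call with whatever clock your system provides and pass the elapsed seconds to gemm_mflops(); repeating the multiply several times gives a more stable figure.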

See:

ftp://ftp.eni.co.jp/.2/Linux-Alpha-JP/ftp.statabo.rim.or.jp/BLAS

Now I'm working on the caxpy/daxpy, cgemv and zgemv routines. They should be available by next week.

Thanks,