> > Please look at the program tsin.c on the public FTP site mentioned
> > below in my signature. In the comments section at the top is a
> > table of errors on sin(355.0) for many compilers. That should give
> > you an idea on which compiler and library vendors care about good
> > (nearly correct) results, and which don't know (or don't care) how
> > bad their math libraries are. Many Intel x87 based libraries use
> > the fsin hardware instruction, which for the machines I have results
> > for, gets 7.09 bits wrong, or 135 ULP error.

> This is a bit harsh imho:
> The Pentium FSIN is clearly documented (at least in Peter Tang's papers)
> to be used on arguments that have been reduced to the +/- pi range.

Here are the results I get from testing an Intel Pentium 4.

In the following, FUT means Function Under Test.

Using 80-bit long doubles gets:

Test vector 128: FUT not close enough: +3.440707728423067690000e+17 ulp error
Input arg=4000c90fdaa22168c234=+3.141592653589793238300e+00
Expected =3fc0c4c6628b80dc1cd1=+1.666748583704175665640e-19
Computed =3fc0c000000000000000=+1.626303258728256651010e-19

Test vector 129: FUT not close enough: +1.376283091369227077000e+18 ulp error
Input arg=4000c90fdaa22168c235=+3.141592653589793238510e+00
Expected =bfbeece675d1fc8f8cbb=-5.016557612668332023450e-20
Computed =bfbf8000000000000000=-5.421010862427522170040e-20

Using 64-bit doubles gets:

Test vector 64: FUT not close enough: +1.64065729543000000e+11 ulp error
Input arg=400921fb54442d18=+3.14159265358979312e+00
Expected =3ca1a62633145c07=+1.22464679914735321e-16
Computed =3ca1a60000000000=+1.22460635382237726e-16

Test vector 65: FUT not close enough: +8.20328647710000000e+10 ulp error
Input arg=400921fb54442d19=+3.14159265358979356e+00
Expected =bcb72cece675d1fd=-3.21624529935327320e-16
Computed =bcb72d0000000000=-3.21628574467824890e-16
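
The ulp figures for the 64-bit cases can be checked by hand from the hex patterns: for two finite doubles of the same sign, the number of ulps between them is the difference of their bit patterns read as integers. A minimal sketch in C follows. The name ulp_distance_bits is made up for illustration, and counting in double-precision ulps is an assumption about how the test harness reports its error.

```c
#include <assert.h>
#include <stdint.h>

/* Distance, in units in the last place, between two finite doubles given
   as raw IEEE-754 bit patterns. Only meaningful when both values have the
   same sign, which the assert enforces. */
uint64_t ulp_distance_bits(uint64_t expected, uint64_t computed)
{
    const uint64_t magnitude_mask = UINT64_C(0x7fffffffffffffff);

    /* Same-sign precondition: the top (sign) bits must agree. */
    assert(((expected ^ computed) >> 63) == 0);

    uint64_t e = expected & magnitude_mask;
    uint64_t c = computed & magnitude_mask;
    return e > c ? e - c : c - e;
}
```

Feeding it the Expected and Computed patterns of test vector 64 (0x3ca1a62633145c07 and 0x3ca1a60000000000) gives 164065729543, which agrees with the +1.64065729543e+11 reported above.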

Here is the Intel documentation on accuracy of FSIN:

IA-32 Intel® Architecture Software Developer's Manual
Volume 1: Basic Architecture, Order Number 245470-006
PROGRAMMING WITH THE X87 FPU
8.3.10. Transcendental Instruction Accuracy

New transcendental instruction algorithms were incorporated into the
IA-32 architecture beginning with the Pentium processors. These new
algorithms (used in transcendental instructions (FSIN, FCOS, FSINCOS,
FPTAN, FPATAN, F2XM1, FYL2X, and FYL2XP1) allow a higher level of
accuracy than was possible in earlier IA-32 processors and x87 math
coprocessors. The accuracy of these instructions is measured in terms
of units in the last place (ulp). For a given argument x, let f(x) and
F(x) be the correct and computed (approximate) function values,
respectively. The error in ulps is defined to be:

... formula would not cut and paste from PDF file ...
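
For reference, the definition at this point in the manual reads, as best it can be reconstructed here (treat the exact typesetting as an assumption, and check it against the PDF):

```latex
\[
  \text{error} \;=\; \left| \frac{f(x) - F(x)}{2^{\,k-63}} \right| ,
  \qquad \text{where $k$ is an integer such that } 1 \le 2^{-k} f(x) < 2 .
\]
```

With a 64-bit significand, 2^(k-63) is the weight of the last bit of a result whose leading bit has weight 2^k, so the quotient counts extended-precision ulps.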

With the Pentium and later IA-32 processors, the worst case error on
transcendental functions is less than 1 ulp when rounding to the
nearest (even) and less than 1.5 ulps when rounding in other
modes. The functions are guaranteed to be monotonic, with respect to
the input operands, throughout the domain supported by the
instruction.

The instructions FYL2X and FYL2XP1 are two operand instructions and
are guaranteed to be within 1 ulp only when y equals 1. When y is not
equal to 1, the maximum ulp error is always within 1.35 ulps in round
to nearest mode. (For the two operand functions, monotonicity was
proved by holding one of the operands constant.)

The only reference I can find to the domain of FSIN is:

8.1.2.2 Condition Code Flags

The FPTAN, FSIN, FCOS, and FSINCOS instructions set the C2 flag to 1
to indicate that the source operand is beyond the allowable range of
2**63 and clear the C2 flag if the source operand is within the
allowable range.

> Due to the 80-bit extended format, it will stay within 0.5 for double
> arguments of a somewhat larger range, but using on an exact integer
> value that just happens to be _very_ close to N * pi.

The results I am getting do not look like 0.5 ULP for doubles near pi.

The only FPU I know of (from personal testing) that gets around 0.5 ULP
accuracy over the full FSIN input domain of -2**63 to +2**63 is the
AMD K5, done in 1995, designed by Tom *. I believe that it takes
around 190 bits of pi to do correct argument reduction for those values.
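
To see how the number of bits of pi used in argument reduction shows up in the results above, here is a rough sketch in C. It rests on two assumptions that are not in the manual text quoted above: that the hardware reduces with a pi constant of 66 bits (as Intel describes the x87 internal pi elsewhere), and that for x within a few ulps of pi, sin(x) can be taken as pi - x. The constant and function names are made up for this sketch.

```c
#include <assert.h>

/* Double nearest to pi, plus two candidate "tails" holding bits of pi
   below it. PI_TAIL_FULL is the next 53 bits of pi; PI_TAIL_66 keeps only
   the 13 bits that fit in a 66-bit pi (53 + 13 = 66). */
static const double PI_HEAD      = 0x1.921fb54442d18p+1;
static const double PI_TAIL_FULL = 0x1.1a62633145c07p-53;
static const double PI_TAIL_66   = 0x1.1a6p-53;

/* For x within a few ulps of pi, sin(x) = sin(pi - x) ~= pi - x, because
   the cubic term is far below double precision. The head subtraction is
   exact for such x, so the tail decides how many result bits are right. */
double sin_near_pi(double x, double pi_tail)
{
    double d = PI_HEAD - x;               /* exact when x is this close */
    assert(d > -0x1p-40 && d < 0x1p-40);  /* sketch is only valid near pi */
    return d + pi_tail;
}
```

With PI_TAIL_66, the two double inputs from test vectors 64 and 65 come out as 0x1.1a6p-53 and -0x1.72dp-52, which are the "Computed" bit patterns shown earlier. With PI_TAIL_FULL, the first input reproduces the "Expected" value, and the second lands within one ulp of it because pi has still more bits below the tail. Covering every argument up to 2**63 needs correspondingly more bits of pi.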

---

Fred J. Tydeman Tydeman Consulting

+1 (775) 287-5904 Vice-chair of J11 (ANSI "C")

Sample C99+FPCE tests: ftp://jump.net/pub/tybor/

Savers sleep well, investors eat well, spenders work forever.