Is Pentium bad?

Post by Ljubinko Kond » Sun, 27 Nov 1994 02:37:09



A hot subject about Pentiums these days is the recently discovered
error in the floating point unit (article in NYT, Thursday, Nov. 24).
The example published there is as follows:
a=4195835
b=3145727

a-((a/b)*b) = ?

It should be zero, of course.  The result of the Pentium calculation
is 256!!!
So, I checked that (I use Fortran) and the result is 256 on my
Pentium (90 MHz).  Too bad. But not so bad. I compiled the same
program with O2 optimization (f77 -O2 ...) and the result this time is 0!
(as it should be).
The results are independent of single/double precision.

Does anybody have any idea how optimization is doing this?

 
 
 


Post by Paul Ba » Sun, 27 Nov 1994 05:49:00




>Hot subject about Pentiums these days is recently discovered
>error in floating point unit (article in NYT, Thursday, Nov. 24).
>Example published there is as follows:
>a=4195835
>b=3145727

>a-((a/b)*b) = ?

>It should be zero, of course.  The result of pentium calculation
>is 256!!!
>So, I checked that (I use fortran) and result is 256 on my
>Pentium (90 MHz).  Too bad. But not so bad. I compiled the same
>program with O2 optimization (f77 -O2 ...) Result this time is 0!
>(as it should be).
>The results are independent of single/double precision.

>Anybody have any idea how is optimization doing this?

You should post this to comp.sys.intel. I've been following the discussion
there, although I don't qualify to comment on it, and I've seen
no one mention that it worked, EVER, on pentiums with the bug. This
should start a new, interesting, thread.

Thanks for bringing this up.

--
Paul Bash  


 
 
 


Post by Dan P » Sun, 27 Nov 1994 07:00:35



>Hot subject about Pentiums these days is recently discovered
>error in floating point unit (article in NYT, Thursday, Nov. 24).
>Example published there is as follows:
>a=4195835
>b=3145727

>a-((a/b)*b) = ?

>It should be zero, of course.  The result of pentium calculation
>is 256!!!
>So, I checked that (I use fortran) and result is 256 on my
>Pentium (90 MHz).  Too bad. But not so bad. I compiled the same
>program with O2 optimization (f77 -O2 ...) Result this time is 0!
>(as it should be).
>The results are independent of single/double precision.

>Anybody have any idea how is optimization doing this?

How do you know the correct result is 0?  Have you used a pencil and
a sheet of paper to do all the calculations?

If you could figure out the correct result without doing any calculation,
a good optimizer could do the same thing, too.  If you have doubts,
look at the assembly code generated by the compiler:

    ues4:~/tmp 141> cat p5bug.c
    #include <stdio.h>
    main()
    {
        double a = 4195835, b = 3145727;
        printf("%f\n", a - (a / b) * b);
        return 0;
    }
    ues4:~/tmp 142> gcc -S p5bug.c
    ues4:~/tmp 143> cat p5bug.s
            .file   "p5bug.c"
    gcc2_compiled.:
    ___gnu_compiled_c:
    .text
    LC0:
            .ascii "%f\12\0"
            .align 4
    .globl _main
    _main:
            pushl %ebp
            movl %esp,%ebp
            subl $16,%esp
            call ___main
            movl $-1073741824,-8(%ebp)
            movl $1095762302,-4(%ebp)
            movl $-2147483648,-16(%ebp)
            movl $1095237631,-12(%ebp)
            fldl -8(%ebp)
            fdivl -16(%ebp)
            fmull -16(%ebp)
            fldl -8(%ebp)
            fsubp %st,%st(1)
            subl $8,%esp
            fstpl (%esp)
            pushl $LC0
            call _printf
            addl $12,%esp
            xorl %eax,%eax
            jmp L1
            .align 4,0x90
    L1:
            movl %ebp,%esp
            popl %ebp
            ret

Even without being an assembly language wizard, you can easily identify
the floating point instructions generated by the compiler.

And now, let's activate the first level of optimizations in gcc:

    ues4:~/tmp 144> gcc -S -O p5bug.c
    ues4:~/tmp 145> cat p5bug.s
            .file   "p5bug.c"
    gcc2_compiled.:
    ___gnu_compiled_c:
    .text
    LC0:
            .ascii "%f\12\0"
            .align 4
    .globl _main
    _main:
            pushl %ebp
            movl %esp,%ebp
            call ___main
            pushl $0
            pushl $0
            pushl $LC0
            call _printf
            xorl %eax,%eax
            movl %ebp,%esp
            popl %ebp
            ret

Poof!  No floating point instructions in the optimized version.  The
compiler has generated the equivalent of printf("%f\n", 0.0).  

Who said the Pentium chip is buggy?!  :-)

Dan
--
Dan Pop
CERN, CN Division

Mail:  CERN - PPE, Bat. 31 R-004, CH-1211 Geneve 23, Switzerland

 
 
 


Post by Alan L. Cass » Sun, 27 Nov 1994 08:26:05


: Poof!  No floating point instructions in the optimized version.  The
: compiler has generated the equivalent of printf("%f\n", 0.0).  

: Who said the Pentium chip is buggy?!  :-)

The obvious fix!  Don't do any divisions!  :->

 
 
 


Post by Lech Borkows » Sun, 27 Nov 1994 09:35:18





>>It should be zero, of course.  The result of pentium calculation
>>is 256!!!
>>So, I checked that (I use fortran) and result is 256 on my
>>Pentium (90 MHz).  Too bad. But not so bad. I compiled the same
>>program with O2 optimization (f77 -O2 ...) Result this time is 0!
>>(as it should be).
>>The results are independent of single/double precision.
>>Anybody have any idea how is optimization doing this?
>You should post this to comp.sys.intel. I've been following the discussion
>there, although I don't qualify to comment on it, and I've seen
>no one mention that it worked, EVER, on pentiums with the bug. This
>should start a new, interesting, thread.

Optimization in f77 may lead to the correct result
because when optimizing the compiler may replace

x-(x/y)*y

with 0. I think someone already explained this in comp.sys.intel.

Lech Borkowski

 
 
 


Post by Morten J|h » Sun, 27 Nov 1994 22:12:01



>Hot subject about Pentiums these days is recently discovered
>error in floating point unit (article in NYT, Thursday, Nov. 24).
>Example published there is as follows:
>a=4195835
>b=3145727
>a-((a/b)*b) = ?
>It should be zero, of course.  The result of pentium calculation
>is 256!!!
>So, I checked that (I use fortran) and result is 256 on my
>Pentium (90 MHz).  Too bad. But not so bad. I compiled the same
>program with O2 optimization (f77 -O2 ...) Result this time is 0!
>(as it should be).
>The results are independent of single/double precision.
>Anybody have any idea how is optimization doing this?

Yes. The pentium bug is in the FDIV division. The compiler
does a little algebra:

a-((a/b)*b) = a-(a*b/b) = a - a = 0

This is done without performing any divisions - it just uses the
algebraic laws.

Morten Joehnk

--
From the dictionary: recursive, adj: see recursive.

 
 
 


Post by Flemming Christens » Sun, 27 Nov 1994 23:54:50




> : Poof!  No floating point instructions in the optimized version.  The
> : compiler has generated the equivalent of printf("%f\n", 0.0).  

> : Who said the Pentium chip is buggy?!  :-)

> The obvious fix!  Don't do any divisions!  :->

You might actually have stumbled across a possible work-around here. :-)

Consider the following (back to elementary school stuff):

1) Assume a number N exists that will always yield the correct result on
   a pentium when used as a numerator or denominator in a division with any
   other number.

2) Then replace the expression
       c = a/b
   by
       c = (a*N) / (b*N)
   which of course is the same as
       c = (a/N) * (N/b)

This should give the correct results (excluding some small rounding errors)
unless the pentium can't multiply correctly either.

I know that replacing a division by two divisions and a multiplication may slow
things down a bit, but at least the result will be correct :->

Any suggestions for the value of N ?

/fch

 
 
 


Post by Alan L. Cass » Mon, 28 Nov 1994 13:14:10


:        c = (a/N) * (N/b)
: This should give the correct results (excluding some small rounding errors)
: unless the pentium can't multiply correctly either.

: Any suggestions for the value of N ?

"Zero" comes to mind.   :-)

 
 
 


Post by Hussam Eas » Mon, 28 Nov 1994 15:02:50




: :        c = (a/N) * (N/b)
: : This should give the correct results (excluding some small rounding errors)
: : unless the pentium can't multiply correctly either.

: : Any suggestions for the value of N ?

: "Zero" comes to mind.   :-)

How about Pi? :-)(-:

 
 
 


Post by Hallvard Pauls » Mon, 28 Nov 1994 22:11:31


|> Hot subject about Pentiums these days is recently discovered
|> error in floating point unit (article in NYT, Thursday, Nov. 24).
|> Example published there is as follows:
|> a=4195835
|> b=3145727
|>
|> a-((a/b)*b) = ?
|>
|> It should be zero, of course.  The result of pentium calculation
|> is 256!!!
|> So, I checked that (I use fortran) and result is 256 on my
|> Pentium (90 MHz).  Too bad. But not so bad. I compiled the same
|> program with O2 optimization (f77 -O2 ...) Result this time is 0!
|> (as it should be).
|> The results are independent of single/double precision.
|>
|> Anybody have any idea how is optimization doing this?

I tried using a C version of the test program with and
without optimization and it turns out that for any level
of optimization (-O1 to -O6) on this 90 MHz Pentium the
result is the correct 0.0000; without the -O option the
result is 256.000. This is of course running gcc 2.5.8 under
Linux.

I also tried using my old TC version 2.0 optimized for speed,
but the result was wrong all the time. (I'm waiting for my
co-worker to try it under Visual C to see if this has anything
to do with the OS or compiler). Maybe this is yet another
reason to make the switch from DOS to Linux?

(Pentium Power under DOS is kind of silly anyway.)

Hallvard P.

 
 
 


Post by Alan L. Cass » Tue, 29 Nov 1994 00:35:58


: I tried using a C version of the test program with and
: without optimization and it turns out that for any level
: of optimization (-O1 to -O6) on this 90 MHz Pentium the
: result is the correct 0.0000; without the -O option the
: result is 256.000. This is of course running gcc 2.5.8 under
: Linux.
: I also tried using my old TC version 2.0 optimized for speed,
: but the result was wrong all the time. . . .Maybe this is yet another
: reason to make the switch from DOS to Linux?

There may be plenty of reasons that one might wish to switch to Linux from
DOS, but the particular reason given above may not necessarily be one of
them.  The reason is that the optimization by gcc only hides the problem
(which is in the processor, not in gcc).  The problem will still occur in
instances in which gcc cannot optimize the division out of the emitted
object code.  

On the other hand, if the division were done correctly in the first place,
you wouldn't be able to tell the difference between the optimized and
unoptimized code by the numerical results, anyway.

I have heard that gcc may be modified soon to emit code that works around
the Pentium bug.  Most compilers may have to incorporate such a workaround
in the near future.  

Therefore, what may, in fact, be good reasons to switch to Linux are the
facts that gcc is likely to be one of the first compilers to incorporate a
work-around, and that most Linux code is distributed as source code that
can be recompiled to incorporate the work-around without waiting for a bug
fix from each publisher.

 
 
 


Post by Alan L. Cass » Tue, 29 Nov 1994 00:46:41




: > :        c = (a/N) * (N/b)
: > : This should give the correct results (excluding some small rounding errors)
: > : unless the pentium can't multiply correctly either.
: > : Any suggestions for the value of N ?

: > "Zero" comes to mind.   :-)

: Come on now, you can do better than that!
: Zero is not really a "viable" solution (sorry, I couldn't resist). :-)

But you've got to admit that it very neatly avoids the problem of "small
rounding errors!"

--------
"Slimy, yet satisfying!" - Simba, the lion, after eating some bugs

 
 
 


Post by Flemming Christens » Mon, 28 Nov 1994 22:45:23




> :        c = (a/N) * (N/b)
> : This should give the correct results (excluding some small rounding errors)
> : unless the pentium can't multiply correctly either.

> : Any suggestions for the value of N ?

> "Zero" comes to mind.   :-)

Come on now, you can do better than that!
Zero is not really a "viable" solution (sorry, I couldn't resist).
:-)

/fch

 
 
 


Post by R.D. Auchterloun » Wed, 30 Nov 1994 00:25:33


[...]

>Consider the following (back to elementary school stuff):
>1) Assume a number N exists that will always yield the correct result on
>   a pentium when used as a numerator or denominator in a division with any
>   other number.
>2) Then replace the expression
>       c = a/b
>   by
>       c = (a*N) / (b*N)
>   which of course is the same as
>       c = (a/N) * (N/b)

Afraid you've been beaten to it. The MathWorks Inc. have already announced
/ released a pentium fixed binary, using what looks like exactly the same
technique.

>This should give the correct results (excluding some small rounding errors)
>unless the pentium can't multiply correctly either.
>I know that replacing a division by two divisions and a multiplication may slow
>things down a bit, but at least the result will be correct :->

It's not necessary to replace every division - see below.

>Any suggestions for the value of N ?

3/4 - read on...

>/fch

Included below is the post from comp.sys.intel (I've included it as it may have
expired by now at some sites) describing the method.

All credit to MathWorks - this is a pretty quick response time to a bug
which isn't even their fault...

ray


-> Date: Thur Nov 24 22:00:20 EST 1994
-> Subject: Soft/Hardware Workaround for the FDIV bug
-> Newsgroups: comp.sys.intel
->
->         SOFTWARE/HARDWARE WORKAROUND FOR THE FDIV BUG
->
-> At the MathWorks, we have decided to issue a new release of
-> MATLAB which is "Pentium aware".  It incorporates a workaround
-> for the floating point division bug which restores full accuracy
-> without a serious degradation of efficiency.  And, we have
-> decided to post a description of our technique to the Internet
-> so that other people, including other commercial software
-> developers, can make use of it.
->
-> It is not easy to modify a large package like MATLAB to include
-> this change.  We would much prefer to have the compiler or the
-> operating system do the job for us, but this is not yet an
-> option.  The kernel of MATLAB is written in C.  We have replaced
-> several dozen instances of a '/' denoting a floating point
-> division by a function call.  We are now in the process of
-> confirming that we didn't introduce any source code errors
-> while doing this.
->
-> Our initial intention was to use the Pentium hardware FDIV
-> instruction to compute a candidate quotient, check its accuracy,
-> and then, if necessary, employ a standard Newton iterative
-> refinement technique to eliminate any inaccuracy produced
-> by the hardware.  But then we realized that an approach unique
-> to this situation was possible, and more effective.  The FDIV
-> instruction can be used to correct itself!  Here's the code:
->
->      #include <math.h>
->      #define EPS      2.2204460492503131e-16
->      #define REALMIN  2.2250738585072014e-308
->      #define RHO      0.75
->      
->      double fdiv(double x, double y)
->      {
->         int ok;
->         double r,z;
->         z = x/y;
->         r = x - y*z;
->         while (fabs(r) > EPS*fabs(x)+REALMIN) {
->            x = RHO*x;
->            y = RHO*y;
->            z = x/y;
->            r = x - y*z;
->         }
->         return(z);
->      }
->
-> The idea is to use the hardware to divide x by y and produce the
-> candidate quotient, z.  This will be correct in the vast majority
-> of cases, but it is necessary to check.  The test uses the residual,
-> r = x - y*z.  Normally, r will be no larger than roundoff error in x.
-> The constant EPS involved in the test is the distance from 1.0
-> to the next larger floating point number.  For double precision,
-> EPS is 2^(-52).  At first, we forgot to include the constant
-> REALMIN in the test, but this term is required to deal correctly
-> with denormal floating point numbers.
->
-> If the residual fails the test, it indicates that the hardware bug
-> has been encountered.  When this occurs, we simply rescale the
-> numerator and denominator by a factor of 3/4 and repeat the process.
-> The scaling scrambles the bit patterns in x and y so that the second
-> division almost certainly gives a satisfactory result.
->
-> The factor 3/4 is important.  It is the "simplest" factor which
-> alters the bit patterns in the fractions of x and y.  If the last
-> bit in the fraction is zero, the scaling does not introduce any
-> roundoff error.  Furthermore, since 3/4 is less than 1, the
-> scaling cannot overflow.  And, if the operands are so small that
-> scaling by 3/4 would underflow, the original division is done
-> correctly and the scaling is not needed.
->
-> How much does all this cost in execution speed?   We have not yet
-> done any serious timings outside the MATLAB context.  Since the
-> occurrence of the inaccurate results is so rare, any time spent
-> inside the while loop can be ignored.  The FDIV instruction itself
-> takes 30 or more cycles.  To this we must add the time for
-> computing the residual, the time for doing the test and, perhaps
-> the most significant, the time required by the function call.  We
-> guess that this might double the time, but the actual factor will
-> depend upon the surrounding environment.
->
-> If anybody actually does incorporate this approach in their
-> software, we'd like to hear about it.  In particular, we'd
-> appreciate hearing about any timing measurements from other
-> large packages.
->
-> I will be posting an article to the comp.soft-sys.matlab newsgroup
-> describing the Pentium aware MATLAB, and its availability, in
-> more detail.
->
-> Finally, although I have not yet met any of them in person,
-> I would like to acknowledge and thank Thomas Nicely, Tim Coe and
-> Mike Carlton for their contributions to this enterprise.
->
->   -- Cleve Moler
->   Chairman and Chief Scientist
->   The MathWorks, Inc.

->

 
 
 


Post by Mark C. Chu-Carro » Wed, 30 Nov 1994 03:35:44



>Hot subject about Pentiums these days is recently discovered
>error in floating point unit (article in NYT, Thursday, Nov. 24).
>Example published there is as follows:
>a=4195835
>b=3145727

>a-((a/b)*b) = ?

>It should be zero, of course.  The result of pentium calculation
>is 256!!!
>So, I checked that (I use fortran) and resutl is 256 on my
>Pentium (90 MHz).  Too bad. But not so bad. I compiled the same
>program with O2 optimization (f77 -O2 ...) Result this time is 0!
>(as it should be).  
>The results are independent of of single/double precision.

>Anybody have any idea how is optimization doing this?

Sure, it's simple.

There's a standard optimization called constant folding. What it does is
propagate constants through the program, and pre-execute any
expressions that are entirely constant. So that expression is just
being optimized away by the compiler.

The reason that it doesn't result in a bug is that the compiler
writers didn't waste time on figuring out whether or not there's an
FPU available - they just do floats by emulation (or whatever). So it
doesn't get run on the buggy pentium fpu.

        <MC>

--

 
 
 
