IEEE FP and Linux

Post by Peter Dalgaard BS » Thu, 17 Oct 1996 04:00:00




> [...] I'm porting a unix application from
> Solaris to Linux. Everything works fine (the application is a superset
> of prolog) but for one thing. Every single time I try to do a floating
> point operation it crashes with an exception, the operation being
> "denormalized". This property of floating point numbers does not appear
> on Solaris (SPARC architecture). I can find the flags to mask this
> IEEE exception, but when I do so, the program does not work anymore
> (it looks like comparisons of floating point values fail miserably, returning wrong
> results). Anyway, the application is ported to win NT and runs
> perfectly, so it is not a hardware specific problem. So my questions:
> 1) What does it mean to be denormalized ?
> 2) How can I fix a "denormalized floating point" ?
> 3) What mask am I supposed  to use ?
> 4) Are there special compilation flags that I should pass to gcc to get
> a correct behavior ?
> 5) What is going on ?

Hm. Normalization is just the process of (roughly speaking)
representing numbers as 0.234*10^6 rather than 0.000234*10^9 or
0.0234*10^7. In binary hardware, this corresponds to making sure the
leftmost bit of the mantissa is always 1, which can then be assumed,
giving an extra bit of accuracy. However, operations on normalized
numbers can give a denormalized result, and continuing calculations
with a denormalized operand without normalizing it will get you in
trouble.

The FPU normally takes care of that, although there is a special
situation at the low end of the representable range (running out of
bits for the exponent).

My first guess would be that you have a stray pointer or something,
actually overwriting the values you need to compute on.

Try isolating the calculations that go wrong, put in some test
printouts and show us exactly what blows up. Also include
kernel/compiler/libc/libm version.

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N  
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918

 
 
 

IEEE FP and Linux

Post by Rene van Paasse » Thu, 17 Oct 1996 04:00:00



> Hi guys!

> I have a little problem, I already posted a while ago in this group but
> never got any answers. Here it is: I'm porting a unix application from
> Solaris to Linux. Everything works fine (the application is a superset
> of prolog) but for one thing. Every single time I try to do a floating
> point operation it crashes with an exception, the operation being
> "denormalized". This property of floating point numbers does not appear
> on Solaris (SPARC architecture). I can find the flags to mask this
> IEEE exception, but when I do so, the program does not work anymore
> (it looks like comparisons of floating point values fail miserably, returning wrong
> results). Anyway, the application is ported to win NT and runs
> perfectly, so it is not a hardware specific problem. So my questions:
> 1) What does it mean to be denormalized ?
> 2) How can I fix a "denormalized floating point" ?
> 3) What mask am I supposed  to use ?
> 4) Are there special compilation flags that I should pass to gcc to get
> a correct behavior ?
> 5) What is going on ?

Hi Laurent,

That's little to go on, but here are a few steps:

        1 compile with gcc -g , to get debugging information
        2 run with gdb <your-program-here>
          and just type 'run' at the gdb prompt.

I did plenty of compilation on SunOS, both with cc and gcc, and it
always surprised me how permissive cc really is. To find a lot of errors
and mistakes at the compilation stage, just try: gcc -Wall

Floating point numbers have one part for the mantissa, a number like
0.4553221232 or something (in a computer it is of course in binary), and
one part for the exponent, +12 or something like that. Let's do an example
in decimal for the sake of argument. If you had ten places for the mantissa,
then a number like 0.0000045532*10^17 would be the denormalized version of
0.4553221232*10^12. You see you lose precision.

Of course, your floating point unit is not making up denormalized fp
numbers for fun. My guess is that your denormalized fp number comes from
using an uninitialized variable (the easiest case to check with gcc),
accidentally overwriting (part of) your floating point variable, or using
something that is not a floating point number at all.

In other words, maybe you just encountered a bug in your program.

Hope this helps,

        Rene

--
Paper towels for aliens (min. 3 arms):         LOADING
----------------------------------|   pull out both knobs and turn

Department of Automation          |   Kimberly-Clark Ltd
Technical University of Denmark   |   disposable paper products

 
 
 

IEEE FP and Linux

Post by Laurent Miche » Thu, 17 Oct 1996 04:00:00


Hi guys!

I have a little problem, I already posted a while ago in this group but
never got any answers. Here it is: I'm porting a unix application from
Solaris to Linux. Everything works fine (the application is a superset
of prolog) but for one thing. Every single time I try to do a floating
point operation it crashes with an exception, the operation being
"denormalized". This property of floating point numbers does not appear
on Solaris (SPARC architecture). I can find the flags to mask this
IEEE exception, but when I do so, the program does not work anymore
(it looks like comparisons of floating point values fail miserably, returning wrong
results). Anyway, the application is ported to win NT and runs
perfectly, so it is not a hardware specific problem. So my questions:
1) What does it mean to be denormalized ?
2) How can I fix a "denormalized floating point" ?
3) What mask am I supposed  to use ?
4) Are there special compilation flags that I should pass to gcc to get
a correct behavior ?
5) What is going on ?

Any help, pointers, comments, or references would be greatly appreciated.
Note that Linux is the only unix on which it does not run.... and the point
of this porting is to show which OSes are reasonable for getting some real
work done. It is really sad that the NT port goes through while Linux
stays behind. I'd really like to get it working. For the record, I'm
running gcc 2.7.2 on a Linux 2.0.22 system with the latest stable libc
and other libs (libc is 3.5.something). I upgraded my libc a couple of
times with no success. I read about glibc, but I've also seen posts from
people having difficulty getting it to compile.....

--

        - Laurent

 
 
 

IEEE FP and Linux

Post by Paul Caprio » Thu, 17 Oct 1996 04:00:00




Quote:>I have a little problem, I already posted a while ago in this group but
>never got any answers. Here it is: I'm porting a unix application from
>Solaris to Linux. Everything works fine (the application is a superset
>of prolog) but for one thing. Every single time I try to do a floating
>point operation it crashes with an exception, the operation being
>"denormalized".
>1) What does it mean to be denormalized ?

This is also known as gradual underflow.  The smallest normalized
IEEE double precision number is (from memory) 1.0*2^(-1022).
This is normalized since the significand (mantissa) has a one in
front of the binary point.
Denormalized numbers allow you to represent values smaller than this,
although with a loss of precision.  Since there are 52 bits for the
mantissa, the smallest denormalized number is 2^(-52)*2^(-1022).
That is, (in binary) 0.000.....001 * 2^(-1022).
This number has only one significant digit, so the relative round-off
error ain't good.  (This was a controversial part of the IEEE standard.)

Quote:>4) Are there special compilation flags that I should pass to gcc to get
>a correct behavior ?

Intel chips use 80-bit floating point registers, so you might be getting
more precision on those chips than on other architectures.
You can ask gcc to refrain from optimizing away memory stores using the
compile flag -ffloat-store.  This makes sure that code like c=a+b
really does have its result stored in memory (c) rather than just
keeping the result on the fpu.  Since the fpu is 80 bits wide and a
double c is 64 bits, this can make a difference.

Also, you can try

#ifdef __linux__
#include <fpu_control.h>
#endif

and then

#ifdef __linux__
 __setfpucw( (unsigned short)((_FPU_DEFAULT&0xF0FF) | _FPU_DOUBLE) );
#endif

to set the fpu control word to the precision you want.  The
default is 80-bit (double-extended).  Change _FPU_DOUBLE to _FPU_SINGLE
if you want single precision.

The above is for intel linux.  I don't know how to do this for DEC or
Sparc linux.
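For reference, later glibc versions replaced __setfpucw with macros; here is a hedged sketch of the equivalent, assuming x86 Linux and the _FPU_GETCW/_FPU_SETCW macros from <fpu_control.h> (check your own headers):

```c
/* Sketch (x86 Linux, newer glibc): set the x87 rounding precision to
 * 64-bit doubles with the _FPU_GETCW/_FPU_SETCW macros instead of the
 * older __setfpucw() call shown above. */
#include <fpu_control.h>

void set_double_precision(void)
{
    fpu_control_t cw;
    _FPU_GETCW(cw);                            /* read the current control word */
    cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;  /* clear precision bits, set 64-bit */
    _FPU_SETCW(cw);                            /* write it back */
}
```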

I have no idea what the defaults are under NT.

--Paul

P.S. References:

"What every computer scientist should know about floating point arithhmetic"
by David Goldberg, ACM computing surveys, vol 23 no 1, march 1991

Computer Architecture: A Quantitative Approach by David Patterson and
John Hennessy, Morgan Kaufmann Publishers.

 
 
 

IEEE FP and Linux

Post by Joe Pfeiffer » Thu, 17 Oct 1996 04:00:00


   I have a little problem, I already posted a while ago in this group but
   never got any answers. Here it is: I'm porting a unix application from
   Solaris to Linux. Everything works fine (the application is a superset
   of prolog) but for one thing. Every single time I try to do a floating
   point operation it crashes with an exception, the operation being
   "denormalized". This property of floating point numbers does not appear
   on Solaris (SPARC architecture). I can find the flags to mask this
   IEEE exception, but when I do so, the program does not work anymore
   (it looks like comparisons of floating point values fail miserably, returning wrong
   results). Anyway, the application is ported to win NT and runs

Somebody else already posted about what ``normalized'' means.  Both
the Sparc and the Intel use the same floating point representation
(the IEEE standard -- I'm sure there are differences that can be drawn
between the two implementations, but they're certainly close enough
not to worry about), and both require normalized numbers.  The only
time this is not the case is for very, very small numbers.  It turns
out that most integers in common use have the same bit patterns as
denormalized floating point numbers (note that I am not saying
integers *are* denormalized fp numbers, or something else ridiculous
like that; I'm saying that there is a coincidence between the bit
patterns for reasonably small integers and for really, really small
floating point numbers).  So the most likely problems are (1) a bad
pointer overwriting data (as somebody else pointed out) or (2) trying
to do floating point operations on integers without casting.

   stays behind. I'd really like to get it working. For the record, I'm
   running gcc 2.7.2 on a Linux 2.0.22 system with the latest stable libc
   and other libs (libc is 3.5.something) I upgraded a couple of times my
   libc with no success. I read about glibc but I've also seen posts from
   people having difficulty getting it to compile.....

Are you using gcc for the other systems?
--
Joseph J. Pfeiffer, Jr., Ph.D.       Phone -- (505) 646-1605
Assistant Professor                  FAX   -- (505) 646-1002
Department of Computer Science      
New Mexico State University          RIP Seymour Cray 1925-1996
Las Cruces, NM 88003                 The passing of a giant.
http://www.cs.nmsu.edu/~pfeiffer

 
 
 

IEEE FP and Linux

Post by Tormod Wien » Fri, 18 Oct 1996 04:00:00



>Hi guys!
>I have a little problem, I already posted a while ago in this group but
>never got any answers. Here it is: I'm porting a unix application from
>Solaris to Linux. Everything works fine (the application is a superset
>of prolog) but for one thing. Every single time I try to do a floating
>point operation it crashes with an exception, the operation being
>"denormalized". This property of floating point numbers does not appear
>on Solaris (SPARC architecture). I can find the flags to mask this
>IEEE exception, but when I do so, the program does not work anymore
>(it looks like comparisons of floating point values fail miserably, returning wrong
>results). Anyway, the application is ported to win NT and runs
>perfectly, so it is not a hardware specific problem.

You are not telling us what hardware platform you are running on, but
I'll assume it is an Intel-based PC, and I'll try to give you some
answers which will hopefully be helpful in solving your problems.

Quote:> So my questions:
>1) What does it mean to be denormalized ?

I've got an Intel microprocessor handbook specifying, among other things,
the Intel math coprocessor. According to this, the denormal exception
is caused by one of the operands being denormalized, i.e. "it has the
smallest exponent but a nonzero significand". By the way, are you
using the hardware floating point unit, or some kind of x87 emulation
library?
Quote:>2) How can I fix a "denormalized floating point" ?

Again, according to the book: the x87 coprocessor "automatically
normalizes denormal operands when the denormal exception is masked".
That is, when unmasked, the program's denormal "exception handler"
will be called upon the exception -- in your case, just terminating the
program and reporting the operation as "denormalized". There are obviously
some problems with the interpretation of the operand formats.  In the
Intel world the MSB is the highest addressed byte, and the format of a
floating point value should be (single precision):

|31 | 30       23 | 22                   0 |
| S |  exponent   |      significand       |

(S = sign bit of the number)
The "normalized" exponent is biased with a value of 127 (0x7F).

But all this should have been taken care of by the compiler ........
hmmm......

Quote:>3) What mask am I supposed  to use ?
......
>4) Are there special compilation flags that I should pass to gcc to get
>a correct behavior ?

I'm not so familiar with gcc that I can give you a proper answer
here......
Quote:>5) What is going on ?

........

What about running a debugger on this (gdb?). Check the formats and
values of your real numbers.

I am not sure if this is of any help to you; perhaps you can get
better answers from other, more experienced Linux users. I am quite
new to Linux, but I have worked several years with Intel kinds of
stuff, and am pretty familiar with their processor family.
Good luck.

Tormod Wien

 
 
 

IEEE FP and Linux

Post by Andrey V Khavryutchenko » Mon, 21 Oct 1996 04:00:00




[skip]
> >2) How can I fix a "denormalized floating point" ?
> Again, according to the book: the x87 coprocessor "automatically
> normalizes denormal operands when the denormal exception is masked".
> That is, when unmasked, the programs denormalize "exception handler"
> will be called upon the exception, in your case, just terminating the
> program, saying the operation being "denormalized". There is obviously

           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Is this the way the Linux (GNU) libc does it?

SY
--
Andrey V Khavryutchenko

Interests: Computational Chemistry, Nanotech, OOA&OOP, The Net

Quote of the day:
  If you make people think they're thinking, they'll love you; but if you
  really make them think they'll hate you.

 
 
 

IEEE FP and Linux

Just a short post to thank you all for your advice and opinions. The
problem is solved, thanks to Hans B. Kuhlmann. The solution was fairly
simple..... A compilation flag was needed in order to turn on the
correct compilation of the IEEE-dependent instructions. This option is
-mieee-fp.

I thought that some of you might be interested, because this option is an
"undocumented" feature of gcc! (The incorrect behavior that results if you
do *not* use this option is potentially incorrect floating point
comparisons when the operands tend to be denormalized!) It works fine,
but it is not in the man pages!!! Actually, Hans gave me this http
address that documents the feature:

http://www.cygnus.com/library/gcc/usegcc_toc.html

Here is what Hans told me after I told him that he solved the problem:

Again, thanks for your help.

--

        - Laurent

2. nfs daemon error!

3. How to remove Apache/FP 1.2.5 and FP 3.0 extensions

4. tape cartridge formatting ?

5. Matrox Mystique ands X.

6. lmbench on 2 Linux and 3 Sun boxes

7. Front Page Extensions for Linux: How to get a fp web published?

8. 10 GB fixed disk: not all accessible?

9. Apache, Linux and FP 2K ext server

10. How do I install FP for Apache on Linux

11. Help -- Installing FP on Linux and Apache 1.3.3

12. DU <--> Linux FP exceptions with DU binaries

13. poor fp performance on alpha/linux/gcc