> > I am working on code to implement IEEE 754-compliant floating point in
> > software (specifically in Java, but that does not matter). I am
> > using the "testfloat" and "softfloat" programs to test my code and I
> > have come up to a few examples that I cannot explain.
> > Assume 64-bit double format is in use and consider the operation
> > (0.00...01 * 2^-1023) + (0.11...11 * 2^-1023). This is the smallest
> > possible positive number added to the largest possible subnormal
> > number. The result should be (1.00...00 * 2^-1023). The problem here
> > is that an exponent of -1023 indicates a subnormal number, which must
> > have a zero in the most significant bit of the significand. Hence my
> > code outputs zero for a result and does not set any flags. The "real"
> > answer, given by "testfloat" as well as an IA-32 and a Sun system, is
> > the number (1.00...00 * 2^-1022) with no exception flags thrown. It
> > seems they round the number up so that it is normalized, but if that
> > were the case, should not an inexact exception be thrown?
> > Am I missing something completely obvious here, or is this some kind of
> > flaw in the IEEE 754 standard?
> IEEE 754 has this to say about "inexact":
> "If the rounded result of an operation is not exact or if it overflows
> without an overflow trap, then the inexact exception shall be signaled."
> In this particular case, there is no overflow, and the result is exact,
> i.e. the result does not differ from the result one would have gotten
> using arithmetic employing infinite precision and unbounded exponent range.
> I assume you meant to write "The result should be (1.00...00 * 2^-1022)"
> not "... (1.00...00 * 2^-1023)" ? The expected sum of smallest denormal
> and largest denormal is the smallest normal, as you observed on the test
No, I did not mean to write this, and this is the exact source of my
confusion. Why is the exponent -1022 and not -1023? The problem,
simplified, is essentially the following, is it not?
  0.111 * 2^0 = 0.875
+ 0.001 * 2^0 = 0.125
  -----------   -----
  1.000 * 2^0 = 1.000
The exponents in the operands are equal going into the operation and
the result does not carry into the next highest place, so no shifting
or changing of exponents should occur. If some shifting or rounding
is necessary to make the number representable in a given format, then
why isn't the number considered "inexact"?
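
For reference, the sum in question can be checked bit by bit in Java. This is only a minimal sketch: the class name SubnormalSum and its methods are made up for illustration, and it relies on the host's native double arithmetic rather than a software implementation. It builds the two operands from their raw bit patterns, adds them, and compares the result against an arbitrary-precision sum computed with BigDecimal.

```java
import java.math.BigDecimal;

public class SubnormalSum {
    // Smallest positive subnormal: exponent field 0, fraction 0...01.
    static final double SMALLEST =
        Double.longBitsToDouble(0x0000000000000001L);
    // Largest subnormal: exponent field 0, fraction 1...11.
    static final double LARGEST_SUBNORMAL =
        Double.longBitsToDouble(0x000FFFFFFFFFFFFFL);

    // The sum as computed by the host's double arithmetic.
    static double sum() {
        return SMALLEST + LARGEST_SUBNORMAL;
    }

    // True when the double sum equals the sum computed without rounding.
    // new BigDecimal(double) converts the binary value exactly.
    static boolean sumIsExact() {
        BigDecimal exact =
            new BigDecimal(SMALLEST).add(new BigDecimal(LARGEST_SUBNORMAL));
        return exact.compareTo(new BigDecimal(sum())) == 0;
    }

    public static void main(String[] args) {
        // Raw bits of the result (leading zeros are not printed).
        System.out.println(Long.toHexString(Double.doubleToLongBits(sum())));
        // Compare with the smallest normal number.
        System.out.println(sum() == Double.MIN_NORMAL);
        // Check whether any rounding took place.
        System.out.println(sumIsExact());
    }
}
```

If the sum comes out as the bit pattern with exponent field 1 and an all-zero fraction, and sumIsExact() returns true, then no rounding occurred. That would be consistent with subnormals (exponent field 0) being scaled by 2^-1022 rather than 2^-1023, so that the largest subnormal and the smallest normal are adjacent values on the same scale.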