Hi,

I'm working on my dissertation and I have come across a crazy quirk
in MATLAB that I was willing to write off before but has happened
twice now and it's REALLY starting to make me angry.

I am trying to group some row vectors based on the value in their
first column. Simple enough, right? Here's the snippet of code I'm
working with:

> [ratio cutoff{cut}]
> if (ratio <= cutoff{cut}) & (ratio > cutoff{cut} - 0.1)
> disp('OK!')

Now, when I run the program, I get the following outputs:

ans =
0.3000 0.3000

ans =
0.3000 0.3000

ans =
0.2000 0.3000

OK!

ans =
0.4000 0.3000

and so on. The correct value of ratio NEVER sets off the condition,
but the incorrect value always does. The plot thickens:

ans =
0.2000 0.2000

OK!

ans =
0.4000 0.4000

OK!

I have tried everything I can think of, and as ridiculous as it may
seem, it appears to me that some MATLAB functions are actually
incapable of recognizing the number 0.3 correctly. As I mentioned, I
had a similar problem about a month ago, with a very similar line of
code and again, the problem value was 0.3. I have messed around with
some other commands using that value, like adding and subtracting it,
using find to seek out that value in a matrix, and it works just
fine.

Please let me know. I haven't seen it listed in any troubleshooting
notices or bug reports. I am not an idiot, I have been programming
MATLAB for about two years so I'm pretty confident in my abilities,
and I am positive I have not made a mistake. I am becoming EXTREMELY
angry, though, like I don't have enough else to worry about without a
piece of mathematical software that CAN'T RECOGNIZE A NUMBER!!
Maybe it's even a processor problem? Any help would be greatly
appreciated. Thanks.

Mark

my guess, a floating point representation problem:

it very well may be that you create your vals differently, e.g.,

(.1:.1:.4)
% ans =
% 0.1 0.2 0.3 0.4
(.1:.1:.4)-.3
% ans =
% -0.2 -0.1 5.5511e-017 0.1

look at <peter boettcher>'s faq:

<http://www.mit.edu/~pwb/cssm/matlab-faq.html#eps>
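A minimal sketch of the representation issue, assuming that 0.1 + 0.2 stands in for however the ratio was actually computed (the variable names are illustrative only):

```matlab
% 0.1, 0.2 and 0.3 have no exact binary representation, so two routes
% to "0.3" can land on different doubles.
a = 0.1 + 0.2;   % computed value
b = 0.3;         % literal value

fprintf('%.20f\n', a);   % more digits than the default display shows
fprintf('%.20f\n', b);

isSame = (a == b);       % false: the two doubles differ in the last bit
gap    = a - b;          % about 5.55e-17, the same size as shown above
fprintf('equal: %d, gap: %g\n', isSame, gap);
```

Both values print as 0.3000 in the default short format, which is why the mismatch is invisible in the original output.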

us

> I'm working on my dissertation and I have come across a crazy quirk
> in MATLAB that I was willing to write off before but has happened
> twice now and it's REALLY starting to make me angry.

> my guess, a floating point representation problem:

> it very well may be that you create your vals differently, e.g.,

>      (.1:.1:.4)
> % ans =
> % 0.1 0.2 0.3 0.4
>      (.1:.1:.4)-.3
> % ans =
> % -0.2 -0.1 5.5511e-017 0.1

> look at <peter boettcher>'s faq:

>  <http://www.mit.edu/~pwb/cssm/matlab-faq.html#eps>

Let me add that your specific problem is that you are relying on
equality tests for floating point numbers (the problems happen right
on the = of <=).  Just bear in mind that any floating point number
may have a +/- small amount of error, and write your program
accordingly.  For instance, instead of testing (a==b), test
(abs(a-b)<tol).
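Here is a rough sketch of how that advice might apply to the bin test from the original post. The helper inBin, the bin width of 0.1 and the tolerance of 1e-12 are assumptions made for illustration; they are not taken from the poster's program.

```matlab
% Hypothetical helper (not from the original code): is ratio in the bin
% (cutoff - width, cutoff], allowing tol of slack at both edges?
inBin = @(ratio, cutoff, width, tol) ...
    (ratio <= cutoff + tol) && (ratio > cutoff - width + tol);

tol = 1e-12;            % assumed tolerance; choose one that suits your data

r1 = 0.1 + 0.2;         % a "0.3" that is slightly above the literal 0.3
r2 = 0.2;               % belongs to the bin below

exact1 = (r1 <= 0.3) && (r1 > 0.3 - 0.1);   % false: misses its own bin
exact2 = (r2 <= 0.3) && (r2 > 0.3 - 0.1);   % true: 0.3 - 0.1 is just below 0.2

fuzzy1 = inBin(r1, 0.3, 0.1, tol);          % true
fuzzy2 = inBin(r2, 0.3, 0.1, tol);          % false
```

The exact tests reproduce both symptoms (the right value rejected, the wrong value accepted), while the tolerant version sorts the two values as intended.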

--

MIT Lincoln Laboratory
MATLAB FAQ: http://www.mit.edu/~pwb/cssm/

> ..snip.. posting about fp tests ...

Other replies have answered the basic question so I'll only
demonstrate--

>> format long
>> x=[.1:.1:.5];
>> x(3)
ans =
3.000000000000000e-001
>> x(3)==0.3
ans =
0
>> x(3)-0.3
ans =
5.551115123125782e-017

HTH...

Which, however, does raise an interesting question--what does Matlab do
internally on find() -- is there a tolerance?

>> find(x==0.3)
ans =
[]
>> y=x(3)
y =
3.000000000000000e-001
>> find(x==y)
ans =
3

So, the answer is also "no" there.

Which raises the question, <should> find() use a tolerance?

>> ..snip.. posting about fp tests ...

> Other replies have answered the basic question so I'll only
> demonstrate--

*snip demonstration*

> Which, however, does raise an interesting question--what does Matlab
> do internally on find() -- is there a tolerance?

> ? find(x==0.3)
> ans =
>      []
> ? y=x(3)
> y =
>     3.000000000000000e-001
> ? find(x==y)
> ans =
>      3

> So, the answer is also "no" there.

> Which raises the question, <should> find() use a tolerance?

It wouldn't help.  The == operator returns an array containing either 0 or 1
and FIND operates on that array.  Unless the tolerance is greater than 1,
adding a tolerance to FIND would do nothing.
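To illustrate the point, a small sketch: the tolerance has to go into the expression that builds the logical array, and FIND then works unchanged. The tolerance value is an assumption chosen for this example.

```matlab
x = (1:5) * 0.1;            % x(3) is 3*0.1, which is not the double 0.3
tol = 1e-12;                % assumed tolerance for this sketch

exactMask = (x == 0.3);     % logical array; all false here
exactIdx  = find(exactMask);            % empty

fuzzyMask = abs(x - 0.3) < tol;         % logical array with one true entry
fuzzyIdx  = find(fuzzyMask);            % 3
```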

Now, follow-up questions:
Should "==" have a tolerance or should it remain the way it is, requiring
exact equality?
If it should have a tolerance, should that tolerance be able to be
modified by users?
If it should have a tolerance, what should that tolerance be by default?
Alternately, would an "approximately equal" operator be useful?  [It's
easy to do using ABS and <= ... but would you use it if something like this

I'm quite curious to see how people answer these questions.  I know how
various people here at The MathWorks would answer them, and I'd like to

--
Steve Lord

>   Should "==" have a tolerance or should it remain the way it is, requiring
> exact equality?
>   If it should have a tolerance, should that tolerance be able to be
> modified by users?
>   If it should have a tolerance, what should that tolerance be by default?
>   Alternately, would an "approximately equal" operator be useful?  [It's
> easy to do using ABS and <= ... but would you use it if something like this

== should always mean exactly equal.  One person's definition of
sort-of equal may not be another's.  Mucking with ==, especially with
user parameters, sounds like a recipe for disaster.

'Approximately equal' sounds like a better idea.  But still,
newcomers to the world of floating point would still use == until
finding out why it doesn't work, so it would just be a shorthand for
those who already understand the limitations of floating point.

--

MIT Lincoln Laboratory
MATLAB FAQ: http://www.mit.edu/~pwb/cssm/

> Hi,

> I'm working on my dissertation and I have come across a crazy quirk
> in MATLAB that I was willing to write off before but has happened
> twice now and it's REALLY starting to make me angry.

> I am trying to group some row vectors based on the value in their
> first column. Simple enough, right? Here's the snippet of code I'm
> working with:

> > [ratio cutoff{cut}]
> > if (ratio <= cutoff{cut}) & (ratio > cutoff{cut} - 0.1)
> > disp('OK!')

> now, when I run the program, i get the following outputs:

> ans =
> 0.3000 0.3000

You're displaying numbers which are about 16-digits long
internally, but only looking at the values rounded to
4 digits. You don't know if the first value is really
less than, equal, or greater than the second. Try
executing the command "format long" then repeating
this test.

But the real problem is that floating point arithmetic
causes roundoff errors down in the last couple of bits.
You need to allow for a couple of bits of error in
your comparison. It is a standard rule of thumb that
floating-point comparisons should always allow a little
bit of "fuzziness". Try changing your test to this:

if (ratio <= cutoff{cut} + 1e-14) & (ratio > cutoff{cut} - 0.1)

My choice of 1e-14 may be conservative. It depends on how
many calculations went into these things. One floating
point calculation (a multiply or add) might introduce
an error of one bit (1 part in about 2^53). But those
errors accumulate over a sequence of calculations.
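A short sketch of the accumulation effect, assuming nothing more than ten repeated additions of 0.1:

```matlab
s = 0;
for k = 1:10
    s = s + 0.1;        % each addition rounds to the nearest double
end

exactHit = (s == 1);    % false: the rounding errors did not cancel
err      = abs(s - 1);  % about 1.1e-16 after only ten additions
fprintf('s == 1: %d, error: %g, eps: %g\n', exactHit, err, eps);

nearHit = err < 1e-14;  % true with a tolerance of 1e-14
```

Here eps is the spacing of doubles just above 1, that is 2^-52, so a single rounding contributes at most about half of that in relative terms.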

- Randy

i fully support peter's view.

don't mess with such basics - after all, it is just comparing
bit-patterns (all the way down) - and it is the responsibility of any
user to know/understand which patterns he/she puts up for comparison.

us

> == should always mean exactly equal. One person's definition of
> sort-of equal may not be another's. Mucking with ==, especially
> with user parameters, sounds like a recipe for disaster.

>Now, follow-up questions:
>  Should "==" have a tolerance or should it remain the way it is, requiring
>exact equality?

NO!  LEAVE "==" ALONE!  (Yes, I mean to shout.)

>  Alternately, would an "approximately equal" operator be useful?  [It's
>easy to do using ABS and <= ... but would you use it if something like this

If the tolerance is some system parameter then don't do this.  It'll
just be confusing.  The tolerance must be explicit every time.  This
means that the "approximately equal" function must have three operands
and so it must be a function and not an operator.

I'd say leave this whole issue the way it is.  We're just going to have
to educate people as to how floating point numbers work.  There is no
substitute.
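For what it's worth, a three-operand comparison of that kind could look something like the sketch below. The name approxeq is made up for this example and is not an existing MATLAB function.

```matlab
% Hypothetical three-operand comparison; the name approxeq is made up here.
approxeq = @(a, b, tol) abs(a - b) <= tol;

x = 0.1:0.1:0.5;
hits = approxeq(x, 0.3, 1e-12);     % tolerance stated at the call site
idx  = find(hits);                  % index of the element close to 0.3
```

Because the tolerance appears in every call, a reader never has to hunt for a hidden system setting to know what "approximately" means.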

--
Doug Schwarz
Eastman Kodak Company

> Hi,

> I'm working on my dissertation and I have come across a crazy quirk
> in MATLAB that I was willing to write off before but has happened
> twice now and it's REALLY starting to make me angry.
[snip]
> please let me know. I haven't seen it listed in any troubleshooting
> notices or bug reports. I am not an idiot, I have been programming
> MATLAB for about two years so I'm pretty confident in my abilities,
> and I am positive I have not made a mistake. I am becoming EXTREMELY
> angry, though, like I don't have enough else to worry about without a
> piece of mathematical software that CAN'T RECOGNIZE A NUMBER!!
> Maybe it's even a processor problem? Any help would be greatly
> appreciated. Thanks.

Just to add one final comment: It is not Matlab that is
"at fault", nor is it a "processor problem". This is a fundamental
problem with using computers to do real arithmetic. Unless you
do symbolic calculation, numbers aren't exact. You must assume
that values can have errors down in the last few bits. You won't
find such stuff covered in bug reports. You *will* find it covered
in chapter 1 of any numerical methods text.

- Randy

> > ..snip.. posting about fp tests ...

> Other replies have answered the basic question so I'll only
> demonstrate--

> ? format long
> ? x=[.1:.1:.5];
> ? x(3)
> ans =
>     3.000000000000000e-001
> ? x(3)==0.3
> ans =
>      0
> ? x(3)-0.3
> ans =
>     5.551115123125782e-017
> ?

> HTH...

> Which, however, does raise an interesting question--what does Matlab do
> internally on find() -- is there a tolerance?

> ? find(x==0.3)
> ans =
>      []
> ? y=x(3)
> y =
>     3.000000000000000e-001
> ? find(x==y)
> ans =
>      3

> So, the answer is also "no" there.

> Which raises the question, <should> find() use a tolerance?

Find is binary. The == operator returns a logical array,
which only has two possible values. So I'd say the proper
question is, should "==" use a tolerance?

APL often comes up in such discussions, and as I recall
the comparison operator did indeed come with a tolerance
called something like FUZZ which could be altered by
the user (I forget how). It seems that if you want this
"==".

The OP's problem wouldn't be solved by this, since he was
doing a "<=" test. You'd need to similarly fuzzify >=
and <= for doubles. I don't know if I'd want that without
my knowledge.

- Randy

> >> ..snip.. posting about fp tests ...

> > Other replies have answered the basic question so I'll only
> > demonstrate--

> *snip demonstration*

> > Which, however, does raise an interesting question--what does Matlab
> > do internally on find() -- is there a tolerance?

> > ? find(x==0.3)
> > ans =
> >      []
> > ? y=x(3)
> > y =
> >     3.000000000000000e-001
> > ? find(x==y)
> > ans =
> >      3

> > So, the answer is also "no" there.

> > Which raises the question, <should> find() use a tolerance?

> It wouldn't help.  The == operator returns an array containing either 0 or 1
> and FIND operates on that array.  Unless the tolerance is greater than 1,
> adding a tolerance to FIND would do nothing.

Yes, I wrote too quickly--I realize it is "==" that does the comparison,
I just carried over the demo w/ find() too far...

> Now, follow-up questions:
>   Should "==" have a tolerance or should it remain the way it is, requiring
> exact equality?
>   If it should have a tolerance, should that tolerance be able to be
> modified by users?
>   If it should have a tolerance, what should that tolerance be by default?
>   Alternately, would an "approximately equal" operator be useful?  [It's
> easy to do using ABS and <= ... but would you use it if something like this

> I'm quite curious to see how people answer these questions.  I know how
> various people here at The MathWorks would answer them, and I'd like to

I definitely don't think "==" should be modified underneath, either,
certainly not transparently.

I'd probably vote for something like (or similar to) the F90/95
intrinsic NEAREST():

Elemental Intrinsic Function (Generic): Returns the nearest different
number (representable on the processor) in a given direction.

result = NEAREST (x, s)

x
(Input) Must be of type real.

s
(Input) Must be of type real and nonzero.

Results:

The result type is the same as x. The result has a value equal to the
machine representable number that is different from and nearest to x, in
the direction of infinity, with the same sign as s.
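In MATLAB, something close to NEAREST can be sketched with eps(x), which returns the spacing of doubles at x (this form of eps assumes a release that accepts an argument). The name nearest_sketch is hypothetical, and the result is only the true neighbour when the step does not move toward zero from an exact power of two, where the spacing halves.

```matlab
% Hypothetical stand-in for NEAREST(x, s): step one spacing in the
% direction given by the sign of s. Not exact at powers of two when
% stepping toward zero, where the true neighbour is only eps(x)/2 away.
nearest_sketch = @(x, s) x + sign(s) .* eps(x);

up   = nearest_sketch(0.3,  1);   % next double above 0.3
down = nearest_sketch(0.3, -1);   % next double below 0.3

fprintf('%.20f\n%.20f\n%.20f\n', down, 0.3, up);
isNeighbour = (up == 0.1 + 0.2);  % true: 0.1 + 0.2 is the double right after 0.3
```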

Hi,

> The OP's problem wouldn't be solved by this, since he was
> doing a "<=" test. You'd need to similarly fuzzify >=
> and <= for doubles. I don't know if I'd want that without
> my knowledge.

> - Randy

I agree with all of you that vote for leaving "==" unchanged. But I
also find I have to think twice on many similar occasions. Maybe
Mathworks/CSSM could come up with an efficient and well-rounded
function. Perhaps something like

function yesno = fuzzcomp(a, b, op, tol)

if nargin < 2
    error('FUZZCOMP requires at least two arguments');
elseif nargin < 3
    op = '==';
    tol = ......;   % default tolerance still to be decided
elseif nargin < 4
    tol = ......;   % default tolerance still to be decided
end % if

if size(a,2) == 1 && size(b,2) == 2
    % b holds [lower upper] limits: test whether a lies between them
    if isempty(strfind(op, '='))
        yesno = fuzzcomp(a, b(:,1), '>', tol) & ...
                fuzzcomp(a, b(:,2), '<', tol);
    else
        yesno = fuzzcomp(a, b(:,1), '>=', tol) & ...
                fuzzcomp(a, b(:,2), '<=', tol);
    end % if
else
    switch op
        case '=='
            yesno = abs(a-b) < tol;
        case '<='
            yesno = a < b + tol;
        case '>='
            yesno = a > b - tol;
        case '<'
            yesno = a < b - tol;
        case '>'
            yesno = a > b + tol;
        case '~='
            yesno = ~(abs(a-b) < tol);   % parentheses needed: ~ binds tighter than <
        otherwise
            error('FUZZCOMP: unknown relational operator');
    end % switch
end % if

This has not been tested at all, and was not considered in much
detail; it is rather an idea with some 'metacode' as an illustration.
Anyway, if there is something to this idea, chances are that it will
quickly be polished by you gurus out there.

If so, it could be included in the documentation and emphasised
enough to give all newcomers (I was there not long ago) a useful tool

my 0.02c...

Regards,
Lars

PS. Maybe a better name would be floatcomp or something like that

hi, all

I can't repeat your results of the tests about 0.3. For example,
x = [0.1:0.1:0.4];
x(3) == 0.3

ans =

1

format long;
x-0.3

ans =

Columns 1 through 4

-0.20000000000000 -0.10000000000000 0 0.10000000000000

Column 5

0.20000000000000

so I wonder which version of matlab you are using? I'm using v6.5

[snip]

> Now, follow-up questions:

[snip]

> Alternately, would an "approximately equal" operator be useful?
> [It's easy to do using ABS and <= ... but would you use it if
> something like this

> I'm quite curious to see how people answer these questions. I know
> how various people here at The MathWorks would answer them, and I'd
> like to

To keep the heat on, someone has to oppose...

I would find such an operator handy. I would prefer that kind
of operator to the more elaborate
abs((a-b)./(a/2+b/2))<3*eps.

Would it lead to random, hard-to-track bugs in my programs? I don't
know. Would the code be a little harder or easier to read?

Then, in most fields, I guess we're using 'much larger/less than';
could that be the next step?
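As a sketch only, the relative test abs((a-b)./(a/2+b/2))<3*eps quoted in this post can be written without the division, which avoids a 0/0 when both values are zero. The name releq and its third argument are assumptions for illustration, not an existing function.

```matlab
% Hypothetical helper: relative comparison without the division.
% nEps is how many multiples of eps to allow, scaled by the larger magnitude.
releq = @(a, b, nEps) abs(a - b) <= nEps * eps * max(abs(a), abs(b));

t1 = releq(0.1 + 0.2, 0.3, 3);   % true: the values are one bit apart
t2 = releq(1, 1 + 1e-10, 3);     % false: far more than a few eps apart
t3 = releq(0, 0, 3);             % true

% The quotient form evaluates NaN < 3*eps for a = b = 0, which is false.
q3 = abs((0 - 0) ./ (0/2 + 0/2)) < 3*eps;
```

Scaling by the larger magnitude is one design choice among several; an absolute floor could be added for values expected to be near zero.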

This should be simple and I really need help.

My project consists of a list of healthy/failing companies.
Each company has 14 parameters that should help determine whether the
company is good or bad. I also have a vector of results (1 or 0) which
shows if the company is good or bad.
How do I use the correlation function in order to find which
parameters are redundant (have a high correlation with another