This should be simple and I really need help.

My project consists of a list of healthy/failing companies.
Each company has 14 parameters that should help determine whether the
company is good or bad. I also have a vector of results (1 or 0) which
shows if the company is good or bad.
How do I use the correlation function in order to find which
parameters are redundant (have a high correlation with another
parameter)?

Thanks for your help.

I didn't answer this last time because I don't think I really understand the
problem.

There's a corrcoef function that you can use to compute correlations.  It
will tell you which parameters (which I assume are columns of a matrix) are
highly correlated with which other parameters.
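
As a rough sketch of what that looks like (the data here are made up, and the
0.9 cutoff is just an assumption to illustrate the idea, not a recommended
value), you can take the upper triangle of the correlation matrix and list
the pairs whose absolute correlation exceeds the cutoff:

```matlab
% Hypothetical data: 50 companies (rows) by 14 parameters (columns).
t = (1:50)';
X = sin(t * (1:14));                     % made-up, deterministic columns
X(:, 5) = 2*X(:, 1) + 0.1*X(:, 2);       % make column 5 nearly redundant with column 1

R = corrcoef(X);                         % 14-by-14 matrix of pairwise correlations

threshold = 0.9;                         % assumed cutoff, pick what suits your data
% Upper triangle only, so each pair shows up once and the diagonal is skipped.
[rowIdx, colIdx] = find(triu(abs(R) > threshold, 1));

for k = 1:numel(rowIdx)
    fprintf('X%d and X%d: r = %.3f\n', ...
            rowIdx(k), colIdx(k), R(rowIdx(k), colIdx(k)));
end
```

Each pair it prints is a candidate for dropping one of the two columns.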

There are two things this won't do.  The first is tell you which parameters
are highly correlated with combinations of other parameters.  That is,
having decided already to use X1 and X2, how correlated is X3 with them?
There are things called "partial correlation" and "canonical correlation"
that may be helpful here.
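
One simple way to get at the "X3 versus X1 and X2 together" question is
sketched below. Note that this is neither partial nor canonical correlation;
it is the squared multiple correlation (R^2) from an ordinary least-squares
fit, which needs only base MATLAB. The variables X1, X2 and X3 are made up
for illustration.

```matlab
% Hypothetical data: X3 is roughly a linear combination of X1 and X2.
t  = (1:50)';
X1 = sin(t);
X2 = cos(0.7*t);
X3 = 0.5*X1 - 2*X2 + 0.1*sin(3.1*t);

A    = [ones(size(X1)) X1 X2];   % intercept plus the predictors already chosen
coef = A \ X3;                   % least-squares fit of X3 on X1 and X2
res  = X3 - A*coef;              % the part of X3 that X1 and X2 do not explain

% Squared multiple correlation: fraction of X3's variance explained.
R2 = 1 - sum(res.^2) / sum((X3 - mean(X3)).^2);
fprintf('R^2 of X3 on [X1 X2] = %.3f\n', R2);
```

An R^2 near 1 suggests X3 is close to redundant given X1 and X2, even if its
correlation with either one alone is modest.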

The second thing this won't do is give you any information about adding
"more information."  It could be that X2 is correlated with X1, and yet X2
does add information beyond what X1 provides in determining whether the
result is good or bad.

Nowadays 14 is not such a big number, and computers are fast.  What I might
do if I had the Statistics Toolbox is use the glmfit function to perform
logistic regression on the binary good/bad variable using the 14 parameters
as predictors.  I might add them all, then consider dropping the one that
looked least significant and fitting again, and repeat as necessary.
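
Roughly, that drop-and-refit loop might look like the sketch below. It
assumes the Statistics Toolbox is installed; X and y are made-up data, and
the 0.05 cutoff and the stopping rule are my assumptions rather than a
recommendation.

```matlab
% Requires the Statistics Toolbox (glmfit). X and y below are hypothetical.
t = (1:200)';
X = sin(t * (1:14));                         % 200 companies by 14 made-up parameters
% Made-up 0/1 outcome driven mostly by columns 1 and 2, plus some disturbance.
y = double(X(:, 1) - X(:, 2) + 0.5*sin(17.3*t) > 0);

keep  = 1:size(X, 2);                        % columns still in the model
alpha = 0.05;                                % assumed significance cutoff

while numel(keep) > 1
    % Logistic regression of y on the columns that are still in the model.
    [b, dev, stats] = glmfit(X(:, keep), y, 'binomial', 'link', 'logit');
    p = stats.p(2:end);                      % skip the intercept's p-value
    [pWorst, idx] = max(p);
    if pWorst <= alpha
        break;                               % everything left looks significant
    end
    fprintf('Dropping X%d (p = %.3f)\n', keep(idx), pWorst);
    keep(idx) = [];                          % drop the least significant column
end
disp(keep);                                  % columns that survived
```

It is worth looking at the fit after each pass rather than trusting the loop
blindly, since correlated predictors can trade significance with each other.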

I hope that gives you some ideas.

-- Tom

Hi,

I'm working on my dissertation and I have come across a crazy quirk
in MATLAB that I was willing to write off before but has happened
twice now and it's REALLY starting to make me angry.

I am trying to group some row vectors based on the value in their
first column. Simple enough, right? Here's the snippet of code I'm
working with:

Now, when I run the program, I get the following outputs:

ans =
0.3000 0.3000

ans =
0.3000 0.3000

ans =
0.2000 0.3000

OK!

ans =
0.4000 0.3000

and so on. The correct value of ratio NEVER sets off the condition,
but the incorrect value always does. The plot thickens:

ans =
0.2000 0.2000

OK!

ans =
0.4000 0.4000

OK!

I have tried everything I can think of, and as ridiculous as it may
seem, it appears to me that some MATLAB functions are actually
incapable of recognizing the number 0.3 correctly. As I mentioned, I
had a similar problem about a month ago, with a very similar line of
code and again, the problem value was 0.3. I have messed around with
some other commands using that value, like adding and subtracting it,
using find to seek out that value in a matrix, and it works just
fine.

Please help me. If anyone has ever heard of such an unusual problem,
please let me know. I haven't seen it listed in any troubleshooting
notices or bug reports. I am not an idiot; I have been programming in
MATLAB for about two years, so I'm pretty confident in my abilities,
and I am positive I have not made a mistake. I am becoming EXTREMELY
angry, though, like I don't have enough else to worry about without a
piece of mathematical software that CAN'T RECOGNIZE A NUMBER!!
Maybe it's even a processor problem? Any help would be greatly
appreciated. Thanks.

Mark
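
One possible explanation, offered as a guess since the original snippet is
not shown: if ratio comes out of arithmetic rather than being typed in as
0.3, it may differ from 0.3 in the last bit. Most decimal fractions,
including 0.3, have no exact binary floating-point representation, and the
default short display rounds both values to 0.3000. The sketch below uses
made-up names (ratio, target, tol) and an assumed tolerance to show the
effect and a tolerance-based comparison:

```matlab
ratio  = 0.1 + 0.2;      % hypothetical: a value that was computed, not typed in
target = 0.3;

disp([ratio target])                        % short display hides the difference
fprintf('%.17g  %.17g\n', ratio, target);   % more digits reveal it

exactMatch = (ratio == target);             % false: the two doubles are not identical

tol        = 1e-9;                          % assumed tolerance, scale it to your data
closeMatch = abs(ratio - target) < tol;     % true: equal to within the tolerance
```

Values such as 0.2 and 0.4 can happen to come out exact for a given
calculation while 0.3 does not, which would match the pattern described
above.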