The questions here are not actually settled in any one or two lines of
: >Todd -- Your reply to Jerry's question below is very informative. I have a
: >question: how do you interpret these items in a multivariate linear
: >regression? For example, say one coefficient's p-value is .000 while another
: >variable's is .500. Does this imply that my entire model is at risk, or
: >should I just focus my analysis on those variables with low p-values?
: >Thanks for any replies...
- What model? What purpose? A coefficient's test is a test of the
contribution of that variable when it is entered LAST. Two great
variables, not quite redundant, may give you great robustness for
replication, but the p-value could be NS for both of them -- and
either one will look good once you remove the other.
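
A minimal sketch of the "entered LAST" point, on simulated data. Everything in it is an assumption for illustration: the sample size, the noise levels, and the hypothetical helper ols_t_stats, which is a plain least-squares fit written with the standard library only. It fits two nearly redundant predictors together and then each one alone, and reports t-statistics rather than p-values (as a rough guide, |t| above about 2 corresponds to p below .05 at this sample size).

```python
import math
import random

def ols_t_stats(X, y):
    """Least-squares fit of y on the columns of X (hypothetical helper).

    X is a list of rows, each starting with 1.0 for the intercept.
    Returns (coefficients, t-statistics), one entry per column.
    """
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y | I]; Gauss-Jordan turns it into [I | b | (X'X)^-1].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))]
         + [1.0 if i == j else 0.0 for j in range(k)]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    b = [A[i][k] for i in range(k)]
    inv_diag = [A[i][k + 1 + i] for i in range(k)]
    rss = sum((y[r] - sum(X[r][j] * b[j] for j in range(k))) ** 2 for r in range(n))
    s2 = rss / (n - k)  # residual variance estimate
    t = [b[i] / math.sqrt(s2 * inv_diag[i]) for i in range(k)]
    return b, t

random.seed(0)
n = 50
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [a + 0.05 * random.gauss(0, 1) for a in x1]   # nearly a copy of x1
y = [a + b + random.gauss(0, 1) for a, b in zip(x1, x2)]

_, t_both = ols_t_stats([[1.0, a, b] for a, b in zip(x1, x2)], y)
_, t_x1 = ols_t_stats([[1.0, a] for a in x1], y)
_, t_x2 = ols_t_stats([[1.0, b] for b in x2], y)

print(f"both in model: t(x1) = {t_both[1]:.2f}, t(x2) = {t_both[2]:.2f}")
print(f"x1 alone:      t(x1) = {t_x1[1]:.2f}")
print(f"x2 alone:      t(x2) = {t_x2[1]:.2f}")
```

With both predictors in the model, each t-statistic only measures what that variable adds beyond the other, which is very little here, so both are usually small. Drop either one and the survivor carries the whole effect.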
: Your model is sort of okay as long as the p-value for the
: F-statistic is significant. After all, the p-values of model
: and coefficients depend on many factors, among them a) the
: covariances with variables you've already included, b) the
: sample size, c) the type of regression you've chosen.
: If you don't choose dependent variables by divine guidance, but
: use a theoretical model (e.g. in economics), it's perfectly okay
: to include variables with p-values of 0.5, if you like to do so.
There should be a REASON for having a variable in a model, and that
should be more potent than the value of the p-value for one set of
: If you don't know which variables you should include, use the
: backward elimination method. Starting with the saturated model (including
: all possible explanatory variables), this eliminates the least
: significant variables step by step. This doesn't always yield
: good results, so have a second look at the final model. Model
: selection isn't easy, after all.
- Stepwise is generally a bad idea, whether it is step-up or
step-down. With potent predictors, YES, it can get you a shorter
model. Do you need a shorter model? With hundreds of extra
variables being tested, which are not very potent, it can get you a
model that will perform worse than *chance* if it ignores
multiple, intercorrelated real predictors in favor of the variables
that reach ".05" by happenstance. See articles in my FAQ for
http://www.pitt.edu/~wpilib/statfaq.html Univ. of Pittsburgh
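
To make the ".05 by happenstance" point concrete, here is a small simulation sketch. It is not a full stepwise procedure, only the screening step that step-up selection starts from, and every number in it is an assumption chosen for illustration: 100 cases, 200 candidate predictors, and an outcome that is pure noise, so no candidate has any real relation to it. The helper slope_t is hypothetical and computes the usual t-statistic for a simple regression slope.

```python
import math
import random

def slope_t(x, y):
    """t-statistic for the slope in a simple regression of y on x (hypothetical helper)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)            # sample correlation
    return r * math.sqrt((n - 2) / (1 - r * r))

random.seed(1)
n, n_candidates = 100, 200
y = [random.gauss(0, 1) for _ in range(n)]    # outcome: pure noise
candidates = [[random.gauss(0, 1) for _ in range(n)]
              for _ in range(n_candidates)]   # predictors: also pure noise

t_values = [slope_t(x, y) for x in candidates]
T_CRIT = 1.984                                # two-sided .05 cutoff for 98 df
n_hits = sum(abs(t) > T_CRIT for t in t_values)
best = max(range(n_candidates), key=lambda j: abs(t_values[j]))

print(f"{n_hits} of {n_candidates} noise predictors pass the .05 screen")
print(f"first variable a step-up procedure would enter: #{best}, t = {t_values[best]:.2f}")
```

About 5% of the noise candidates should pass the screen by chance, and a step-up procedure will enter the most extreme of them first. A model built that way describes the sample, not anything that will replicate.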