The questions here cannot really be settled in a one- or two-line answer.

: >Todd -- Your reply to Jerry's question below is very informative. I have a

: >question: how do you interpret these items in a multivariate linear

: >regression? For example, say one coefficient's p-value is .000 while another

: >variable's is .500. Does this imply that my entire model is at risk, or

: >should I just focus my analysis on those variables with low p-values? Thanks

: >for any replies...

- What model? What purpose? A coefficient's test is on the

contribution of that variable, when it is entered LAST. Two great

variables, not quite redundant, may give you great robustness for

replication, but the p-value could be NS for both of them -- and

either one will look good once you remove the other.
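A minimal simulation sketch of that point, assuming made-up data (the sample size, the noise levels, the seed and the helper name `ols_t` are all illustrative and come from nowhere in this thread). Two nearly redundant predictors both drive y. Fitted together, each coefficient is tested as if it were entered last, so its standard error balloons. Fitted alone, either one gets a large t.

```python
import numpy as np

def ols_t(X, y):
    """OLS of y on X plus an intercept; returns (coefs, std errors, t stats)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof                  # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se, beta / se

rng = np.random.default_rng(0)                    # arbitrary seed
n = 100
z = rng.normal(size=n)                            # shared underlying signal
x1 = z + 0.05 * rng.normal(size=n)                # two nearly redundant
x2 = z + 0.05 * rng.normal(size=n)                # measurements of z
y = x1 + x2 + 3.0 * rng.normal(size=n)            # both are real predictors

_, se_joint, t_joint = ols_t(np.column_stack([x1, x2]), y)
_, se_x1, t_x1 = ols_t(x1, y)
_, se_x2, t_x2 = ols_t(x2, y)

# Index 0 is the intercept; index 1 onward are the predictors.
print("both in model: t =", np.round(t_joint[1:], 2))
print("x1 alone:      t =", round(float(t_x1[1]), 2))
print("x2 alone:      t =", round(float(t_x2[1]), 2))
print("SE of x1, joint / alone =", round(float(se_joint[1] / se_x1[1]), 1))
```

With most seeds the two joint t statistics fall short of 2 while each single-predictor t is well above it, but the exact numbers depend on the draw. The inflation of the standard error is the stable part.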

: Your model is sort of okay as long as the p-value for the

: F-statistic is significant. After all, the p-values of the model

: and coefficients depend on many factors, among them a) the

: covariances with variables you've already included, b) the

: sample size, c) the type of regression you've chosen.

: If you don't choose dependent variables by divine guidance, but

: use a theoretical model (e.g. in economics), it's perfectly okay

: to include variables with p-values of 0.5, if you like to do so.

There should be a REASON for having a variable in a model, and that

should be more potent than the value of the p-value for one set of

data.

: If you don't know which variables you should include, use backward

: elimination. Starting with the saturated model (including

: all possible explanatory variables), this eliminates the least

: significant variables step by step. This doesn't always yield

: good results, so have a second look at the final model. Model

: selection isn't easy, after all.

- Stepwise is generally a bad idea, whether it is step-up or

step-down. With potent predictors, YES, it can get you a shorter

model. Do you need a shorter model? With hundreds of extra

variables being tested, which are not very potent, it can get you a

model that will perform worse than *chance* if it ignores

multiple, intercorrelated real predictors in favor of the variables

that reach ".05" by happenstance. See articles in my FAQ for

related arguments.
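A rough sketch of the "happenstance" part, using assumed numbers only (100 cases, 200 candidate predictors, pure-noise data, an arbitrary seed). It is not a stepwise procedure. It just screens each noise variable against y on its own and counts how many clear the nominal .05 bar that a stepwise routine would be choosing from.

```python
import numpy as np

rng = np.random.default_rng(1)        # arbitrary seed
n, k = 100, 200                       # cases, candidate predictors (assumed)
X = rng.normal(size=(n, k))           # predictors: pure noise
y = rng.normal(size=n)                # outcome: unrelated to every predictor

# Pearson r of each column with y, then the usual t statistic for r.
Xc = X - X.mean(axis=0)
yc = y - y.mean()
r = (Xc.T @ yc) / np.sqrt((Xc ** 2).sum(axis=0) * (yc @ yc))
t = r * np.sqrt((n - 2) / (1 - r ** 2))

T_CRIT = 1.984                        # approx. two-sided .05 cutoff, 98 df
false_hits = int(np.sum(np.abs(t) > T_CRIT))
print(false_hits, "of", k, "noise variables reach .05")
```

About 5% of the candidates, roughly 10 out of 200 here, should pass by chance alone, and every one of them is worthless for prediction in new data.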

--

http://www.pitt.edu/~wpilib/statfaq.html Univ. of Pittsburgh