comparing strength of effects between models (different dependent var, same independent var)

Dear newsgroup members,

We have a problem comparing the strength of regression coefficients of the
same independent variable in models with different dependent variables.
Below I will show two (related) examples:

Example 1.
Cross-sectional design. Dependent variables: two speed tasks (number of
seconds needed is recorded; the lower, the better, the faster) and tasks
differ in complexity/difficulty. Independent variable: age (0: young and 1:
old). Question: do older people particularly differ from younger people in
the difficult task (more so than in the easy task)?

Example 2.
Cross-sectional design. Dependent variables: one speed task (number of
seconds needed is recorded; the lower, the better, the faster) and a memory
task (number of words reproduced after presentation; the higher, the better
(the memory)). Independent variable: age (0: young and 1: old). Question: do
older people particularly differ from younger people in the speed task (more
so than in the memory task)?

How can these effects of age be compared? In the first example both
dependent variables are in seconds; in the second the scales differ
completely. It seems to me that standardisation of at least the dependent
variables is needed? But how do we test for a significant difference in the
effect of age?

Some have suggested MANOVA or GLM repeated measures designs. Is this the way
to go? And, foremost, should at least the dependent variable be standardised
then?

Thank you for any information.

Best regards,

Hans Bosma

Hi

> We have a problem comparing the strength of regression
> coefficients of the same independent variable in models with
> different dependent variables. Below I will show two
> (related) examples:
> Example 1.
> Cross-sectional design. Dependent variables: two speed tasks
> (number of seconds needed is recorded; the lower, the better,
> the faster) and tasks differ in complexity/difficulty.
> Independent variable: age (0: young and 1: old). Question: do
> older people particularly differ from younger people in the
> difficult task (more so than in the easy task)?

I'm not sure I see the overall difficulty here.  Analysis of
variance (or regression equivalent) to analyze the effects of
task, age, and their interaction on RT.  If all people performed
both tasks, then task is within-subject factor, otherwise,
between-subjects along with age.  In either case, significant
interaction would indicate greater difference between ages for
one of the tasks.  Examination of means and perhaps simple
effects of age for 2 tasks would inform the exact conclusion.
One complication is if the two tasks have dramatically different
RTs, in which case you would have to consider whether absolute or
proportional differences were of interest (e.g., significant
interaction could occur for difference of 15 sec vs. 20
sec for easy task, a 33% difference, vs. difference of 30 sec
vs. 40 sec for difficult task, still a 33% difference).  Ratio
scores could be calculated if task is within-subjects.
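
The absolute-versus-proportional point can be checked with a few lines of arithmetic. The sketch below is only an illustration in Python (not part of the SPSS analysis); it uses the four means from the example above, and the names `means` and `age_effects` are made up for this purpose. On the raw scale the age effect is twice as large for the difficult task, while on the ratio or log scale the two age effects are identical.

```python
import math

# Hypothetical cell means in seconds, taken from the example in the text.
means = {
    "easy": {"young": 15.0, "old": 20.0},
    "difficult": {"young": 30.0, "old": 40.0},
}

def age_effects(cell_means):
    """Return the age effect per task on three scales."""
    out = {}
    for task, m in cell_means.items():
        out[task] = {
            "absolute": m["old"] - m["young"],   # seconds
            "ratio": m["old"] / m["young"],      # proportional slowing
            "log_diff": math.log(m["old"]) - math.log(m["young"]),
        }
    return out

effects = age_effects(means)

# Interaction contrast = age effect (difficult) minus age effect (easy).
raw_interaction = effects["difficult"]["absolute"] - effects["easy"]["absolute"]
log_interaction = effects["difficult"]["log_diff"] - effects["easy"]["log_diff"]

print("raw-scale interaction contrast:", raw_interaction)
print("log-scale interaction contrast:", round(log_interaction, 12))
```

So whether these means show an "interaction" depends entirely on which scale is considered meaningful.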

> Example 2.
> Cross-sectional design. Dependent variables: one speed task
> (number of seconds needed is recorded; the lower, the better,
> the faster) and a memory task (number of words reproduced
> after presentation; the higher, the better (the memory)).
> Independent variable: age (0: young and 1: old). Question: do
> older people particularly differ from younger people in the
> speed task (more so than in the easy task)?

Separate anovas would indicate if significant differences
occurred for speeded task and not for memory task, perhaps the
easiest case to make.  If age affected both, there are tests for
the significance of differences between correlation coefficients
(see Quinn McNemar's classic text) that could be used.  Since r^2
reflects differences between means relative to total variability
in y, this would seem to parallel the idea of an interaction.

> Some have suggested MANOVA or GLM repeated measures designs.
> Is this the way to go? And, foremost, should at least the
> dependent variable be standardised then?

Repeated measures would appear to apply to 1st design, assuming
task within-subjects.  So, something like

MANOVA easy diff BY age(0 1) /WSF task(2) /PRINT=CELL
/WSD /DESIGN

The 1st anova is the default factorial.  Look for a significant
interaction.  The second is the simple effects of age at each
level of task. (I'm not certain whether to include TASK on the
WSD line for the 2nd design, hence the ???. I find SPSS can do
some weird things when all desired effects are listed for simple
effects).

The interaction above is equivalent to testing significance of
difference between difference scores, that is,

COMPUTE differ = diff - easy.
MANOVA differ BY age(0 1) /PRINT = CELL.

This equivalence suggests that to address the ratio question, one
could do something like,

COMPUTE ratio = diff/easy.
MANOVA ratio BY age(0 1) /PRINT = CELL.
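
For readers without SPSS, the same two comparisons can be sketched in Python. This is only an illustration under stated assumptions: the data are made up, `pooled_t` and `scores` are hypothetical helper names, and the test is the ordinary pooled-variance two-sample t-test on the per-person difference or ratio scores. With two groups and two tasks, the square of the t for the difference scores corresponds to the F for the age by task interaction.

```python
import math
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    sp2 = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / se, df

# Made-up (easy, diff) times in seconds per person, for illustration only.
young = [(14, 29), (16, 31), (15, 28), (17, 33)]
old = [(19, 41), (21, 38), (20, 42), (22, 43)]

def scores(pairs, kind):
    """Per-person difference (diff - easy) or ratio (diff / easy) scores."""
    if kind == "differ":
        return [d - e for e, d in pairs]
    return [d / e for e, d in pairs]

# Absolute question: do the old slow down by more seconds on the hard task?
t_diff, df = pooled_t(scores(old, "differ"), scores(young, "differ"))
# Proportional question: do the old slow down by a larger factor?
t_ratio, _ = pooled_t(scores(old, "ratio"), scores(young, "ratio"))

print("difference scores: t =", round(t_diff, 3), "df =", df)
print("ratio scores:      t =", round(t_ratio, 3), "df =", df)
```

In this made-up data set the difference scores separate the age groups much more clearly than the ratio scores do, which is the pattern one would expect when slowing is roughly proportional.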

For your 2nd situation (speed vs. recall), I would stick with the
difference between rs, because this standardizes the effects.

Best wishes
Jim

============================================================================
James M. Clark                          (204) 786-9757
Department of Psychology                (204) 774-4134 Fax
University of Winnipeg                  4L05D

============================================================================

> Dear newsgroup members,

> We have a problem comparing the strength of regression coefficients of the
> same independent variable in models with different dependent variables.
> Below I will show two (related) examples:

> Example 1.
> Cross-sectional design. Dependent variables: two speed tasks (number of
> seconds needed is recorded; the lower, the better, the faster) and tasks
> differ in complexity/difficulty. Independent variable: age (0: young and 1:
> old). Question: do older people particularly differ from younger people in
> the difficult task (more so than in the easy task)?

Jim gave a good answer.  He raised the prospect that
you have to worry about "scaling":  for instance, if the means
are very different, the age effect could be the same proportional
difference on both tasks, EQUAL in logic, yet show up as an
interaction if analyzed on the raw scale.

However, there is also the possibility that the tests are
notably different in their intrinsic reliabilities and validities.
I expect this sort of difficulty would be minor when the
tests are approximately parallel.  But it becomes a much
larger concern that has to be taken into account -- you
need additional statements on reliability, etc. -- when the
tests are different, as in example 2.

> Example 2.
> Cross-sectional design. Dependent variables: one speed task (number of
> seconds needed is recorded; the lower, the better, the faster) and a memory
> task (number of words reproduced after presentation; the higher, the better
> (the memory)). Independent variable: age (0: young and 1: old). Question: do
> older people particularly differ from younger people in the speed task (more
> so than in the easy task)?

[ snip, detail ]

--

http://www.pitt.edu/~wpilib/index.html

> Hi

> > We have a problem comparing the strength of regression
> > coefficients of the same independent variable in models with
> > different dependent variables. Below I will show two
> > (related) examples:

> > Example 1.
> > Cross-sectional design. Dependent variables: two speed tasks
> > (number of seconds needed is recorded; the lower, the better,
> > the faster) and tasks differ in complexity/difficulty.
> > Independent variable: age (0: young and 1: old). Question: do
> > older people particularly differ from younger people in the
> > difficult task (more so than in the easy task)?

> I'm not sure I see the overall difficulty here.  Analysis of
> variance (or regression equivalent) to analyze the effects of
> task, age, and their interaction on RT.  If all people performed
> both tasks, then task is within-subject factor, otherwise,
> between-subjects along with age.  In either case, significant
> interaction would indicate greater difference between ages for
> one of the tasks.  Examination of means and perhaps simple
> effects of age for 2 tasks would inform the exact conclusion.
> One complication is if the two tasks have dramatically different
> RTs, in which case you would have to consider whether absolute or
> proportional differences were of interest (e.g., significant
> interaction could occur for difference of 15 sec vs. 20
> sec for easy task, a 33% difference, vs. difference of 30 sec
> vs. 40 sec for difficult task, still a 33% difference).  Ratio
> scores could be calculated if task is within-subjects.

Thanks Jim for your knowledge and tips, but, although both speed tasks are
in seconds, they refer to somewhat different cognitive domains. Does this
not make Example 1 in a way similar to Example 2 (where seconds should be
compared with reproduced words)? And does this not imply that in
both examples a MANOVA could be considered, but only after standardising all
cognitive tests? I mean, a MANOVA could be used to compare effects of age on
any set of outcomes, but what does the interaction between task (outcome)
and age tell us without standardisation? The problem of choosing between
absolute and proportional differences also seems absent after having
standardised both tasks? Both then have mean 0 and sd 1. But maybe I am
confused here.

> > Example 2.
> > Cross-sectional design. Dependent variables: one speed task
> > (number of seconds needed is recorded; the lower, the better,
> > the faster) and a memory task (number of words reproduced
> > after presentation; the higher, the better (the memory)).
> > Independent variable: age (0: young and 1: old). Question: do
> > older people particularly differ from younger people in the
> > speed task (more so than in the easy task)?

> Separate anovas would indicate if significant differences
> occurred for speeded task and not for memory task, perhaps the
> easiest case to make.  If age affected both, there are tests for
> the significance of differences between correlation coefficients
> (see Quinn McNemar's classic text) that could be used.  Since r^2
> reflects differences between means relative to total variability
> in y, this would seem to parallel the idea of an interaction.

Jim, yes, age affects both tasks significantly, but which effect is
stronger? I do not know the Quinn McNemar test, but reading your lines, it
seems to apply to correlations only. Or does it have to do with the R-squares
of both models? It somehow seems that comparing R-squares might also be
an alternative or additional procedure for Example 1? But, foremost, how can
I test for statistically significant differences between R-squares for
non-nested models?

Thank you for any (further) information.

Hans

Hi

> > Hi

> > > We have a problem comparing the strength of regression
> > > coefficients of the same independent variable in models with
> > > different dependent variables. Below I will show two
> > > (related) examples:

> > > Example 1.
> > > Cross-sectional design. Dependent variables: two speed tasks
> > > (number of seconds needed is recorded; the lower, the better,
> > > the faster) and tasks differ in complexity/difficulty.
> > > Independent variable: age (0: young and 1: old). Question: do
> > > older people particularly differ from younger people in the
> > > difficult task (more so than in the easy task)?

> > I'm not sure I see the overall difficulty here.  Analysis of
> > variance (or regression equivalent) to analyze the effects of
> > task, age, and their interaction on RT.  If all people performed
> > both tasks, then task is within-subject factor, otherwise,
> > between-subjects along with age.  In either case, significant
> > interaction would indicate greater difference between ages for
> > one of the tasks.  Examination of means and perhaps simple
> > effects of age for 2 tasks would inform the exact conclusion.
> > One complication is if the two tasks have dramatically different
> > RTs, in which case you would have to consider whether absolute or
> > proportional differences were of interest (e.g., significant
> > interaction could occur for difference of 15 sec vs. 20
> > sec for easy task, a 33% difference, vs. difference of 30 sec
> > vs. 40 sec for difficult task, still a 33% difference).  Ratio
> > scores could be calculated if task is within-subjects.

> Thanks Jim for your knowledge and tips, but, although both
> speed tasks are in seconds, they refer to somewhat different
> cognitive domains. Does this not make Example 1 in a way
> similar to Example 2 (where seconds should be compared with
> reproduced words reproduced)? And, does this not imply that
> in both examples a MANOVA could be considered, but only after
> standardising all cognitive tests. I mean, a MANOVA could be
> used to compare effects of age on any set of outcomes, but
> what does the interaction between task (outcome) and age tell
> us without standardisation? The problems of choosing between
> absolute and proportional differences also seems absent,
> after having standardised both tasks? Both then have mean 0
> and sd 1. But maybe I am confused here.

I do not see any problem comparing RTs for different tasks or
conditions.  It is done all the time in cognitive psychology,
with the proviso about different baselines being an important
one.  Also, you have not specified what you mean by
standardizing.  Clearly, if you separately standardized all 4
groups (young/easy, young/diff, old/easy, old/diff), you would
eliminate all effects.  You could try standardizing easy and
difficult tasks, even across ages, but I wonder whether the
resulting interaction between age and task would be any different
than that obtained from analyzing the raw data.  It is also not
clear to me whether standardizing would handle the absolute
vs. proportional issue or simply mask it.  That is, just because
you have subtracted out the average RTs does not eliminate the
fact that there were differences in those average RTs.
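
The two kinds of standardizing mentioned above can be tried on a toy data set. The sketch below is an illustration in Python with made-up numbers (the `cells` dictionary and both helper functions are hypothetical), so it shows what each choice does mechanically and nothing about the real data.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list to mean 0 and sd 1 (population sd)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Made-up times in seconds; the difficult task is exactly twice the easy one.
cells = {
    ("young", "easy"): [14.0, 15.0, 16.0],
    ("old", "easy"): [19.0, 20.0, 21.0],
    ("young", "diff"): [28.0, 30.0, 32.0],
    ("old", "diff"): [38.0, 40.0, 42.0],
}

# (a) Standardize each of the 4 cells separately: every cell mean becomes 0,
#     so the age, task, and interaction effects all vanish.
per_cell_means = {k: mean(zscores(v)) for k, v in cells.items()}

# (b) Standardize within each task, pooling both age groups: the age gap
#     survives and is expressed in SD units of that task.
def age_gap_within_task(task):
    young, old = cells[("young", task)], cells[("old", task)]
    z = zscores(young + old)
    return mean(z[len(young):]) - mean(z[:len(young)])

gap_easy = age_gap_within_task("easy")
gap_diff = age_gap_within_task("diff")

print(per_cell_means)
print("age gap in SD units:", round(gap_easy, 3), round(gap_diff, 3))
```

Here the standardized age gaps come out equal for both tasks only because the made-up difficult-task scores are an exact doubling of the easy-task scores, so mean and spread scale together. With real data the spread need not scale with the mean, and the raw difference in average RTs is still there behind the standardized scores.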

> > > Example 2.
> > > Cross-sectional design. Dependent variables: one speed task
> > > (number of seconds needed is recorded; the lower, the better,
> > > the faster) and a memory task (number of words reproduced
> > > after presentation; the higher, the better (the memory)).
> > > Independent variable: age (0: young and 1: old). Question: do
> > > older people particularly differ from younger people in the
> > > speed task (more so than in the easy task)?

> > Separate anovas would indicate if significant differences
> > occurred for speeded task and not for memory task, perhaps the
> > easiest case to make.  If age affected both, there are tests for
> > the significance of differences between correlation coefficients
> > (see Quinn McNemar's classic text) that could be used.  Since r^2
> > reflects differences between means relative to total variability
> > in y, this would seem to parallel the idea of an interaction.

> Jim, yes, age affects both tasks significantly, but which
> effect is stronger? I do not know the Quinn McNemar test, but
> reading your lines, it seems to apply to correlations only or
> does it have to do with the R-squares between both models. It
> somehow seems that comparing R-squares might also be a
> alternative or additional procedure for Example 1? But,
> foremost, how can I test for statistically significant
> differences between R-squares for non-nested models?

McNemar's text (Psychological Statistics, 4th ed., 1969, pp.
157ff) discusses a number of statistical tests (some attributed
to Fisher) for testing significance of differences between rs
(also some for differences between regression coefficients).
Assuming that you have data from the same people on the different
tasks (whether ex. 1 or 2), then you have 3 correlations,

r12 = r(age, depvar1)
r13 = r(age, depvar2)
r23 = r(depvar1, depvar2)

Formula 10.7 in McNemar (p. 158) is:

t = (r12 - r13) * sqrt( (N-3)*(1+r23) )
    / sqrt( 2*(1 - r12^2 - r13^2 - r23^2 + 2*r12*r13*r23) )

this t has df = N-3
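
For convenience, the formula above can be written as a small function. This is a minimal sketch in Python that simply transcribes the expression as given here; the function name `dependent_r_t` and the input values are made up for illustration, and the p-value would still have to be looked up in a t table with N-3 df.

```python
import math

def dependent_r_t(r12, r13, r23, n):
    """t for the difference between r12 and r13 measured on the same n people.

    r12 = r(age, depvar1), r13 = r(age, depvar2), r23 = r(depvar1, depvar2).
    Returns (t, df) with df = n - 3, following the formula quoted in the text.
    """
    numerator = (r12 - r13) * math.sqrt((n - 3) * (1 + r23))
    determinant = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    return numerator / math.sqrt(2 * determinant), n - 3

# Illustrative values only, not results from any real data set.
t, df = dependent_r_t(r12=0.5, r13=0.3, r23=0.4, n=103)
print("t =", round(t, 4), "df =", df)
```

Note that if one task is scored so that higher is better and the other so that lower is better, the two rs will have opposite signs, so one variable would need to be reflected before the difference r12 - r13 is meaningful.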

In interpreting this, it is important to remember that r12 and
r13 are measures of the difference between the means of the two
groups relative to the variability within groups (e.g., the ts
for these rs are equivalent to the ts or sqrt(F) for the
differences between the means).  Thus, a significant difference
between r12 and r13 would appear to represent a difference in the
magnitude/strength of the differences you want to compare.  The
test would appear to apply to both Ex. 1 and Ex. 2, and it might
be interesting to compare this to the interaction term suggested
earlier or your comparison of standardized (within tasks) scores.
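
The equivalence between the t for r and the t for the difference between means can be verified numerically. The sketch below is an illustration in Python with made-up scores; `pearson_r`, `t_from_r`, and `pooled_t` are hypothetical helper names. It assumes age is coded 0/1 and uses the pooled-variance two-sample t-test.

```python
import math
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson correlation (point-biserial when x is coded 0/1)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, n):
    """t statistic for testing r = 0, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def pooled_t(group_a, group_b):
    """Two-sample t statistic with pooled variance."""
    na, nb = len(group_a), len(group_b)
    sp2 = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(sp2 * (1 / na + 1 / nb))

# Made-up data: age coded 0 (young) / 1 (old), y = seconds on one task.
age = [0, 0, 0, 0, 1, 1, 1, 1]
y = [14, 16, 15, 17, 19, 21, 20, 22]

r = pearson_r(age, y)
t_r = t_from_r(r, len(y))
t_means = pooled_t([v for a, v in zip(age, y) if a == 1],
                   [v for a, v in zip(age, y) if a == 0])

print("r =", round(r, 4))
print("t from r =", round(t_r, 4), " t from means =", round(t_means, 4))
```

The two t values agree, which is why comparing r12 with r13 amounts to comparing two standardized group differences.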

McNemar elsewhere notes that the test for significance of the
difference between rs is NOT equivalent to the test for the
difference between regression coefficients, but he only gives a
regression test for coefficients based on independent samples
(i.e., a separate N1 and N2).  That would not appear to apply to
your situation where the same subjects performed the two tasks.
Perhaps someone else knows whether such a test exists for
regressions based on the same subjects.

Best wishes
Jim

> Hi

> > > Hi

> > > > We have a problem comparing the strength of regression
> > > > coefficients of the same independent variable in models with
> > > > different dependent variables. Below I will show two
> > > > (related) examples:

> > > > Example 1.
> > > > Cross-sectional design. Dependent variables: two speed tasks
> > > > (number of seconds needed is recorded; the lower, the better,
> > > > the faster) and tasks differ in complexity/difficulty.
> > > > Independent variable: age (0: young and 1: old). Question: do
> > > > older people particularly differ from younger people in the
> > > > difficult task (more so than in the easy task)?

> > > I'm not sure I see the overall difficulty here.  Analysis of
> > > variance (or regression equivalent) to analyze the effects of
> > > task, age, and their interaction on RT.  If all people performed
> > > both tasks, then task is within-subject factor, otherwise,
> > > between-subjects along with age.  In either case, significant
> > > interaction would indicate greater difference between ages for
> > > one of the tasks.  Examination of means and perhaps simple
> > > effects of age for 2 tasks would inform the exact conclusion.
> > > One complication is if the two tasks have dramatically different
> > > RTs, in which case you would have to consider whether absolute or
> > > proportional differences were of interest (e.g., significant
> > > interaction could occur for difference of 15 sec vs. 20
> > > sec for easy task, a 33% difference, vs. difference of 30 sec
> > > vs. 40 sec for difficult task, still a 33% difference).  Ratio
> > > scores could be calculated if task is within-subjects.

> > Thanks Jim for your knowledge and tips, but, although both
> > speed tasks are in seconds, they refer to somewhat different
> > cognitive domains. Does this not make Example 1 in a way
> > similar to Example 2 (where seconds should be compared with
> > reproduced words reproduced)? And, does this not imply that
> > in both examples a MANOVA could be considered, but only after
> > standardising all cognitive tests. I mean, a MANOVA could be
> > used to compare effects of age on any set of outcomes, but
> > what does the interaction between task (outcome) and age tell
> > us without standardisation? The problems of choosing between
> > absolute and proportional differences also seems absent,
> > after having standardised both tasks? Both then have mean 0
> > and sd 1. But maybe I am confused here.

> I do not see any problem comparing RTs for different tasks or
> conditions.  It is done all the time in cognitive psychology,
> with the proviso about different baselines being an important
> one.  Also, you have not specified what you mean by
> standardizing.  Clearly, if you separately standardized all 4
> groups (young/easy, young/diff, old/easy, old/diff), you would
> eliminate all effects.  You could try standardizing easy and
> difficult tasks, even across ages, but I wonder whether the
> resulting interaction between age and task would be any different
> than that obtained from analyzing the raw data.  It is also not
> clear to me whether standardizing would handle the absolute
> vs. proportional issue or simply mask it.  That is, just because
> you have subtracted out the average RTs does not eliminate the
> fact that there were differences in those average RTs.

Jim, many thanks again for your tips (mentioned above and below).

We will try several ways of analysing these data and further think about

Hans

Hello everyone,

I am writing because I was wondering what procedure I could use in SAS
or other statistical software to analyze a dependent variable that has
many zeros, if at the same time I want to use fixed effects or
clustering in the regression equation (because it is a very short
[i.e. 2-period] panel).  The fixed effects or clustering does not
allow one to use Tobit.  I saw a reference to symmetrically trimmed
least squares for this situation.  Any ideas how I could do this or an
alternative estimation for this problem?

Thank you for your help,
matt