Warren,

The MIXED procedure can handle multivariate repeated measures. I am

not totally clear on the design of the experiment (are there only

two plants species for the entire experiment, or does each plot have

two plant types, with different plots having different combinations

of plant species?). The experiment would be most easily analyzed

if there were only teo species in the entire experiment, but some

variant of the analysis presented below would extend to experiments

with more than one species.

As I understand the design, you have random herbivories. Nested

within herbivories, you have several ponds (the number of ponds may

vary across herbivories). Nested within ponds within herbivory are

plots. Within each plot you have measures of proportion of plot

coverage for each of the two species as well as proportion of open

water. One question here: will these proportions sum to 1.0? If

so, then you can only employ two of these three variables, the third

being a linear combination of the other two. If other plant species

may find their way into a plot, so that the sum of the three

proportions is not always 1.0, then you may use all three variables

as responses.

Now, let me assume that the sum of the three proportions would not

total 1.0. Let me further assume that there are only two plant

species with which the experiment is concerned so that every plot

is identical in experimental design. Now, construct a vector

response and indicator variable as

Response Indicator Observation

prop(species 1) 1 (herbivory 1, pond 1, plot 1, time=1)

prop(species 2) 2 (herbivory 1, pond 1, plot 1, time=1)

prop(open H2O) 3 (herbivory 1, pond 1, plot 1, time=1)

prop(species 1) 1 (herbivory 1, pond 1, plot 2, time=1)

prop(species 2) 2 (herbivory 1, pond 1, plot 2, time=1)

prop(open H2O) 3 (herbivory 1, pond 1, plot 2, time=1)

prop(species 1) 1 (herbivory 1, pond 1, plot 3, time=1)

prop(species 2) 2 (herbivory 1, pond 1, plot 3, time=1)

prop(open H2O) 3 (herbivory 1, pond 1, plot 3, time=1)

prop(species 1) 1 (herbivory 1, pond 2, plot 1, time=1)

prop(species 2) 2 (herbivory 1, pond 2, plot 1, time=1)

prop(open H2O) 3 (herbivory 1, pond 2, plot 1, time=1)

prop(species 1) 1 (herbivory 1, pond 2, plot 2, time=1)

prop(species 2) 2 (herbivory 1, pond 2, plot 2, time=1)

prop(open H2O) 3 (herbivory 1, pond 2, plot 2, time=1)

prop(species 1) 1 (herbivory 1, pond 2, plot 3, time=1)

prop(species 2) 2 (herbivory 1, pond 2, plot 3, time=1)

prop(open H2O) 3 (herbivory 1, pond 2, plot 3, time=1)

...

I don't know what covariates you might have for each observation nor

whether you have treatments which you apply to the different plots,

so the code below ignores those effects. We have just a multivariate

repeated measures intercept only model for each of the three responses.

If you have treatments or covariates which could affect the proportion

of each of the three responses over time, then those variables should

also be included not just as main effects, but as interactions with

species, time, and species by time interaction. Herewith, the code

which would account for the different correlation structures in your

data.

proc format; /* Just setting up a format which will aid */

value indic /* model interpretation. */

1 = 'Species 1'

2 = 'Species 2'

3 = 'Open H2O';

run;

proc mixed data=...;

class herbivory pond plot indicator time;

model response = indicator|time;

/* Note that this fixed effect is just an intercept for */

/* each species at any given time across all herbivories, */

/* ponds, and plots. */

random herbivory pond(herbivory) / group=indicator*time;

/* Note that intercepts are random for each herbivory and */

/* for each pond within a herbivory. I allow the variance */

/* of the herbivory effect to be separately estimated for */

/* each of the two species and for open water, as well as */

/* over time. Hopefully, you have enough */

/* observations to estimate the different variances and get */

/* reasonably stable estimates. You might find that 1) you */

/* don't have enough data to estimate different variances */

/* at each time (in which case you should see no structure */

/* to the variances, that is the variances vor a given */

/* species bounce around), or 2) that you do have sufficient*/

/* data to estimate variances and the variances are fairly */

/* stable with respect to time. In either of these cases, */

/* might then take time out of the group specification. */

/* You should be able to get a likelihood ratio test of the */

/* importance of time in the group specification by fitting */

/* the random effects specification shown above, and */

/* fitting a random effects specification with time removed */

/* from the group specification, comparing the difference */

/* in -2LL to a chi-square distribution with df the */

/* difference in the number of variances estimated by the */

/* two models. Similar comments apply to the plot effects. */

/* Within plot effects which reflect competition between */

/* species and correlations across time are taken into */

/* account in the repeated statement given next. */

repeated indicator*time / subject=plot(pond*herbivory)

type=un@un; /* or un@cs or un@ar*/

lsmeans indicator*time / diff;

/* lsmeans statement would allow you to test species 1 vs 2 */

/* proportions at time 1, time2, time3, as well as species */

/* 1 difference in proportions at time 1 vs time 2, at time */

/* 1 vs time 3, at time 2 vs time 3, etc. You could set up */

/* other contrasts, including multiple df tests of species */

/* effects at the different times, and time effects among */

/* the different species employing contrast statements. */

/* That is too much for me to code here, and you will likely*/

/* have other variables and other contrasts to test, so I */

/* leave all that to you. But such tests can be done. */

format indicator indic.

run;

Let me make a couple of comments about your response. Proportions

are constrained to lie on the range 0 to 1. The constraint on the

range is necessarily a violation of normality assumptions upon which

proc mixed is based. There are a number of other problems with

proportions as responses, among them nonconstant variance. Similar

issues would affect your ordinal values which are derived from

underlying proportions. Now, if the proportions of each of the

responses could never go to zero or one, then proportions are better

analyzed when subjected to some transformation. The arcsin

transformation is variance stabilizing, and it allows the response

to have range negative infinity to positive infinity. I would

really urge you to analyze the responses subjected to the arcsin

transformation, at least for purposes of testing. You can back

transform lsmeans for estimation purposes to get estimates on the

original range if you need to be able to interpret the results. I

would dismiss the ordinal response variable for this problem. It

throws away information and is, in fact, harder to analyze and harder

to interpret.

There are other transformations which can be used for proportions.

Draper and Smith also mention ln(Y/(1-Y)) which has the advantage

that the parameter estimates can be interpreted as log odds, but the

transformation is not variance stabilizing, so you must perform a

weighted regression. If the area used to determine the proportions

of the three responses is the same across all plots, then Y(1-Y)

would be the weight variable. If a different area is employed for

different plots, then the weight variable would be the Y(1-Y)

multiplied by plot area sampled.

Best wishes,

Dale

--- Warren Schlechte <wschl...@KTC.COM> wrote:

> Does SAS have a procedure that allows one to analyze the following?

> I have a nested-factorial design (Ponds within nested within

> herbivory,

> ponds crossed with plant type. Each pondXplant has 3 replicate

> plots. Two

> species of each plant are planted within each plot). Within each

> plot, 3

> measures will be taken, these measures are of an ordinal scale (0 to

> 7,

> roughly corresponding to the log-odds). Each measure will describe

> the

> proportion of a specific plant (or open water) within each plot. For

> example, within each plot there are 2 species of plant and open water

> (3

> categories), if each occupied 33% of the plot, we would record 3, 3,

> 3 for

> each category within that plot. Each plot will be measured each

> quarter for

> 2 years, suggesting a repeated measures component. The questions of

> interest relate to species composition within plots through time

> based on

> competition, and treatment differences associated with herbivory.

> If I had a single measure for each plot, I would think in terms of

> Proc

> MIXED, but I have 3 measures per plot per pond per time. The

> relatedness of

> the measures within the plot would seem to negate MIXED, but may not

> if

> someone could advise me. Because I have mutually exclusive and

> exhaustive

> categories within each plot, I thought maybe CATMOD would be the

> procedure

> to use, but my measures are not multinomial, they are ordinal. I

> could use

> GLM, but am concerned because it appears I have less ability to alter

> the

> covariance matrix.

> Any suggested directions would be appreciated.

> Warren Schlechte

> Heart of the Hills Research Station

...

**read more »**