> I generally do analysis of data for dentistry researchers and now have a new
> dataset with multiple lines of treatment data per patient. I need to check for
> treatments done before and after other treatments over a period of time (dates
> are in the data). I have 6.9 million (yes!) lines of data but only 8
> I've looked at the mult response command, but this doesn't seem to do what I
> want. Besides, this has a limit in the number of rows and columns it produces
> and I think I would exceed that very rapidly.
One of the hardest things to do with a large data set is to
figure out what to do across a hierarchy -- multiple dates for
one person; possibly multiple episodes, each with multiple dates,
and each date with multiple records ... all for each person.
If you have an important purpose, you might pay big money
for a special database to combine records in the way that you
need, and still let you do tabulations. Shoot, you might pay
big money for the simpler ability just to *look* at the records
in an organized way. I think my hospital system pays an
annual fee that is at least a fraction of a million dollars.
So, you won't find a general database that can handle
all the desirable organization and *also* do complicated
> I have limited macro knowledge and would appreciate any help I can get on
You will probably need to use "aggregate" to create the new
records that you want. "lag" is also useful, where you want to
keep one value (or accumulate it) from one record to the next.
But you have to figure out, on your own, what it is that you
can put into the new records.
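
For what it's worth, here is a rough sketch of the sort, then lag, then
aggregate idea, written in Python rather than in SPSS syntax. Everything
in it is an assumption for illustration: the record layout (patient,
date, treatment code), the sample data, and the function name are all
made up. It counts one possible thing, namely which treatment directly
follows which within the same patient; whether that is the right thing
to count is still for you to decide.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (patient_id, treatment_date, treatment_code).
records = [
    (1, date(2001, 3, 5), "exam"),
    (2, date(2001, 1, 9), "exam"),
    (1, date(2001, 4, 2), "filling"),
    (1, date(2002, 1, 15), "crown"),
    (2, date(2001, 6, 30), "extraction"),
]


def count_consecutive_pairs(rows):
    """Count (previous treatment, next treatment) pairs within each patient."""
    pairs = Counter()
    prev_patient = prev_treatment = None
    # Sort by patient, then date, so each record follows its predecessor.
    for patient, _when, treatment in sorted(rows):
        # "Lag" step: carry the previous value only within the same patient.
        if patient == prev_patient:
            pairs[(prev_treatment, treatment)] += 1
        prev_patient, prev_treatment = patient, treatment
    # "Aggregate" step: the Counter holds one total per treatment pair.
    return pairs


pair_counts = count_consecutive_pairs(records)
for (before, after), n in sorted(pair_counts.items()):
    print(before, "->", after, n)
```

The sort matters: a lag only makes sense once each patient's records are
together and in date order. With millions of lines you would want to
stream the sorted file rather than hold it all in memory, but the logic
stays the same.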
What do you want to see counted? What needs to be
- sorry that I can't offer something more promising.
"Taxes are the price we pay for civilization." Justice Holmes.