Analysing multiple lines of data per subject

Post by Philip Feel » Wed, 16 Jul 2003 07:03:10



I generally do analysis of data for dentistry researchers and now have a new
dataset with multiple lines of treatment data per patient. I need to check for
treatments done before and after other treatments over a period of time (dates
are in the data). I have 6.9 million (yes!) lines of data but only 8
variables.

I've looked at the mult response command, but this doesn't seem to do what I
want. Besides, this has a limit in the number of rows and columns it produces
and I think I would exceed that very rapidly.

I have limited macro knowledge and would appreciate any help I can get on
this.

Thanks!
Phil

 
 
 


Post by Rich Ulric » Fri, 18 Jul 2003 06:09:12




> I generally do analysis of data for dentistry researchers and now have a new
> dataset with multiple lines of treatment data per patient. I need to check for
> treatments done before and after other treatments over a period of time (dates
> are in the data). I have 6.9 million (yes!) lines of data but only 8
> variables.

> I've looked at the mult response command, but this doesn't seem to do what I
> want. Besides, this has a limit in the number of rows and columns it produces
> and I think I would exceed that very rapidly.

One of the hardest things to do for a large data set is to
figure out what to do across a hierarchy -- multiple dates for
one person;  possibly multiple Episodes, each with multiple dates,
and each date with multiple records.... for each person.

If you have an important purpose, you might pay big money
for a special database to combine records in the way that you
need, and still let you do tabulations.  Shoot, you might pay
big money for that simpler ability to *look*  at the records
in an organized way.  I think my hospital system pays an
annual fee that is at least a fraction of a million dollars.

So, you won't find a general database that can handle
all the desirable organization and *also*  do complicated
statistics.  

> I have limited macro knowledge and would appreciate any help I can get on
> this.

You will probably need to use "aggregate"  to create the new
records that you want.  "lag"  is also useful, where you want to
keep one value (or accumulate it)  from one record to the next.
But you have to figure out, on your own, what it is that you
can put into the new records.
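
To make the "lag" and "aggregate" idea concrete, here is a minimal sketch in Python rather than SPSS syntax. It assumes each record boils down to a patient id, a date, and a treatment code; those field names and the sample values are hypothetical, not taken from the poster's file. The logic is: sort by patient and date, carry the previous treatment forward within each patient (the "lag" step), then count each before/after pair (the "aggregate" step).

```python
from collections import Counter
from datetime import date
from itertools import groupby
from operator import itemgetter

# Hypothetical records: (patient_id, treatment_date, treatment_code).
records = [
    (1, date(2001, 3, 1), "exam"),
    (1, date(2001, 5, 9), "filling"),
    (2, date(2002, 1, 15), "exam"),
    (1, date(2002, 2, 2), "crown"),
    (2, date(2002, 6, 30), "extraction"),
]


def count_transitions(rows):
    """Count (previous treatment, next treatment) pairs within each patient."""
    ordered = sorted(rows, key=itemgetter(0, 1))  # sort by patient, then date
    pairs = Counter()
    for _, visits in groupby(ordered, key=itemgetter(0)):
        previous = None  # reset the carried-forward value at each new patient
        for _, _, treatment in visits:
            if previous is not None:
                pairs[(previous, treatment)] += 1
            previous = treatment
    return pairs


transitions = count_transitions(records)
for (before, after), n in sorted(transitions.items()):
    print(f"{before} -> {after}: {n}")
```

The resulting table of pairs is small no matter how many input lines there are, which is the point of aggregating first and cross-tabulating afterwards. This version only looks at immediately adjacent treatments; checking for "any later treatment" would need a different loop.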

What do you want to see counted?  What needs to be
cross-tabulated?

 - sorry that I can't offer something more promising.
--

http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization."  Justice Holmes.

 
 
 

1. spreadsheet has multiple lines per subject

Q.  How does one use an SPSS data spreadsheet that has multiple lines per
subject?

Specifically, each subject's data has 25 lines, each containing 5 variables.
The first variable in each line names the condition (stimulus) used to generate
the rest of the data on that line.  If this data file was a TEXT file, I would
write data list syntax, including the specifier "RECORDS = 25", and define each
variable by line "/1.../2.../3.../25" and column position (using fixed format),
or by serial position (using free field format).   However, I can find no way to
do this with a spreadsheet data file.  When it comes to spreadsheets, the only
format I have ever used lists all data on a single line per subject.  Is there
some command one uses to tell SPSS how many lines per subject there are in a
spreadsheet?

In case it makes a difference to your answer, the analyses I want to do
involve scale development -- factor analysis, internal consistency
reliability, creation of composite variables (summing items on different
lines), etc.

If there is no simple solution to this question, what is the easiest way to
collapse the 25 lines per subject into one line per subject (with all 125
variables on that single line)?

Thanks for any help with this.

John Poole

--
***************************************
John H. Poole, Ph.D.
Department of Psychiatry
University of California Medical Center
4150 Clement Street (116C)
San Francisco, CA 94061, USA

Phone: 650-281-8851   Fax: 415-750-6996

***************************************
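
For the question above about collapsing many lines per subject into one, the restructuring logic can be sketched as follows. This is Python, not SPSS syntax, and it rests on assumptions: that every line carries a subject identifier (the post does not say so), that the first remaining variable names the condition, and that the other four are item scores. The names and values are made up for illustration. Recent SPSS versions may offer a built-in restructuring command (CASESTOVARS) for the same job; check the documentation for your version.

```python
from collections import defaultdict

# Hypothetical long-format rows: (subject_id, condition, v1, v2, v3, v4).
long_rows = [
    ("s01", "tone", 3, 5, 2, 4),
    ("s01", "light", 1, 2, 2, 5),
    ("s02", "tone", 4, 4, 3, 1),
    ("s02", "light", 2, 5, 1, 3),
]


def to_wide(rows):
    """Collapse several rows per subject into one dict of columns per subject."""
    wide = defaultdict(dict)
    for subject, condition, *values in rows:
        for i, value in enumerate(values, start=1):
            # Column name combines the condition and the item position.
            wide[subject][f"{condition}_v{i}"] = value
    return dict(wide)


wide = to_wide(long_rows)
print(wide["s01"])
```

Naming each new column after its condition, rather than after the line number, keeps the result correct even if the conditions appear in a different order for different subjects.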
