I have a dataset consisting of variables ObservationNumber
, MeasurementNumber
, SubjectID
, and many dummy variables.
I would like to consolidate all non-zero values into one row by SubjectID GroupNumber.
Have:
ObsNum MeasurementNum SubjectID Dummy0 Dummy1 ... Dummy999
----------------------------------------------------...---------------
01 1 1 0 1 ... 0
02 2 1 0 1 ... 0
03 3 1 0 1 ... 0
04 4 1 0 0 ... 0
05 5 1 - - ... -
06 6 1 0 0 ... 0
07 1 2 1 0 ... 0
08 2 2 0 0 ... 0
09 3 2 0 1 ... 0
10 4 2 1 0 ... 0
11 4 2 0 1 ... 0
12 5 2 0 0 ... 1
13 6 2 0 0 ... 0
14 6 2 0 0 ... 1
15 6 2 0 0 ... 0
16 6 2 0 0 ... 0
17 6 2 0 1 ... 0
18 6 2 0 0 ... 0
19 6 2 0 0 ... 0
20 6 2 0 0 ... 0
21 6 2 1 0 ... 0
22 1 3 1 0 ... 0
23 2 3 0 1 ... 0
24 3 3 0 0 ... 1
25 4 3 - - ... -
26 5 3 0 0 ... 0
27 6 3 0 0 ... 0
28 1 4 - - ... -
29 2 4 0 0 ... 0
30 3 4 0 1 ... 0
31 4 4 1 0 ... 0
32 4 4 0 1 ... 0
33 4 4 0 0 ... 1
34 5 4 0 0 ... 1
35 6 4 0 1 ... 0
36 6 4 0 0 ... 1
Want:
MeasurementNum SubjectID Dummy0 Dummy1 ... Dummy999
----------------------------------------------------...---------------
1 1 0 1 ... 0
2 1 0 1 ... 0
3 1 0 1 ... 0
4 1 0 0 ... 0
5 1 - - ... -
6 1 0 0 ... 0
1 2 1 0 ... 0
2 2 0 0 ... 0
3 2 0 1 ... 0
4 2 1 1 ... 0
5 2 0 0 ... 1
6 2 1 1 ... 1
1 3 1 0 ... 0
2 3 0 1 ... 0
3 3 0 0 ... 1
4 3 - - ... -
5 3 0 0 ... 0
6 3 0 0 ... 0
1 4 - - ... -
2 4 0 0 ... 0
3 4 0 1 ... 0
4 4 1 1 ... 1
5 4 0 0 ... 1
6 4 0 1 ... 1
Each SubjectID has six measurement in which a set of dummyvariables are measured without outcome 0, 1 or missing. If a missing value occurs, all dummy variables for the respective observation are missing--and only one observation will be present in the dataset for that `MeasurementNumber.
I have tried to use the UPDATE
statement, but it seems to not be able to deal with '0' and '-'.
Is there a direct way of condensing all dummyvariables in this dataset for each SubjectID
grouped by MeasurementNumber
?