As an example of my data, I have GROUP 1 with three rows of data, and GROUP 2 with two rows of data, in a data frame:
GROUP  VARIABLE 1  VARIABLE 2  VARIABLE 3
1      2           6           5
1      4           NA          1
1      NA          3           8
2      1           NA          2
2      9           NA          NA
I would like to sample a single value per column from GROUP 1 to make a new row representing GROUP 1. I do not want to sample one complete row from GROUP 1; the sampling needs to occur independently for each column. I would like to do the same for GROUP 2. The sampling should also exclude NAs, unless every row for that group's variable is NA (such as GROUP 2, VARIABLE 2, above).
For example, after sampling, I could have as a result:
GROUP  VARIABLE 1  VARIABLE 2  VARIABLE 3
1      4           6           1
2      9           NA          2
Only GROUP 2, VARIABLE 2, can result in NA here. I actually have 39 groups, 50,000+ variables, and a substantial number of NAs. I would sincerely appreciate code that builds a new data frame with one row per group, each row holding that group's sampling results.
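To make the intended behaviour concrete, here is a minimal sketch of the sampling rule in plain Python. It is only an illustration under assumptions, not a finished solution: the table is modelled as a list of dicts rather than a real data frame, NA is represented as `None`, and the function name `sample_per_group` and its parameters are hypothetical.

```python
import random

NA = None  # assumption: missing values are represented as None


def sample_per_group(rows, group_key="GROUP", rng=random):
    """Return one row per group, sampling each column independently.

    NA values are never drawn; a column is NA in the result only when
    every value of that column within the group is NA.
    """
    if not rows:
        return []
    # Assumption: every row has the same columns as the first row.
    columns = [c for c in rows[0] if c != group_key]

    # pools[group][column] holds the non-NA values seen for that pair.
    pools = {}
    for row in rows:
        group = row[group_key]
        if group not in pools:
            pools[group] = {c: [] for c in columns}
        for col in columns:
            if row[col] is not NA:
                pools[group][col].append(row[col])

    # Draw one value per (group, column); fall back to NA for empty pools.
    result = []
    for group, group_pools in pools.items():
        new_row = {group_key: group}
        for col in columns:
            values = group_pools[col]
            new_row[col] = rng.choice(values) if values else NA
        result.append(new_row)
    return result


# The example data from above.
rows = [
    {"GROUP": 1, "VARIABLE 1": 2, "VARIABLE 2": 6, "VARIABLE 3": 5},
    {"GROUP": 1, "VARIABLE 1": 4, "VARIABLE 2": NA, "VARIABLE 3": 1},
    {"GROUP": 1, "VARIABLE 1": NA, "VARIABLE 2": 3, "VARIABLE 3": 8},
    {"GROUP": 2, "VARIABLE 1": 1, "VARIABLE 2": NA, "VARIABLE 3": 2},
    {"GROUP": 2, "VARIABLE 1": 9, "VARIABLE 2": NA, "VARIABLE 3": NA},
]
sampled = sample_per_group(rows)
for row in sampled:
    print(row)
```

With this data, GROUP 2 always gets NA for VARIABLE 2 and 2 for VARIABLE 3, while every other cell is drawn at random from the non-NA values of its group and column. The same rule (drop NAs, then draw one value, else return NA) should carry over to a per-group, per-column operation in a data frame library.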