randomly sum values from rows and assign them to 2

I have a data.frame with 8 columns. One is for the list of subjects (one row per subject) and the other 7 rows are a score of either 1 or 0. This is what the data looks like:

>head(splitkscores)
  subject block3 block4 block5 block6 block7 block8 block9
1   40002      0      0      1      0      0      0      0
2   40002      0      0      1      0      0      1      1
3   40002      1      1      1      1      1      1      1
4   40002      1      1      0      0      0      1      0
5   40002      0      1      0      0      0      1      1
6   40002      0      1      1      0      1      1      1

I want to create a data.frame with 3 columns. One column for subjects. In the other two columns, one must have the sum of 3 or 4 randomly chosen numbers from each row of my data.frame (except the subject) and the other column must have the sum of the remaining values which were not chosen in the first random sample.

Help is much appreciated. Thanks in advance

标签： r random sum dataframe

2条回答

何必那么认真

2楼-- · 2019-08-13 19:03

Here's a neat and tidy solution free of unnecessary complexity (assume the input is called df):

chosen=sort(sample(setdiff(colnames(df),"subject"),sample(c(3,4),1)))
notchosen=setdiff(colnames(df),c("subject",chosen))
out=data.frame(subject=df$subject,
               sum1=apply(df[,chosen],1,sum),sum2=apply(df[,notchosen],1,sum))

In plain English: sample from the column names other than "subject", choosing a sample size of either 3 or 4, and call those column names chosen; define notchosen to be the other columns (excluding "subject" again, obviously); then return a data frame with the list of subjects, the sum of the chosen columns, and the sum of the non-chosen columns. Done.

0人赞添加讨论(0) 举报

太酷不给撩

3楼-- · 2019-08-13 19:07

I think this'll do it: [changed the way data were read in based on the other response because I made a manual mistake...]

   splitkscores <- read.table(text = "  subject block3 block4 block5 block6 block7 block8 block9
1   40002      0      0      1      0      0      0      0
2   40002      0      0      1      0      0      1      1
3   40002      1      1      1      1      1      1      1
4   40002      1      1      0      0      0      1      0
5   40002      0      1      0      0      0      1      1
6   40002      0      1      1      0      1      1      1", header = TRUE)

   df2 <- data.frame(subject = splitkscores$subject, sum3or4 = NA, leftover = NA)
   df2$sum3or4 <- apply(splitkscores[,2:ncol(splitkscores)], 1, function(x){
       sum(sample(x, sample(c(3,4),1), replace = FALSE))
     })
   df2$leftover <- rowSums(splitkscores[,2:ncol(splitkscores)]) - df2$sum3or4

   df2
     subject sum3or4 leftover
   1   40002       1        0
   2   40002       2        1
   3   40002       3        4
   4   40002       1        2
   5   40002       2        1
   6   40002       1        4

0人赞添加讨论(0) 举报

randomly sum values from rows and assign them to 2

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间