R: Add a column based on a condition check across three columns

Posted 2019-08-31 01:28

Question:

My df1 is as follows:

df1 <- data.frame(A=c("a","b","c","d","e"), B=c("f","g","t","g","u"), C=c("M","NA","NA","NA","M"), D=c("A","NA","NA","NA","NA"), E=c("NA","NA","NA","NA","G"), G=c(1:5))

  A B  C  D  E G
1 a f  M  A NA 1
2 b g NA NA NA 2
3 c t NA NA NA 3
4 d g NA NA NA 4
5 e u  M NA  G 5

I want to add a column H based on the values in columns C, D and E. If all three are NA, H should be X. If any of them is not NA, H should be YES. The expected result is as follows:

  A B  C  D  E G H
1 a f  M  A NA 1 YES
2 b g NA NA NA 2 X
3 c t NA NA NA 3 X
4 d g NA NA NA 4 X
5 e u  M NA  G 5 YES

How can I do this efficiently in R?

Answer 1:

transform(df1, H = ifelse(is.na(C) & is.na(D) & is.na(E), "X", "YES"))

Note that this only works if the NAs are actually encoded as NA rather than the string "NA".
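To illustrate the caveat: the data frame as posted in the question stores the string "NA", for which is.na() returns FALSE, so every row would come out as YES. The following is a minimal sketch of one way to convert those strings first. It assumes the columns are character vectors, which is why stringsAsFactors = FALSE is set explicitly (in R before 4.0 they would otherwise be factors); the helper variable cols is introduced here for illustration only.

```r
# Data as posted in the question: missing values are the string "NA".
df1 <- data.frame(
  A = c("a", "b", "c", "d", "e"),
  B = c("f", "g", "t", "g", "u"),
  C = c("M", "NA", "NA", "NA", "M"),
  D = c("A", "NA", "NA", "NA", "NA"),
  E = c("NA", "NA", "NA", "NA", "G"),
  G = 1:5,
  stringsAsFactors = FALSE
)

# Replace the literal string "NA" with a real missing value in C, D and E.
cols <- c("C", "D", "E")  # illustrative helper, not from the original post
df1[cols] <- lapply(df1[cols], function(x) {
  x[x == "NA"] <- NA
  x
})

# Now is.na() detects the missing values and the one-liner works as intended.
df1 <- transform(df1, H = ifelse(is.na(C) & is.na(D) & is.na(E), "X", "YES"))
```

If the data is read from a file, passing na.strings = "NA" to read.table or read.csv (the default) should avoid the problem at the source.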

See ?transform and ?ifelse for descriptions of those functions. The single & operator is vectorized: it pairs the operands element by element, applies logical AND to each pair, and returns a logical vector.
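A small sketch of how the element-wise & feeds into ifelse, using two made-up logical vectors (c_na and d_na are hypothetical names standing in for the results of is.na(C) and is.na(D)):

```r
# Hypothetical results of is.na() on two columns, three rows each.
c_na <- c(FALSE, TRUE, TRUE)
d_na <- c(FALSE, TRUE, FALSE)

# & combines the vectors position by position.
both_na <- c_na & d_na            # FALSE TRUE FALSE

# ifelse picks "X" where the test is TRUE and "YES" elsewhere.
h <- ifelse(both_na, "X", "YES")  # "YES" "X" "YES"
```

The double && operator, by contrast, is meant for single TRUE/FALSE values in if conditions and is not suitable for this column-wise test.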



Answer 2:

For some variety, here is an alternate solution using apply that doesn't require calling is.na on each column separately:

df1$H <- apply(df1[, 3:5], 1, function(x) if (!all(is.na(x))) "YES" else "X")

The same caveat noted in the first answer applies: missing values must be encoded as NA rather than the string "NA".
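Since the question asks about efficiency, here is a hedged sketch comparing the apply approach with a vectorized variant based on rowSums. The rowSums version is not proposed in either answer; it is included as one common alternative, and the variable names (h_apply, all_na, h_rowsums) are illustrative. The example assumes missing values are real NA.

```r
# Only the three relevant columns, with real NA values.
df1 <- data.frame(
  C = c("M", NA, NA, NA, "M"),
  D = c("A", NA, NA, NA, NA),
  E = c(NA, NA, NA, NA, "G"),
  stringsAsFactors = FALSE
)
cols <- c("C", "D", "E")

# Row-by-row check, as in the answer above.
h_apply <- apply(df1[, cols], 1, function(x) if (!all(is.na(x))) "YES" else "X")

# Vectorized check: count the non-missing values per row; zero means all NA.
all_na <- rowSums(!is.na(df1[, cols])) == 0
h_rowsums <- ifelse(all_na, "X", "YES")
```

Both give the same labels for this data. apply calls the function once per row, so the rowSums form is likely to be faster on large data frames, although that has not been measured here.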