My df1 is as follows:
df1 <- data.frame(A=c("a","b","c","d","e"), B=c("f","g","t","g","u"), C=c("M","NA","NA","NA","M"), D=c("A","NA","NA","NA","NA"), E=c("NA","NA","NA","NA","G"), G=c(1:5))
A B C D E G
1 a f M A NA 1
2 b g NA NA NA 2
3 c t NA NA NA 3
4 d g NA NA NA 4
5 e u M NA G 5
I want to add a column H based on the values in columns C, D and E. If all three are NA, H should be X. If any of them is not NA, H should be YES. The desired result is as follows:
A B C D E G H
1 a f M A NA 1 YES
2 b g NA NA NA 2 X
3 c t NA NA NA 3 X
4 d g NA NA NA 4 X
5 e u M NA G 5 YES
Could experts teach me how to do it efficiently with R?
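One way to do this, sketched here with transform, ifelse and is.na combined by the element-wise & operator. The first step, which turns the "NA" strings from the question into real NA values, is an assumption about how the data should be cleaned; skip it if your data already uses real NA.

```r
# Example data from the question; note the missing values are the string "NA".
df1 <- data.frame(A = c("a", "b", "c", "d", "e"),
                  B = c("f", "g", "t", "g", "u"),
                  C = c("M", "NA", "NA", "NA", "M"),
                  D = c("A", "NA", "NA", "NA", "NA"),
                  E = c("NA", "NA", "NA", "NA", "G"),
                  G = 1:5,
                  stringsAsFactors = FALSE)

# Assumed cleaning step: replace the string "NA" with a real missing value.
df1[df1 == "NA"] <- NA

# H is "X" when C, D and E are all missing, and "YES" otherwise.
df1 <- transform(df1, H = ifelse(is.na(C) & is.na(D) & is.na(E), "X", "YES"))
df1
```

Because is.na and & are both vectorized, the condition is evaluated for all rows at once, with no explicit loop.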
Note that this only works if the NAs are actually encoded as NA rather than the string "NA". See ?transform and ?ifelse for descriptions of those functions. The single & operator pairs the vector operands item by item and does the boolean operation on each pair, returning a logical vector.

For some variety, here is an alternate solution using apply that doesn't require calling is.na on each column separately, with the same caveat that Daniel notes about encoding missing values as NA rather than "NA".
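A minimal sketch of what such an apply-based version could look like (not necessarily the exact code this answer had in mind). It assumes the missing values in df1 are real NA, so the data is rebuilt that way here:

```r
# Example data with real NA values instead of the string "NA".
df1 <- data.frame(A = c("a", "b", "c", "d", "e"),
                  B = c("f", "g", "t", "g", "u"),
                  C = c("M", NA, NA, NA, "M"),
                  D = c("A", NA, NA, NA, NA),
                  E = c(NA, NA, NA, NA, "G"),
                  G = 1:5,
                  stringsAsFactors = FALSE)

# is.na() on the three columns gives a logical matrix;
# apply(..., 1, all) then checks, row by row, whether every entry is missing.
all_missing <- apply(is.na(df1[, c("C", "D", "E")]), 1, all)

df1$H <- ifelse(all_missing, "X", "YES")
df1
```

To check a different set of columns, only the column-name vector needs to change.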