In the following data frame,
col1 <- c("g1","g2","g3",NA,"g4",NA)
col2 <- c(NA,"a1","a2",NA,"a3","a4")
df1 <- data.frame(col1, col2)
I would like to replace the NA values in col1 with the corresponding values from col2. Is it correct to proceed by extracting the indices of the rows containing NA with
row <- which(is.na(col1))
and then extract the characters from col2 by
extract <- df1$col2[row]
After this I have no clue how to replace the NAs in col1 with the extracted characters.
Please help!
You don't need which. Just is.na(df1$col1) is sufficient, since it gives a logical index. The only problem with the dataset is that both columns are of class factor because of how you created the data.frame (in R versions before 4.0.0, data.frame() converts strings to factors by default). It would be better to pass stringsAsFactors=FALSE as an argument to data.frame(..) to get character columns. Otherwise, if the levels in col2 are not present in col1 during the replacement, you get this warning message:
# Warning message:
#In `[<-.factor`(`*tmp*`, is.na(df1$col1), value = c(1L, 2L, 3L, :
#invalid factor level, NA generated
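As a minimal sketch of the stringsAsFactors suggestion, assuming the same col1 and col2 as in the question, the data frame can be built with character columns from the start (df2 is only an illustrative name, not part of the original code):

```r
col1 <- c("g1", "g2", "g3", NA, "g4", NA)
col2 <- c(NA, "a1", "a2", NA, "a3", "a4")

# Keep the strings as character instead of converting them to factor
df2 <- data.frame(col1, col2, stringsAsFactors = FALSE)

# Inspect the class of each column; both should report "character"
sapply(df2, class)
```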
Here, I am converting the columns to character class before proceeding with the replacement, to avoid the above warning.
df1[] <- lapply(df1, as.character)
indx <- is.na(df1$col1)
df1$col1[indx] <- df1$col2[indx]
df1
#  col1 col2
#1   g1 <NA>
#2   g2   a1
#3   g3   a2
#4 <NA> <NA>
#5   g4   a3
#6   a4   a4
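Row 4 stays NA because both columns are NA there. For comparison, here is a sketch of the same replacement written with ifelse. This is an alternative, not the approach used above; it assumes character columns, and df3 is an illustrative name:

```r
df3 <- data.frame(
  col1 = c("g1", "g2", "g3", NA, "g4", NA),
  col2 = c(NA, "a1", "a2", NA, "a3", "a4"),
  stringsAsFactors = FALSE
)

# Take the value from col2 where col1 is NA, otherwise keep col1
df3$col1 <- ifelse(is.na(df3$col1), df3$col2, df3$col1)
df3$col1
```

The indexed assignment only touches the NA positions, while ifelse rebuilds the whole column; on character columns like these both should give the same values.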