I am not sure how to loop over each column to replace the NA values with the column mean. Replacing them for a single column with the following works well:
Column1[is.na(Column1)] <- round(mean(Column1, na.rm = TRUE))
The code for looping over columns is not working:
for(i in 1:ncol(data)){
data[i][is.na(data[i])] <- round(mean(data[i], na.rm = TRUE))
}
The values are not replaced. Can someone please help me with this?
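A likely cause, offered as a sketch rather than a confirmed diagnosis: data[i] returns a one-column data frame, not a vector, so mean() returns NA with a warning and the NA cells are simply overwritten with NA again. Extracting the column with [[ gives a plain numeric vector. The example data frame and its column names below are made up for illustration.

```r
# Hypothetical example data; the column names are illustrative only.
data <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 8))

for (i in seq_len(ncol(data))) {
  col <- data[[i]]                                   # numeric vector, not a data frame
  col[is.na(col)] <- round(mean(col, na.rm = TRUE))  # fill NAs with the rounded column mean
  data[[i]] <- col                                   # write the column back
}
```

This assumes every column is numeric; a character or factor column would need to be skipped.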
If DF is your data frame of numeric columns:
ADDED:
Using only base R, define a function that does it for one column and then lapply it to every column:
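A minimal sketch of that approach, assuming DF contains only numeric columns. The helper name na_to_mean and the example data are hypothetical, not taken from the original answer.

```r
# Hypothetical example data frame of numeric columns.
DF <- data.frame(x = c(1, NA, 3), y = c(10, 20, NA))

# Replace the NAs in one vector with that vector's mean (hypothetical helper name).
na_to_mean <- function(v) replace(v, is.na(v), mean(v, na.rm = TRUE))

# Apply it to every column, returning a new data frame and leaving DF untouched.
result <- replace(DF, TRUE, lapply(DF, na_to_mean))
```

Wrap the mean in round() if rounded replacements are wanted, as in the question.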
The last line could be replaced with the following if it's OK to overwrite the input:
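As a sketch of the overwriting variant, again with a hypothetical helper name and made-up data: assigning into DF[] replaces the columns while keeping DF a data frame.

```r
# Hypothetical example data frame of numeric columns.
DF <- data.frame(x = c(1, NA, 3), y = c(10, 20, NA))

# Replace the NAs in one vector with that vector's mean (hypothetical helper name).
na_to_mean <- function(v) replace(v, is.na(v), mean(v, na.rm = TRUE))

# DF[] keeps the data frame structure while every column is overwritten.
DF[] <- lapply(DF, na_to_mean)
```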
lapply can be used instead of a for loop.
This doesn't really have any advantages over the for loop, though maybe it's easier if you have non-numeric columns as well, in which case applying it to only the numeric columns is almost as easy.
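One way this could look when the data frame mixes numeric and non-numeric columns. The data frame d1 and its columns are assumptions made for illustration.

```r
# Hypothetical example with one character column and two numeric columns.
d1 <- data.frame(id = c("a", "b", "c"),
                 x  = c(1, NA, 3),
                 y  = c(NA, 4, 8),
                 stringsAsFactors = FALSE)

# Logical vector marking which columns are numeric.
num <- vapply(d1, is.numeric, logical(1))

# Fill NAs with the column mean in the numeric columns only.
d1[num] <- lapply(d1[num], function(v) ifelse(is.na(v), mean(v, na.rm = TRUE), v))
```

Dropping the num index and assigning to d1[] instead gives the plain all-numeric version.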
There is also a quick solution using the imputeTS package:
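A hedged sketch, assuming imputeTS version 3.0 or later is installed, where the function is named na_mean (older releases called it na.mean). The data frame df is made up for illustration, and the call is guarded so the snippet still runs without the package.

```r
# Hypothetical example data frame of numeric columns.
df <- data.frame(x = c(1, NA, 3), y = c(10, 20, NA))

# Only call the package if it is available.
if (requireNamespace("imputeTS", quietly = TRUE)) {
  # With the default option = "mean", NAs are replaced by the column mean.
  filled <- imputeTS::na_mean(df)
}
```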