Replace NA with grouped means in R? [duplicate]

Posted 2020-05-09 17:06

I am stuck trying to replace NAs with means and I would be very grateful for help.

I want to replace NAs in multiple columns of a dataframe with the mean of a group within the column. In the example below I would want to replace the NA in x1 with 14.5, since 13 and 16 are in month 1. The NA in x2 should be replaced with 4, since 3 and 5 are in month 2.

This is the way I tried it:

library(tidyverse)

df <- tibble(x1 = c(13, NA, 16, 17, 16, 12), x2 = c(1, 4, 4, 3, 5, NA),
         month = c(1, 1, 1, 2, 2, 2))

by_month <- group_by(df, month)

for (i in length(df)){
   for (j in nrow(df[[,i]])){
     if(is.na(df[[j, i]])){
      df[[j, i]] <- summarize(by_month[[j, i]],
                                   group_mean = mean(df[[, i]], na.rm=TRUE))
    }
    else{
      df[[j, i]] <- df[[j, i]]
    }
  }
}

However, I just get the Error 'argument "..1" is missing, with no default', which I investigated - but it didn't help. Any help would be great :)

2 Answers
ゆ 、 Hurt°
Reply #2 · 2020-05-09 17:21

Here is a base R solution using ave, and sapply-ing to each column x1 and x2.

df[1:2] <- sapply(df[1:2], function(x){
  ave(x, df[[3]], FUN = function(.x) {
    .x[is.na(.x)] <- mean(.x, na.rm = TRUE)
    .x
  })
})


df
## A tibble: 6 x 3
#     x1    x2 month
#  <dbl> <dbl> <dbl>
#1  13       1     1
#2  14.5     4     1
#3  16       4     1
#4  17       3     2
#5  16       5     2
#6  12       4     2
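
To see what ave is doing in the answer above, here is a minimal sketch on a single vector, using the x1 and month values from the question. The helper name fill_with_mean is made up for illustration and is not part of base R.

```r
# Values copied from the question's x1 and month columns
x1 <- c(13, NA, 16, 17, 16, 12)
month <- c(1, 1, 1, 2, 2, 2)

# Hypothetical helper: replace the NAs in a vector with that vector's own mean
fill_with_mean <- function(v) {
  v[is.na(v)] <- mean(v, na.rm = TRUE)
  v
}

# ave() splits x1 by month, applies the helper to each piece,
# and returns the results in the original row order
filled <- ave(x1, month, FUN = fill_with_mean)
filled
# [1] 13.0 14.5 16.0 17.0 16.0 12.0
```

Because ave returns a vector of the same length and order as its input, the result can be assigned straight back into the column.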
Ridiculous、
Reply #3 · 2020-05-09 17:31

I slightly changed your example, because the data frame you provided had columns of different lengths, but this should solve your problem:

First, I load the tidyverse packages, then group the data by month. The second pipe step runs mutate_all, which transforms every non-grouping column, so x1 and x2 are both handled automatically.

library(tidyverse)

df <- tibble(x1 = c(13, NA, 16, 17, 16, 12), x2 = c(1, 4, 3, 5, NA, 4),
             month = c(1, 1, 1, 2, 2, 2))


new_df <- df %>%  group_by(month) %>%
  # funs() is deprecated since dplyr 0.8.0; a formula lambda does the same job
  mutate_all(~ ifelse(is.na(.), mean(., na.rm = TRUE), .))
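
For newer dplyr versions, the same idea can be sketched with across(), which supersedes the mutate_all family. This is only one possible spelling and assumes dplyr 1.0.0 or later; base replace() is swapped in for ifelse() here, and the ungroup() at the end is an addition of mine.

```r
library(dplyr)

# Same data as in this answer
df <- tibble(x1 = c(13, NA, 16, 17, 16, 12),
             x2 = c(1, 4, 3, 5, NA, 4),
             month = c(1, 1, 1, 2, 2, 2))

new_df <- df %>%
  group_by(month) %>%
  # across(everything()) skips the grouping column, so only x1 and x2 change;
  # replace() fills the NA positions with the mean of the current group
  mutate(across(everything(),
                ~ replace(.x, is.na(.x), mean(.x, na.rm = TRUE)))) %>%
  ungroup()  # drop the grouping so later steps see a plain tibble

new_df
```

With this data, the NA in x1 becomes 14.5 (month 1) and the NA in x2 becomes 4.5 (month 2).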

Let me know if this is of any help.
