Question:

I want to calculate means over several columns for each row in my dataframe containing missing values, and place results in a new column called 'means.' Here's my dataframe:

df <- data.frame(A=c(3,4,5),B=c(0,6,8),C=c(9,NA,1))
  A B  C
1 3 0  9
2 4 6 NA
3 5 8  1

The code below successfully accomplishes the task if columns have no missing values, such as columns A and B.

 library(dplyr)
 df %>%
 rowwise() %>%
 mutate(means=mean(A:B, na.rm=T))

     A     B     C   means
  <dbl> <dbl> <dbl> <dbl>
1     3     0     9   1.5
2     4     6    NA   5.0
3     5     8     1   6.5

However, if a column has missing values, such as C, then I get an error:

> df %>% rowwise() %>% mutate(means=mean(A:C, na.rm=T))
Error: NA/NaN argument

Ideally, I'd like to implement it with dplyr.

Answer 1:

df %>% 
  mutate(means=rowMeans(., na.rm=TRUE))

The . is a "pronoun" that references the data frame df that was piped into mutate.

  A B  C    means
1 3 0  9 4.000000
2 4 6 NA 5.000000
3 5 8  1 4.666667

You can also select only specific columns to include, using all the usual methods (column names, indices, grep, etc.).

df %>% 
  mutate(means=rowMeans(.[ , c("A","C")], na.rm=TRUE))

  A B  C means
1 3 0  9     6
2 4 6 NA     4
3 5 8  1     3
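Since the question used the A:C range notation, here is a minimal sketch of the same rowMeans approach with dplyr's select() choosing the columns by range. The name res_range is only illustrative, and this assumes the magrittr pipe (%>%) so that the . pronoun is available.

```r
library(dplyr)

df <- data.frame(A = c(3, 4, 5), B = c(0, 6, 8), C = c(9, NA, 1))

# select(., A:C) picks the columns from A through C by their position in df;
# rowMeans() then averages each row, skipping NA values.
res_range <- df %>%
  mutate(means = rowMeans(select(., A:C), na.rm = TRUE))

print(res_range)
```

Here A:C is interpreted by select() as a range of columns, which is what the question intended.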

Answer 2:

It is simple to accomplish in base R as well:

cbind(df, "means"=rowMeans(df, na.rm=TRUE))
  A B  C    means
1 3 0  9 4.000000
2 4 6 NA 5.000000
3 5 8  1 4.666667

rowMeans performs the calculation and accepts the na.rm argument to skip missing values, while cbind lets you bind the means, under whatever name you want, to the data.frame df.
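As a small variation, the sketch below assigns the column directly instead of using cbind, and shows what rowMeans returns when a row has no observed values at all. The all_missing data frame and the nan_result name are made up for illustration.

```r
df <- data.frame(A = c(3, 4, 5), B = c(0, 6, 8), C = c(9, NA, 1))

# Assign the new column directly; naming the input columns keeps the
# calculation from picking up the new "means" column on a second run.
df$means <- rowMeans(df[, c("A", "B", "C")], na.rm = TRUE)

# With na.rm = TRUE, a row in which every value is missing has nothing
# left to average, so rowMeans() returns NaN for it.
all_missing <- data.frame(A = NA_real_, B = NA_real_, C = NA_real_)
nan_result <- rowMeans(all_missing, na.rm = TRUE)

print(df)
print(nan_result)
```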

Answer 3:

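Regarding the error in OP's code: mean takes its data as a single vector, and inside rowwise() the expression A:C is not a range of columns but a numeric sequence from the value of A to the value of C, which fails when C is NA. We can use the concatenate function c to combine those elements into a single vector and then take the mean.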

df %>%
    rowwise() %>% 
    mutate(means = mean(c(A, B, C), na.rm = TRUE))
#     A     B     C    means 
#  <dbl> <dbl> <dbl>    <dbl>
#1     3     0     9 4.000000
#2     4     6    NA 5.000000
#3     5     8     1 4.666667
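To make the point about the colon operator concrete, here is a short sketch of what A:B and A:C evaluate to for single rows, followed by a rowwise version that keeps the range notation by using dplyr's c_across() (available in dplyr 1.0.0 and later). The variable names seq_row1, mean_row1, err_msg and res_rowwise are illustrative only.

```r
library(dplyr)

df <- data.frame(A = c(3, 4, 5), B = c(0, 6, 8), C = c(9, NA, 1))

# For row 1, A:B is the sequence 3:0, i.e. 3 2 1 0. Its mean happens to equal
# mean(c(3, 0)) because these are whole numbers one step apart.
seq_row1 <- 3:0
mean_row1 <- mean(seq_row1)

# For row 2, A:C is 4:NA, which cannot be built as a sequence and raises an error.
err_msg <- tryCatch(4:NA, error = function(e) conditionMessage(e))

# c_across() treats A:C as a column range and returns that row's values as a vector.
res_rowwise <- df %>%
  rowwise() %>%
  mutate(means = mean(c_across(A:C), na.rm = TRUE)) %>%
  ungroup()

print(res_rowwise)
```

This is why the A:B version in the question appeared to work: the mean of each sequence coincided with the mean of its two endpoints.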

Also, we can use rowMeans with transform:

transform(df, means = rowMeans(df, na.rm = TRUE))
#  A B  C    means
#1 3 0  9 4.000000
#2 4 6 NA 5.000000
#3 5 8  1 4.666667

Or using data.table:

library(data.table)
setDT(df)[, means := rowMeans(.SD, na.rm = TRUE)]
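If only some columns should enter the mean, a minimal sketch with data.table's .SDcols argument might look like the following. The table dt is a fresh copy of the example data, created here so that the snippet runs on its own.

```r
library(data.table)

dt <- data.table(A = c(3, 4, 5), B = c(0, 6, 8), C = c(9, NA, 1))

# .SDcols limits .SD to the listed columns before rowMeans() is applied,
# and := adds the result to dt by reference.
dt[, means := rowMeans(.SD, na.rm = TRUE), .SDcols = c("A", "C")]

print(dt)
```

Naming the columns in .SDcols also keeps an existing means column out of the calculation if the line is run a second time.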

R: How to calculate mean for each row with missing
