Omit inf from row sum in R

Posted 2020-02-10 03:38

Question:

I am trying to sum the rows of a matrix that contains Inf values. How do I sum each row while omitting the Inf entries?

Answer 1:

Multiply your matrix by the result of is.finite(m) and call rowSums on the product with na.rm=TRUE. This works because the logical mask is coerced to 0/1, Inf*0 is NaN, and na.rm=TRUE drops NaN along with NA.

m <- matrix(c(1:3,Inf,4,Inf,5:6),4,2)
rowSums(m*is.finite(m),na.rm=TRUE)
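
To see why the multiplication trick works, it may help to look at the intermediate values. The sketch below reuses the matrix from this answer; the names `mask` and `masked` are only illustrative and are not part of the original answer.

```r
m <- matrix(c(1:3, Inf, 4, Inf, 5:6), 4, 2)

# Logical mask: TRUE where the entry is finite, FALSE for Inf/-Inf/NA/NaN
mask <- is.finite(m)

# In arithmetic, TRUE/FALSE are coerced to 1/0, so finite entries are kept
# unchanged and every Inf becomes Inf * 0, which is NaN
masked <- m * mask
is.nan(Inf * FALSE)   # TRUE

# na.rm = TRUE removes NaN (and NA) before summing each row
sums <- rowSums(masked, na.rm = TRUE)
sums                  # 5 2 8 6
```

One consequence of relying on na.rm=TRUE is that any NA or NaN already present in the matrix is silently dropped as well, which may or may not be what you want.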


Answer 2:

A[is.infinite(A)]<-NA
rowSums(A,na.rm=TRUE)

Some benchmarking for comparison:

library(microbenchmark)


rowSumsMethod<-function(A){
 A[is.infinite(A)]<-NA
 rowSums(A,na.rm=TRUE)
}
applyMethod<-function(A){
 apply( A , 1 , function(x){ sum(x[!is.infinite(x)])})
}

rowSumsMethod2<-function(m){
  rowSums(m*is.finite(m),na.rm=TRUE) 
}

rowSumsMethod0<-function(A){
 A[is.infinite(A)]<-0
 rowSums(A)
}

A1 <- matrix(sample(c(1:5, Inf), 50, TRUE), ncol=5)
A2 <- matrix(sample(c(1:5, Inf), 5000, TRUE), ncol=5)
microbenchmark(rowSumsMethod(A1),rowSumsMethod(A2),
               rowSumsMethod0(A1),rowSumsMethod0(A2),
               rowSumsMethod2(A1),rowSumsMethod2(A2),
               applyMethod(A1),applyMethod(A2))

Unit: microseconds
               expr      min        lq    median        uq      max neval
  rowSumsMethod(A1)   13.063   14.9285   16.7950   19.3605 1198.450   100
  rowSumsMethod(A2)  212.726  220.8905  226.7220  240.7165  307.427   100
 rowSumsMethod0(A1)   11.663   13.9960   15.3950   18.1940  112.894   100
 rowSumsMethod0(A2)  103.098  109.6290  114.0610  122.9240  159.545   100
 rowSumsMethod2(A1)    8.864   11.6630   12.5960   14.6955   49.450   100
 rowSumsMethod2(A2)   57.380   60.1790   63.4450   67.4100   81.172   100
    applyMethod(A1)   78.839   84.4380   92.1355   99.8330  181.005   100
    applyMethod(A2) 3996.543 4221.8645 4338.0235 4552.3825 6124.735   100

So Joshua's multiplication method (rowSumsMethod2) is the fastest, and the apply method is clearly slower than the rowSums-based methods (relatively speaking, of course).
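
Before comparing speed, it is worth checking that the candidates return the same sums. The sketch below restates the four benchmarked functions in compact form so it can run on its own; the seed, the small matrix `B`, and the names `results` and `agree` are assumptions added for illustration.

```r
# The four functions from the benchmark, restated compactly
rowSumsMethod  <- function(A) { A[is.infinite(A)] <- NA; rowSums(A, na.rm = TRUE) }
rowSumsMethod0 <- function(A) { A[is.infinite(A)] <- 0;  rowSums(A) }
rowSumsMethod2 <- function(m) rowSums(m * is.finite(m), na.rm = TRUE)
applyMethod    <- function(A) apply(A, 1, function(x) sum(x[!is.infinite(x)]))

set.seed(42)  # arbitrary seed, only so the check is reproducible
A <- matrix(sample(c(1:5, Inf), 5000, TRUE), ncol = 5)

# Compare every method against the first one
results <- list(rowSumsMethod(A), rowSumsMethod0(A),
                rowSumsMethod2(A), applyMethod(A))
agree <- all(vapply(results[-1],
                    function(r) isTRUE(all.equal(r, results[[1]])),
                    logical(1)))
agree   # TRUE when the matrix holds only finite values and Inf

# With an NA in the data the methods are no longer interchangeable
B <- matrix(c(1, NA, Inf, 2), 2)   # rows: (1, Inf) and (NA, 2)
rowSumsMethod(B)    # 1 2   (NA dropped by na.rm = TRUE)
rowSumsMethod0(B)   # 1 NA  (NA propagates)
```

On input without missing values all four agree, so the timing comparison is fair. If the matrix may also contain NA, the two na.rm=TRUE variants drop it silently, while rowSumsMethod0 and applyMethod return NA for the affected rows.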



Answer 3:

I'd use apply and is.infinite in order to avoid replacing Inf values by NA as in @Hemmo's answer.

> set.seed(1)
> Mat <- matrix(sample(c(1:5, Inf), 50, TRUE), ncol=5)
> Mat # this is an example
      [,1] [,2] [,3] [,4] [,5]
 [1,]    2    2  Inf    3    5
 [2,]    3    2    2    4    4
 [3,]    4    5    4    3    5
 [4,]  Inf    3    1    2    4
 [5,]    2    5    2    5    4
 [6,]  Inf    3    3    5    5
 [7,]  Inf    5    1    5    1
 [8,]    4  Inf    3    1    3
 [9,]    4    3  Inf    5    5
[10,]    1    5    3    3    5
> apply(Mat, 1, function(x) sum(x[!is.infinite(x)]))
 [1] 12 15 21 10 18 16 12 11 17 17
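
If the concern is only that the original matrix should keep its Inf values, wrapping the replacement in a function may be enough: under R's usual copy-on-modify semantics, an assignment inside a function changes the function's local copy, not the caller's object. A minimal sketch, where the function name `sumFinite` and the small matrix are made up for illustration:

```r
sumFinite <- function(A) {
  A[is.infinite(A)] <- NA       # changes only the local copy of A
  rowSums(A, na.rm = TRUE)
}

Mat <- matrix(c(2, Inf, 3, 4), 2)   # rows: (2, 3) and (Inf, 4)
sums <- sumFinite(Mat)
sums                    # 5 4
any(is.infinite(Mat))   # TRUE: the caller's matrix still contains Inf
```

This keeps the speed of rowSums while leaving `Mat` untouched, at the cost of one temporary copy of the matrix.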


Answer 4:

Try this...

m <- c( 1 ,2 , 3 , Inf , 4 , Inf ,5 )
sum(m[!is.infinite(m)])

Or

m <- matrix( sample( c(1:10 , Inf) , 100 , replace = TRUE ) , nrow = 10 )
sums <- apply( m , 1 , FUN = function(x){ sum(x[!is.infinite(x)])})

> m
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    8    9    7  Inf    9    2    2    6    1   Inf
 [2,]    8    7    4    5    9    5    8    4    7    10
 [3,]    7    9    3    4    7    3    3    6    9     4
 [4,]    7  Inf    2    6    4    8    3    1    9     9
 [5,]    4  Inf    7    5    9    5    3    5    9     9
 [6,]    7    3    7  Inf    7    3    7    3    7     1
 [7,]    5    7    2    1  Inf    1    9    8    1     5
 [8,]    4  Inf   10  Inf    8   10    4    9    7     2
 [9,]   10    7    9    7    2  Inf    4  Inf    4     6
[10,]    9    4    6    3    9    6    6    5    1     8

> sums
 [1] 44 67 55 49 56 45 39 54 49 57


Answer 5:

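This is a "non-apply" and non-destructive approach. Note that match() returns positions rather than values, so the positions are used to index back into the vector of finite values:

fin <- A[is.finite(A)]   # finite values of A, in column-major order
rowSums( matrix(fin[match(A, fin)], nrow(A)), na.rm=TRUE)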
[1] 2 4

Although it is reasonably efficient, it is not as fast as Joshua's multiplication method.



Tags: r rowsum