Sum pairwise rows with R?

Posted 2019-01-18 16:57

My input is

 df1 <- data.frame(Row=c("row1", "row2", "row3", "row4", "row5"),
                   A=c(1,2,3,5.5,5), 
                   B=c(2,2,2,2,0.5),
                   C= c(1.5,0,0,2.1,3))

It looks like this:

#  Row   A   B   C
#  row1  1   2   1.5
#  row2  2   2   0
#  row3  3   2   0
#  row4  5.5 2   2.1
#  row5  5   0.5 3

I want to compute one value for every pair of rows, using the following rule. Take row1 and row2 as an example: I want to multiply the two rows' entries column by column and sum the products into one final answer. For example:

  • row1-row2 answer is (1*2) + (2*2) + (1.5*0) = 6
  • row1-row3 answer is (1*3) + (2*2) + (1.5*0) = 7

I want to do this for every pair of rows and get a result data frame like this:

row1    row2    6
row1    row3    7
row1    row4    12.65
row1    row5    10.5
row2    row3    10
row2    row4    15
row2    row5    11
row3    row4    20.5
row3    row5    16
row4    row5    34.8

How can I do this in R? Thanks a lot for any comments.

Tags: r row sum
2 answers
在下西门庆
#2 · 2019-01-18 17:55
  1. Create all the row-index combinations you need with combn. t transposes the result so that each pair occupies one row.
  2. Use apply to iterate over the index pairs created in step 1. Note that we use negative indexing (-1) to drop the Row column, so only the numeric columns are multiplied and summed.
  3. Bind the indices and the results together with cbind.

ind <- t(combn(nrow(df1),2))
out <- apply(ind, 1, function(x) sum(df1[x[1], -1] * df1[x[2], -1]))
cbind(ind, out)

           out
[1,] 1 2  6.00
[2,] 1 3  7.00
[3,] 1 4 12.65
 .....
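
The index matrix above identifies rows by position rather than by label. A minimal sketch, assuming the df1 from the question, of how the same combn approach could carry the labels from the Row column into a data frame. The names m, value, pairs, first, and second are illustrative and not part of the original answer.

```r
df1 <- data.frame(Row = c("row1", "row2", "row3", "row4", "row5"),
                  A = c(1, 2, 3, 5.5, 5),
                  B = c(2, 2, 2, 2, 0.5),
                  C = c(1.5, 0, 0, 2.1, 3))

# All unordered pairs of row positions, one pair per row of `ind`
ind <- t(combn(nrow(df1), 2))

# Work on a numeric matrix so each row is a plain numeric vector
m <- as.matrix(df1[, -1])

# Sum of element-wise products for each pair of rows
value <- apply(ind, 1, function(x) sum(m[x[1], ] * m[x[2], ]))

# Replace row positions with the labels from the Row column
pairs <- data.frame(first  = df1$Row[ind[, 1]],
                    second = df1$Row[ind[, 2]],
                    value  = value)
pairs
```

Converting to a matrix first is a design choice: it avoids multiplying one-row data frames inside the loop, which works but is slower on larger inputs.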
放我归山
#3 · 2019-01-18 17:56

Yes! This is a matrix multiplication! :-))

First, just to prepare the matrix:

m = as.matrix(df1[,2:4])
row.names(m) = df1$Row

And this is the operation: entry [i, j] of the result is the sum of products of rows i and j. How easy!

m %*% t(m)

That's it!
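
Note that m %*% t(m) returns a symmetric 5 x 5 matrix that also pairs each row with itself on the diagonal, whereas the question asks for a long list of distinct pairs. A sketch, assuming the df1 from the question, of one way to extract just those pairs. The names prod_mat, idx, and result are illustrative.

```r
df1 <- data.frame(Row = c("row1", "row2", "row3", "row4", "row5"),
                  A = c(1, 2, 3, 5.5, 5),
                  B = c(2, 2, 2, 2, 0.5),
                  C = c(1.5, 0, 0, 2.1, 3))

m <- as.matrix(df1[, 2:4])
rownames(m) <- as.character(df1$Row)

# Full matrix of pairwise sums of products
prod_mat <- m %*% t(m)

# Positions strictly above the diagonal: each pair once, no self-pairs
idx <- which(upper.tri(prod_mat), arr.ind = TRUE)

# which() scans column by column, so sort by first row, then second row
idx <- idx[order(idx[, "row"], idx[, "col"]), , drop = FALSE]

result <- data.frame(first  = rownames(prod_mat)[idx[, "row"]],
                     second = colnames(prod_mat)[idx[, "col"]],
                     value  = prod_mat[idx])
result
```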

One tip - you could define the data.frame this way and it will save you the row.names command:

df1 <- data.frame(row.names=c("row1", "row2", "row3", "row4", "row5"),
                  A=c(1,2,3,5.5,5),
                  B=c(2,2,2,2,0.5),
                  C=c(1.5,0,0,2.1,3))
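
With the labels stored as row names, the data frame holds only numeric columns, so it should convert straight to a labelled matrix. A short sketch under that assumption. tcrossprod() is base R's shorthand for x %*% t(x); it is swapped in here for brevity and is not taken from the original answer.

```r
df1 <- data.frame(row.names = c("row1", "row2", "row3", "row4", "row5"),
                  A = c(1, 2, 3, 5.5, 5),
                  B = c(2, 2, 2, 2, 0.5),
                  C = c(1.5, 0, 0, 2.1, 3))

# No separate Row column to drop; the row names carry over to the matrix
m <- as.matrix(df1)

# Equivalent to m %*% t(m)
prod_mat <- tcrossprod(m)

# Look up a single pair by label
prod_mat["row1", "row2"]
```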