Adding selected data frames together, from a list

Posted 2019-06-24 03:05

Question:

I ran into a big problem when trying to scale my small solution up. I want to write a function that automates adding together all the values of specific data frames.

First, I created a list of all the data frames:

> lst
$data001
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

$data002
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

$data003
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80
 Z   20  40  60  80

$data004
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80
 Z   20  40  60  80
 V   20  40  60  80

$data005
 A   B   C   D   E
 Q   10  30  50  70

$data006
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

$data007
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

$data008
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

$data09
 A   B   C   D   E
 X   11  33  55  77
 Y   22  44  66  88

$data010
 A   B   C   D   E
 X   10  30  50  70
 Y   20  40  60  80

Second, I determined which data frames I would like to add together (all the 1s together, all the 2s together, etc.). In this example there are 10 data frames in lst, and their groups, in list order, are:

 [1] 1 1 2 2 2 2 2 2 3 2

Adding all the "ones" manually would look something like this:

> ddply(rbind(lst[[1]],lst[[2]]), "A", numcolwise(sum))

 A   B   C   D    E
 X   20  60  100  140
 Y   40  80  120  160

Adding all the "twos" manually would give something like this:

 A   B    C    D    E
 X   60   180  300  420
 Y   120  240  360  480
 Z   40   80   120  160
 V   20   40   60   80
 Q   10   30   50   70

However, I just cannot figure out how to write a loop that creates a list of, in this example, 3 data frames, each the result of summing the selected data frames.

Thank you in advance!

Answer 1:

We can use data.table, splitting the list indices by the grouping vector v1 (defined under "data" below):

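 library(data.table)
 # v1 holds the group of each list element; split the element indices by group
 lapply(split(seq_along(lst), v1), function(i)
         # stack the group's frames, then sum columns B to E within each value of A
         rbindlist(lst[i], fill=TRUE)[
             , lapply(.SD, sum), A, .SDcols= B:E])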
#$`1`
#   A  B  C   D   E
#1: X 20 60 100 140
#2: Y 40 80 120 160

#$`2`
#   A   B   C   D   E
#1: X  60 180 300 420
#2: Y 120 240 360 480
#3: Z  40  80 120 160
#4: V  20  40  60  80
#5: Q  10  30  50  70

#$`3`
#   A  B  C  D  E
#1: X 11 33 55 77
#2: Y 22 44 66 88
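
If data.table is not available, the same split, stack and sum idea can be sketched in base R. This is only an illustration under stated assumptions: make_df and sum_by_group are hypothetical helper names, and the toy list contains just the first three frames from the question with groups 1, 1, 2.

```r
# Hypothetical helper: build one small frame with the question's column names.
make_df <- function(a, b, c, d, e) {
  data.frame(A = a, B = b, C = c, D = d, E = e, stringsAsFactors = FALSE)
}

# Toy data: the first three frames from the question (an assumption made
# to keep the sketch short), with their groups 1, 1, 2.
lst <- list(
  data001 = make_df(c("X", "Y"), c(10, 20), c(30, 40), c(50, 60), c(70, 80)),
  data002 = make_df(c("X", "Y"), c(10, 20), c(30, 40), c(50, 60), c(70, 80)),
  data003 = make_df(c("X", "Y", "Z"), c(10, 20, 20), c(30, 40, 40),
                    c(50, 60, 60), c(70, 80, 80))
)
v1 <- c(1, 1, 2)

# Hypothetical helper: split the list by group, stack each group's frames
# with rbind, then sum every remaining column within each value of A.
sum_by_group <- function(frames, groups) {
  lapply(split(frames, groups), function(g) {
    stacked <- do.call(rbind, unname(g))
    aggregate(. ~ A, data = stacked, FUN = sum)
  })
}

res <- sum_by_group(lst, v1)
res[["1"]]
```

One difference from the data.table version: aggregate() returns the rows sorted by A, whereas the grouped data.table result keeps the order in which the values of A first appear.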

data

v1 <-  c(1, 1, 2, 2, 2, 2, 2, 2, 3, 2)
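
The answer defines only v1, so to run it you also need lst. A minimal sketch that rebuilds the list as printed in the question might look like this; read_df is a hypothetical helper, and the column types (character A, integer B to E) are an assumption, since the question shows only the printed output.

```r
# Hypothetical helper: parse a whitespace-separated text table into a data frame.
read_df <- function(txt) {
  read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
}

# The X/Y table that most of the frames share.
xy <- "A B C D E\nX 10 30 50 70\nY 20 40 60 80"

# Names are copied from the question, including "data09" as printed there.
lst <- list(
  data001 = read_df(xy),
  data002 = read_df(xy),
  data003 = read_df(paste(xy, "Z 20 40 60 80", sep = "\n")),
  data004 = read_df(paste(xy, "Z 20 40 60 80", "V 20 40 60 80", sep = "\n")),
  data005 = read_df("A B C D E\nQ 10 30 50 70"),
  data006 = read_df(xy),
  data007 = read_df(xy),
  data008 = read_df(xy),
  data09  = read_df("A B C D E\nX 11 33 55 77\nY 22 44 66 88"),
  data010 = read_df(xy)
)

# Grouping vector from the answer: one entry per list element.
v1 <- c(1, 1, 2, 2, 2, 2, 2, 2, 3, 2)
stopifnot(length(lst) == length(v1))
```

With lst and v1 defined this way, the data.table call above can be run as written.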


Tags: r loops plyr