I am trying to get the cumulative sum of a variable (v) for groups ("a" and "b") within a dataframe. How can I get the result at the bottom -- whose rows are even numbered properly -- into column cs of my dataframe?
> library(nlme)
> g <- factor(c("a","b","a","b","a","b","a","b","a","b","a","b"))
> v <- c(1,4,1,4,1,4,2,8,2,8,2,8)
> cs <- rep(0,12)
> d <- data.frame(g,v,cs)
> d
g v cs
1 a 1 0
2 b 4 0
3 a 1 0
4 b 4 0
5 a 1 0
6 b 4 0
7 a 2 0
8 b 8 0
9 a 2 0
10 b 8 0
11 a 2 0
12 b 8 0
> r=gapply(d,FUN="cumsum",form=~g, which="v")
>r
$a
v
1 1
3 2
5 3
7 5
9 7
11 9
$b
v
2 4
4 8
6 12
8 20
10 28
12 36
> str(r)
List of 2
$ a:'data.frame': 6 obs. of 1 variable:
..$ v: num [1:6] 1 2 3 5 7 9
$ b:'data.frame': 6 obs. of 1 variable:
..$ v: num [1:6] 4 8 12 20 28 36
I guess I could figure out some laborious way to get the data from those dataframes into d$cs, but there's got to be some easy tweak I'm missing.
Another solution using the by function, but I had to order the data first
split<-
is a pretty weird beastleading to
I would use
ave
. If you look at the source ofave
, you'll see it essentially wraps Martin Morgan's solution.My tool of choice for these things is the plyr package: