Why `cumsum` doesn't work within groups or facets

Posted 2019-06-24 01:11

Question:

Borrowing the example from "Plotting cumulative counts in ggplot2":

library(ggplot2)
# 200 random group labels (a, b, c) paired with 200 standard normal draws
x <- data.frame(A=replicate(200,sample(c("a","b","c"),1)),X=rnorm(200))
ggplot(x,aes(x=X,color=A)) + stat_bin(aes(y=cumsum(..count..)),geom="step")

As you can see, cumsum works across groups and facets rather than within them. Why does it do that? Clearly ..count.. is computed within groups, so why isn't cumsum when it is applied to ..count..? Does ggplot internally concatenate all the ..count.. values into one vector and then apply cumsum to it?

How can I resolve this correctly without preprocessing the data, e.g. with plyr?

I don't mind if the geom is not step; it can be line or even bar, as long as the graph is a cumulative plot.
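The behaviour described above can be pictured with a small base R sketch. The data frame and its column names below are made up for illustration, and the sketch only assumes that per-group counts end up stacked in one column before `cumsum` is evaluated, which is consistent with the plot in the question but is not taken from ggplot2's source.

```r
# Hypothetical bin counts for two groups, stacked into one data frame
# the way a stat might return them after handling each group separately.
binned <- data.frame(
  group = rep(c("a", "b"), each = 3),
  count = c(1, 2, 3, 4, 5, 6)
)

# cumsum over the stacked column runs straight across the group boundary.
binned$cum_all <- cumsum(binned$count)

# ave() applies cumsum separately within each group.
binned$cum_by_group <- ave(binned$count, binned$group, FUN = cumsum)

print(binned)
```

In `cum_all` the running total for group "b" starts from the final total of group "a", which matches the symptom in the question; `cum_by_group` restarts at each group.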

Answer 1:

Here's how I handle this with one line of code (ddply and mutate):

library(plyr)
library(ggplot2)

df <- data.frame(x=rnorm(1000),kind=sample(c("a","b","c"),1000,replace=TRUE),
         label=sample(1:5,1000,replace=TRUE),attribute=sample(1:2,1000,replace=TRUE))

# Within each kind/label/attribute cell, rank(x)/length(x) is the empirical cumulative proportion
dfx <- ddply(df,.(kind,label,attribute),mutate,cum=rank(x)/length(x))

ggplot(dfx,aes(x=x))+geom_line(aes(y=cum,color=kind))+facet_grid(label~attribute)
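If the goal is a cumulative proportion rather than a cumulative count, a possible alternative that skips the preprocessing step is ggplot2's `stat_ecdf`, which computes the empirical cumulative distribution separately for each group in each panel. This is not part of the original answer; it is a sketch using the same simulated columns, with an arbitrary seed added so the output is reproducible.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only for reproducibility of the sketch

# Same simulated layout as in the answer above.
df <- data.frame(
  x = rnorm(1000),
  kind = sample(c("a", "b", "c"), 1000, replace = TRUE),
  label = sample(1:5, 1000, replace = TRUE),
  attribute = sample(1:2, 1000, replace = TRUE)
)

# stat_ecdf computes the cumulative proportion per colour group and per
# facet, so no ddply step is needed before plotting.
p <- ggplot(df, aes(x = x, colour = kind)) +
  stat_ecdf(geom = "step") +
  facet_grid(label ~ attribute)

print(p)
```

The y axis here runs from 0 to 1 in every panel, like the `rank(x)/length(x)` column in the ddply version.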


Tags: r ggplot2