I want to find the percentage distribution of a numerical value across a given category, but grouped by a second category. For example, suppose I have a data frame with `region`, `line_of_business`, and `sales`, and I want to find the percentage of `sales` by `line_of_business`, grouped by `region`.
I could do this with R's built-in `aggregate` and `merge` functions, but I was curious whether there is a shorter way to do this with `plyr`'s `ddply` function that avoids an explicit call to `merge`.
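For reference, the `aggregate` and `merge` approach mentioned above might look roughly like the sketch below. The data frame `df` and its values are made up for illustration, and the names `by_lob`, `by_region`, and `region_sales` are hypothetical.

```r
# Hypothetical example data (not from the original post)
df <- data.frame(
  region           = c("East", "East", "East", "East", "West", "West"),
  line_of_business = c("Auto", "Auto", "Home", "Life", "Auto", "Home"),
  sales            = c(20, 30, 30, 20, 30, 90)
)

# Total sales per region and line of business
by_lob <- aggregate(sales ~ region + line_of_business, data = df, FUN = sum)

# Total sales per region, renamed so the column does not clash on merge
by_region <- aggregate(sales ~ region, data = df, FUN = sum)
names(by_region)[names(by_region) == "sales"] <- "region_sales"

# Join the regional totals back on and compute the percentage
result <- merge(by_lob, by_region, by = "region")
result$pct <- 100 * result$sales / result$region_sales
```

The explicit `merge` is what brings each region's total back onto the per-line rows so the division can happen row by row.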
Here's a way to do it with plyr:
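The original snippet is not preserved here, so the following is only a minimal sketch of how this could be done, assuming the `plyr` package is installed. The data frame `df` and its values are hypothetical, as are the names `by_lob` and `pct`.

```r
library(plyr)

# Hypothetical example data (not from the original post)
df <- data.frame(
  region           = c("East", "East", "East", "East", "West", "West"),
  line_of_business = c("Auto", "Auto", "Home", "Life", "Auto", "Home"),
  sales            = c(20, 30, 30, 20, 30, 90)
)

# Step 1: total sales for each region / line-of-business pair
by_lob <- ddply(df, .(region, line_of_business), summarise,
                sales = sum(sales))

# Step 2: within each region, divide each line's sales by the regional total.
# `transform` keeps the existing columns and adds `pct`.
result <- ddply(by_lob, .(region), transform,
                pct = 100 * sales / sum(sales))
```

Because `ddply` splits by `region` before applying `transform`, `sum(sales)` is evaluated per region, which is what removes the need for a separate `merge`.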
Here's the final result:
How about creating a crosstab and taking proportions?
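A rough sketch of that idea using base R's `xtabs` and `prop.table` follows. The data frame `df` and its values are hypothetical, and `tab` and `pct` are illustrative names.

```r
# Hypothetical example data (not from the original post)
df <- data.frame(
  region           = c("East", "East", "East", "East", "West", "West"),
  line_of_business = c("Auto", "Auto", "Home", "Life", "Auto", "Home"),
  sales            = c(20, 30, 30, 20, 30, 90)
)

# Crosstab of summed sales: regions in rows, lines of business in columns
tab <- xtabs(sales ~ region + line_of_business, data = df)

# margin = 1 normalises each row, so every region's percentages sum to 100
pct <- 100 * prop.table(tab, margin = 1)

# Optional: convert back to a long data frame (column Freq holds the percentage)
pct_long <- as.data.frame(pct)
```

One thing to keep in mind with this approach is that combinations absent from the data (here, "Life" in "West") appear in the table with a value of 0 rather than being omitted.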