Let's say I have two columns of data. The first contains categories such as "First", "Second", "Third", etc. The second has numbers representing how many times I saw the category in that row.
For example:
Category Frequency
First 10
First 15
First 5
Second 2
Third 14
Third 20
Second 3
I want to sort the data by Category and sum the Frequencies:
Category Frequency
First 30
Second 5
Third 34
How would I do this in R?
Using aggregate:
In the example above, multiple dimensions can be specified in the list. Multiple aggregated metrics of the same data type can be incorporated via cbind:
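The calls this answer refers to are not shown here, so the following is only a minimal sketch of what they could look like. It rebuilds the question's table under the assumed name df; the Group and Weight columns are hypothetical and exist only to illustrate several grouping dimensions and several metrics.

```r
# Rebuild the question's table; the name df is an assumption.
df <- data.frame(
  Category  = c("First", "First", "First", "Second", "Third", "Third", "Second"),
  Frequency = c(10, 15, 5, 2, 14, 20, 3)
)

# Sum Frequency within each Category; `by` takes a named list of grouping vectors.
totals <- aggregate(df$Frequency, by = list(Category = df$Category), FUN = sum)
print(totals)

# Several grouping dimensions go in the same list.
# Group is a hypothetical second grouping column.
df$Group <- c("a", "b", "a", "a", "b", "b", "a")
by_two <- aggregate(df$Frequency,
                    by = list(Category = df$Category, Group = df$Group),
                    FUN = sum)

# Several numeric columns can be summed in one call by binding them with cbind.
# Weight is a hypothetical second metric.
df$Weight <- c(1, 2, 3, 4, 5, 6, 7)
both <- aggregate(cbind(Frequency = df$Frequency, Weight = df$Weight),
                  by = list(Category = df$Category),
                  FUN = sum)
```

With this form the summed column in totals is named x unless the input is named, which is one reason the cbind call above passes named columns.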
(Embedding @thelatemail's comment:) aggregate has a formula interface too.
Or, if you want to aggregate multiple columns, you could use the . notation (works for one column too),
or tapply:
Using this data:
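The data and calls referred to above are not shown here. As a hedged sketch, the question's table can be rebuilt as a data frame (the name x is an assumption) and the formula interface, the . notation, and tapply applied to it roughly like this:

```r
# Rebuild the question's table; the name x is an assumption.
x <- data.frame(
  Category  = c("First", "First", "First", "Second", "Third", "Third", "Second"),
  Frequency = c(10, 15, 5, 2, 14, 20, 3)
)

# Formula interface: sum Frequency for each Category.
by_formula <- aggregate(Frequency ~ Category, data = x, FUN = sum)

# Dot notation: aggregate every column other than Category.
by_dot <- aggregate(. ~ Category, data = x, FUN = sum)

# tapply returns a named one-dimensional array rather than a data frame.
by_tapply <- tapply(x$Frequency, x$Category, FUN = sum)
print(by_tapply)
```

The formula forms keep the original column names in the result, while tapply is convenient when a named vector of totals is enough.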
You could use the function group.sum from package Rfast.
Rfast has many group functions and group.sum is one of them.
Just to add a third option:
EDIT: this is a very old answer. Now I would recommend using group_by and summarise from dplyr, as in @docendo's answer.
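The referenced answer is not reproduced here, so this is only a sketch of how the group_by and summarise recommendation might be applied to the question's table. The data frame name df is an assumption, and the dplyr package must be installed.

```r
library(dplyr)

# Rebuild the question's table; the name df is an assumption.
df <- data.frame(
  Category  = c("First", "First", "First", "Second", "Third", "Third", "Second"),
  Frequency = c(10, 15, 5, 2, 14, 20, 3)
)

# Group the rows by Category, then collapse each group to its total.
totals <- df %>%
  group_by(Category) %>%
  summarise(Frequency = sum(Frequency))
print(totals)
```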
Several years later, just to add another simple base R solution that isn't present here for some reason: xtabs.
Or, if you want a data.frame back:
Using cast instead of recast (note 'Frequency' is now 'value') to get:
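Returning to the xtabs suggestion above, whose code is not shown here: a minimal base R sketch, assuming the question's table is stored in a data frame named x (an assumed name), might look like this. It also shows one way to get a data.frame back.

```r
# Rebuild the question's table; the name x is an assumption.
x <- data.frame(
  Category  = c("First", "First", "First", "Second", "Third", "Third", "Second"),
  Frequency = c(10, 15, 5, 2, 14, 20, 3)
)

# xtabs sums the left-hand side over the levels of the right-hand side.
tab <- xtabs(Frequency ~ Category, data = x)
print(tab)

# Convert the table back to a data frame; responseName names the summed column.
totals <- as.data.frame(tab, responseName = "Frequency")
```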