I've noticed that aggregate()
appears to return its result ordered by the grouping column(s). Is this a guarantee? Can this be relied upon in surrounding logic?
A couple of examples:
set.seed(1); df <- data.frame(group=sample(letters[1:3],10,replace=T),value=1:10);
aggregate(value~group,df,sum);
## group value
## 1 a 16
## 2 b 22
## 3 c 17
And with two groups (notice the second group is ordered first, then the first group to break ties):
set.seed(1); df <- data.frame(group1=sample(letters[1:3],10,replace=T),group2=sample(letters[4:6],10,replace=T),value=1:10);
aggregate(value~group1+group2,df,sum);
## group1 group2 value
## 1 a d 1
## 2 b d 2
## 3 b e 9
## 4 c e 10
## 5 a f 15
## 6 b f 11
## 7 c f 7
Note: I'm asking because I just came up with an answer for Aggregating while merging two dataframes in R which, at least in its current form at the time of writing, depends on aggregate()
returning its result ordered by the grouping column.
Yes, as long as you understand the natural ordering of factors to be by their integer keys. You can see this in the code: