Does aggregate() guarantee that the result is ordered by the grouping columns?
I noticed that aggregate() appears to return its result ordered by the grouping columns. Is this a guarantee? Can surrounding logic rely on it?
A few examples:
set.seed(1)
df <- data.frame(group = sample(letters[1:3], 10, replace = TRUE), value = 1:10)
aggregate(value ~ group, df, sum)
## group value
## 1 a 16
## 2 b 22
## 3 c 17
And with two grouping columns (note that the result is sorted by the second grouping column first, with the first grouping column breaking ties):
set.seed(1)
df <- data.frame(group1 = sample(letters[1:3], 10, replace = TRUE), group2 = sample(letters[4:6], 10, replace = TRUE), value = 1:10)
aggregate(value ~ group1 + group2, df, sum)
## group1 group2 value
## 1 a d 1
## 2 b d 2
## 3 b e 9
## 4 c e 10
## 5 a f 15
## 6 b f 11
## 7 c f 7
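The same ordering can be checked without relying on the random number generator, whose `sample()` output for a given seed differs between R versions. The following is a minimal sketch on a small hand-built data frame (the data and the name `res_by_group1` are made up for illustration). It checks that the rows come back sorted by `group2` first and then by `group1`, and shows how to reorder explicitly if the opposite priority is wanted:

```r
# Hand-built data (illustrative only), so nothing depends on set.seed().
df <- data.frame(
  group1 = c("b", "a", "c", "a", "b", "c"),
  group2 = c("f", "e", "d", "d", "e", "f"),
  value  = 1:6,
  stringsAsFactors = FALSE
)

res <- aggregate(value ~ group1 + group2, df, sum)
print(res)

# The rows are already in (group2, group1) order, so this permutation
# is the identity.
perm <- order(res$group2, res$group1)
stopifnot(identical(perm, seq_len(nrow(res))))

# To sort by group1 first instead, reorder explicitly.
res_by_group1 <- res[order(res$group1, res$group2), ]
print(res_by_group1)
```

Reordering explicitly with `order()` also keeps surrounding code independent of how aggregate() happens to arrange its rows.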
Note: I ask because I just wrote an answer to "Aggregating when merging two dataframes in R" which, at least in its current form at the time of writing, depends on aggregate() returning its result ordered by the grouping column.
1 answer
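Yes, as long as you understand that the natural sort order of a factor follows its integer codes (that is, its level order), not necessarily the alphabetical order of its labels. You can see this in the source of aggregate.data.frame: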
y <- as.data.frame(by, stringsAsFactors = FALSE)
... # y becomes the "integerized" dataframe of index vectors
grp <- rank(do.call(paste, c(lapply(rev(y), ident), list(sep = "."))),
ties.method = "min")
y <- y[match(sort(unique(grp)), grp, 0L), , drop = FALSE]
...
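To make the excerpt above easier to follow, here is a rough sketch of the same idea outside of aggregate(). It is not the library's actual code: `pad_codes` is a hypothetical stand-in for the internal `ident` helper, and the small `by` data frame is made up. Each grouping column is turned into zero-padded integer codes, the columns are reversed so that the last one leads the key, and the distinct keys are sorted:

```r
# Hypothetical stand-in for the internal helper: zero-padded integer codes.
pad_codes <- function(x) {
  codes <- as.integer(as.factor(x))
  formatC(codes, width = nchar(max(codes)), flag = "0")
}

# Made-up grouping columns.
by <- data.frame(
  group1 = c("b", "a", "c", "a"),
  group2 = c("e", "e", "d", "d"),
  stringsAsFactors = FALSE
)

# rev() puts group2 at the front of the key, so it dominates the sort.
key <- do.call(paste, c(unname(lapply(rev(by), pad_codes)), list(sep = ".")))
print(key)

# One row per distinct key, taken in sorted key order.
ordered_groups <- by[match(sort(unique(key)), key), , drop = FALSE]
print(ordered_groups)

# Factor levels, not alphabetical labels, decide the codes.
size <- factor(c("low", "high", "mid"), levels = c("low", "mid", "high"))
print(pad_codes(size))
```

Because the keys are built from integer codes, a grouping column that is a factor with custom levels is ordered by those levels, while a character column is ordered by the levels that `as.factor()` assigns to it.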