Link the aggregation object to the source dataframe
I have an aggregated data.frame that was produced with ddply from plyr. The goal now is to write a function that automatically joins the aggregated values back onto the original data. The problem is that there can be more than one aggregation variable.
Below is an example with only one aggregation variable.
Here's the data.frame (DF):
  M O
1 1 6
2 2 7
3 2 4
4 1 6
Then with ddply I aggregate "O" by "M":
library(plyr)
TEST <- ddply(.data = DF,
              .variables = c("M"),
              .fun = summarise,
              NEW = sum(O))
The result looks like this:
  M NEW
1 1  12
2 2  11
Now I want to write a function that attaches a variable "New" to the original data.frame.
In a loop, it works with:
for(i in 1:nrow(TEST)) {
  DF$New[DF$M == TEST$M[i]] <- TEST$NEW[i]
}
  M O New
1 1 6  12
2 2 7  11
3 2 4  11
4 1 6  12
Now I want to convert this into a function that gives equivalent output even if there is more than one aggregation variable.
You can use ave and within in base R to add more columns as follows. Assuming your data.frame is called "mydf":
within(mydf, {
P <- ave(O, M, FUN = sum)
Q <- ave(O, M, FUN = mean)
})
#   M O   Q  P
# 1 1 6 6.0 12
# 2 2 7 5.5 11
# 3 2 4 5.5 11
# 4 1 6 6.0 12
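Since the question asks for a function that also handles several aggregation variables, the ave call above can be wrapped in a small helper. This is only a sketch: the name add_group_stat, its arguments, and the extra column G in the second example are made up for illustration, and it relies on ave accepting any number of grouping vectors.

```r
# Hypothetical helper (not from the original post): adds one column holding
# a per-group statistic, for any number of grouping columns.
add_group_stat <- function(df, value, groups, fun = sum, new_name = "New") {
  # Unnamed list of grouping vectors, so that no column name can clash
  # with the argument names of ave()
  group_cols <- unname(as.list(df[groups]))
  df[[new_name]] <- do.call(ave, c(list(df[[value]]), group_cols, list(FUN = fun)))
  df
}

# One grouping variable (the data from the question)
mydf <- data.frame(M = c(1, 2, 2, 1), O = c(6, 7, 4, 6))
res1 <- add_group_stat(mydf, value = "O", groups = "M")

# Two grouping variables (G is an invented second key)
mydf2 <- data.frame(M = c(1, 2, 2, 1),
                    G = c("a", "a", "b", "a"),
                    O = c(6, 7, 4, 6))
res2 <- add_group_stat(mydf2, value = "O", groups = c("M", "G"))
```

Because ave returns a vector of the same length as its input, the original row order is kept and no merge step is needed.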
Of course, the data.table package is even nicer:
library(data.table)
DT <- data.table(mydf)
DT[, `:=`(SUM = sum(O), MEAN = mean(O)), by = "M"]
DT
   M O SUM MEAN
1: 1 6  12  6.0
2: 2 7  11  5.5
3: 2 4  11  5.5
4: 1 6  12  6.0
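If the aggregated table already exists (like TEST in the question), another option is to join it back with base R's merge, which accepts any number of key columns. The following is a rough sketch under that assumption: join_aggregate and the temporary column .row are invented names, and aggregate stands in for ddply so the example runs without plyr.

```r
# Hypothetical helper (not from the original post): joins an aggregated
# table back onto the original data by one or more key columns.
join_aggregate <- function(df, agg, keys) {
  df$.row <- seq_len(nrow(df))            # remember the original row order
  out <- merge(df, agg, by = keys, all.x = TRUE, sort = FALSE)
  out <- out[order(out$.row), ]           # merge may reorder rows; restore
  out$.row <- NULL
  rownames(out) <- NULL
  out
}

DF <- data.frame(M = c(1, 2, 2, 1), O = c(6, 7, 4, 6))

# Stand-in for the ddply call: sum of O per M, renamed to NEW
TEST <- aggregate(O ~ M, data = DF, FUN = sum)
names(TEST)[names(TEST) == "O"] <- "NEW"

joined <- join_aggregate(DF, TEST, keys = "M")
```

With several aggregation variables, keys would simply be a longer character vector, for example c("M", "G").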