Link the aggregation object to the source dataframe

I have an aggregated data.frame that was produced with ddply from plyr. The goal now is to write a function that automatically joins the aggregated result back onto the original data. The problem is that there can be more than one aggregation variable.

Below is an example with only one aggregation variable:

Here's the data frame:

  M O
1 1 6 
2 2 7 
3 2 4 
4 1 6 

      

Then with ddply I get the aggregation for "O":

library(plyr)

TEST <- ddply(.data = DF,
              .variables = c("M"),
              .fun = summarise,
              NEW = sum(O))

      

The result looks like this:

  M NEW
1 1  12
2 2  11

      

Now I want to write a function that attaches a variable "New" to the original data.frame.

In a loop, it works with:

for (i in seq_len(nrow(TEST))) {
  # copy the aggregated value to every row of DF in the same group
  DF$New[DF$M == TEST$M[i]] <- TEST$NEW[i]
}

  M O New
1 1 6  12 
2 2 7  11 
3 2 4  11 
4 1 6  12 

      

Now I want to convert this into a function that gives equivalent output even if there is more than one aggregation variable.



2 answers


As I said in my comment, use transform instead of summarise as the .fun, so the aggregated value is added to every row:



ddply(.data = DF,
      .variables = c("M"),
      .fun = transform,
      NEW = sum(O))
#   M O NEW
# 1 1 6  12
# 2 1 6  12
# 3 2 7  11
# 4 2 4  11
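If you want this wrapped in a function that takes several grouping variables, something along these lines might work. It is only a sketch: the helper name add_group_sum, its arguments, and the second grouping column N are made up for illustration, and it uses an anonymous function instead of transform so that column names can be passed as strings.

```r
library(plyr)

# Hypothetical helper (not part of plyr): append the per-group sum of the
# column named in `value` to every row of `data`.
add_group_sum <- function(data, groups, value, new_name = "NEW") {
  ddply(data, groups, function(d) {
    # d is the subset of rows belonging to one group
    d[[new_name]] <- sum(d[[value]])
    d
  })
}

# Example data with an assumed second grouping column N
DF2 <- data.frame(M = c(1, 2, 2, 1),
                  N = c("a", "a", "b", "a"),
                  O = c(6, 7, 4, 6))

res <- add_group_sum(DF2, c("M", "N"), "O")
res
```

Note that ddply returns the rows ordered by group, so the row order can differ from the input.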

      



You can use ave and within in base R and add more columns as follows. Assuming your data.frame is called "mydf":

within(mydf, {
  P <- ave(O, M, FUN = sum)
  Q <- ave(O, M, FUN = mean)
})
#   M O   Q  P
# 1 1 6 6.0 12
# 2 2 7 5.5 11
# 3 2 4 5.5 11
# 4 1 6 6.0 12
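The same ave idea can probably be turned into a function that accepts any number of grouping columns. This is a rough sketch, not a tested general solution: add_group_stat and its arguments are hypothetical names, and the column N is invented for the example. It combines the grouping columns into one factor with interaction before calling ave.

```r
# Hypothetical helper: add a per-group statistic of `value` as column `new_name`.
add_group_stat <- function(data, groups, value, new_name, FUN = sum) {
  # one factor whose levels are the observed combinations of the grouping columns
  g <- interaction(data[groups], drop = TRUE)
  data[[new_name]] <- ave(data[[value]], g, FUN = FUN)
  data
}

mydf <- data.frame(M = c(1, 2, 2, 1), O = c(6, 7, 4, 6))
one_var <- add_group_stat(mydf, "M", "O", "P")
one_var$P
# 12 11 11 12

# Assumed second grouping column N, for illustration only
mydf2 <- cbind(mydf, N = c("a", "a", "b", "a"))
two_vars <- add_group_stat(mydf2, c("M", "N"), "O", "P")
two_vars$P
# 12  7  4 12
```

Unlike the ddply approach, this keeps the original row order.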

      



Of course, the data.table package is even nicer:

library(data.table)
DT <- data.table(mydf)
DT[, `:=`(SUM = sum(O), MEAN = mean(O)), by = "M"]
DT
#    M O SUM MEAN
# 1: 1 6  12  6.0
# 2: 2 7  11  5.5
# 3: 2 4  11  5.5
# 4: 1 6  12  6.0
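For more than one aggregation variable, by also takes a character vector, so a small wrapper could look like the sketch below. The function name add_group_sum_dt, its arguments, and the extra column N are assumptions for illustration; the explicit copy is there so the caller's object is not modified by reference.

```r
library(data.table)

# Hypothetical helper: add the per-group sum of `value` as column `new_name`.
add_group_sum_dt <- function(data, groups, value, new_name = "SUM") {
  DT <- copy(as.data.table(data))  # work on a copy, := modifies in place
  DT[, (new_name) := sum(get(value)), by = groups]
  DT[]
}

# Assumed second grouping column N, for illustration only
mydf2 <- data.frame(M = c(1, 2, 2, 1),
                    N = c("a", "a", "b", "a"),
                    O = c(6, 7, 4, 6))

res_dt <- add_group_sum_dt(mydf2, c("M", "N"), "O")
res_dt
```

Since := adds the column in place, the rows stay in their original order.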

      
