Group-by operations in R

I have a dataset containing millions of rows and I need to apply a "group by" operation using R.

The data is of the form

V1 V2 V3
a  u  1
a  v  2
b  w  3
b  x  4
c  y  5
c  z  6

      

Grouping by V1, I want to sum the values in column 3 and concatenate the values in column 2, for example:

V1 V2 V3
a uv 3
b wx 7
c yz 11

      

I tried doing this operation in Excel, but the number of rows is too large for Excel to handle. I am new to R, so any help would be appreciated.

+3




4 answers


Another option with sqldf

 library(sqldf)
 sqldf("select V1,
        group_concat(V2, '') as V2,
        sum(V3) as V3
        from df
        group by V1")
 #  V1 V2 V3
 #1  a uv  3
 #2  b wx  7
 #3  c yz 11

      



Or using base R

 do.call(rbind,lapply(split(df, df$V1), function(x) 
  with(x, data.frame(V1=V1[1L], V2= paste(V2, collapse=''), V3= sum(V3)))))
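With many groups, building one small data frame per group and binding them together may become slow. A possible base R alternative is sketched below with `tapply`. It has not been benchmarked here, and it assumes `df` looks like the example data in this thread (rebuilt below with plain character columns). The names `v2`, `v3` and `res` are only illustrative.

```r
# Example data, assumed to match the question
df <- data.frame(V1 = c("a", "a", "b", "b", "c", "c"),
                 V2 = c("u", "v", "w", "x", "y", "z"),
                 V3 = 1:6,
                 stringsAsFactors = FALSE)

# One pass per column; both calls use the same grouping vector,
# so the results come back in the same group order
v2 <- tapply(df$V2, df$V1, paste, collapse = "")
v3 <- tapply(df$V3, df$V1, sum)

# Assemble the result; names(v3) holds the group labels
res <- data.frame(V1 = names(v3),
                  V2 = as.vector(v2),
                  V3 = as.vector(v3),
                  stringsAsFactors = FALSE)
res
```

Extra arguments such as `collapse = ""` are passed through `tapply` to `paste`, which is what joins the V2 values within each group.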

      

+4




There are many possible solutions; here are two:

library(data.table)
setDT(df)[, .(V2 = paste(V2, collapse = ""), V3 = sum(V3)), by = V1]
#    V1 V2 V3
# 1:  a uv  3
# 2:  b wx  7
# 3:  c yz 11

      

or

library(dplyr)
df %>%
  group_by(V1) %>%
  summarise(V2 = paste(V2, collapse = ""), V3 = sum(V3))

# Source: local data table [3 x 3]
# 
#   V1 V2 V3
# 1  a uv  3
# 2  b wx  7
# 3  c yz 11

      




Data

df <- structure(list(V1 = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("a", 
"b", "c"), class = "factor"), V2 = structure(1:6, .Label = c("u", 
"v", "w", "x", "y", "z"), class = "factor"), V3 = 1:6), .Names = c("V1", 
"V2", "V3"), class = "data.frame", row.names = c(NA, -6L))

      

+10




Another option using aggregate

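# Concatenate V2 within each V1 group
ag.2 <- aggregate(df$V2, by = list(df$V1), FUN = paste0, collapse = "")
# Sum V3 within each V1 group
ag.3 <- aggregate(df$V3, by = list(df$V1), FUN = sum)

# Combine the two results (both are sorted by group, so the rows line up)
res <- cbind(ag.2, ag.3[, -1])
# Restore the original column names
names(res) <- c("V1", "V2", "V3")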
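A variation on the same idea, as a sketch only: merging the two aggregates on the group key avoids relying on both results coming back in the same row order. It assumes `df` matches the example data in this thread (rebuilt here with plain character columns). The names `cat_v2`, `sum_v3` and `res2` are illustrative.

```r
# Example data, assumed to match the question
df <- data.frame(V1 = c("a", "a", "b", "b", "c", "c"),
                 V2 = c("u", "v", "w", "x", "y", "z"),
                 V3 = 1:6,
                 stringsAsFactors = FALSE)

# Aggregate each column separately using the formula interface,
# which keeps the original column names
cat_v2 <- aggregate(V2 ~ V1, data = df, FUN = paste, collapse = "")
sum_v3 <- aggregate(V3 ~ V1, data = df, FUN = sum)

# Join the two results on the group key
res2 <- merge(cat_v2, sum_v3, by = "V1")
res2
```

The formula interface also saves the renaming step, since the output columns are already called V1, V2 and V3.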

      

+4




Using ddply from plyr:

library(plyr)
ddply(df, .(V1), summarize, V2 = paste(V2, collapse=''), V3 = sum(V3))

#  V1 V2 V3
#1  a uv  3
#2  b wx  7
#3  c yz 11

      

+2

