Transform the data frame so that each unique transaction becomes one row

Question

I have a data frame like this:

  trans_id product_id
1        1        456
2        4        223
3        1        778
4        1        774
5        5        999
6        4        123

I need to reshape it so that each trans_id is listed on a single line, with its product_ids spread across columns, like this:

trans_id      V1       V2     V3
1            456      778   774
4            223      123
5            999

+3

r dataframe reshape

Cybernetic Apr 20 15 at 16:43

3 answers

A base R option would be:

reshape(transform(df, N= ave(trans_id, trans_id, FUN=seq_along)), 
               idvar='trans_id', timevar='N', direction='wide')
#   trans_id product_id.1 product_id.2 product_id.3
#1        1          456          778          774
#2        4          223          123           NA
#5        5          999           NA           NA
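To see what the transform() step contributes, here is a minimal sketch that builds the within-transaction counter as a separate step. It assumes the question's data is stored in a data frame called df (recreated here from the printed values); N is just the helper column name used above.

```r
# Sample data recreated from the question (assumed to be called df)
df <- data.frame(
  trans_id   = c(1, 4, 1, 1, 5, 4),
  product_id = c(456, 223, 778, 774, 999, 123)
)

# Number the rows within each trans_id: 1, 2, 3, ... per transaction
df$N <- ave(df$trans_id, df$trans_id, FUN = seq_along)
df$N
# 1 1 2 3 1 2

# N now acts as the "time" variable, so reshape() knows which
# product_id goes into which wide column
wide <- reshape(df, idvar = "trans_id", timevar = "N", direction = "wide")
wide
```

Without such a counter, reshape() has nothing to tell the repeated rows of one trans_id apart, which is why the helper column is needed first.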

+3

akrun Apr 20 15 at 16:54

Using tidyr:

library(tidyr)
t(df %>% spread(trans_id, product_id))

+1 to @Ananda Mahto's answer for the tidyr and dplyr approach.

+1

dimitris_ps Apr 20 15 at 16:48

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer · 2015-04-20T16:47:33+0000

You need to add a secondary identifier. That is easy with getanID from my "splitstackshape" package. Since "splitstackshape" also loads "data.table", it is then easy to convert the result to wide format with dcast.data.table:

library(splitstackshape)
dcast.data.table(
  getanID(mydf, "trans_id"), 
  trans_id ~ .id, value.var = "product_id")
#    trans_id   1   2   3
# 1:        1 456 778 774
# 2:        4 223 123  NA
# 3:        5 999  NA  NA
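If you would rather not load "splitstackshape", the same secondary identifier can be built with rowid() from "data.table" itself. This is only a sketch: it assumes a reasonably recent "data.table" (rowid() was added in 1.9.8, as far as I recall) and that the question's data is recreated as mydf.

```r
library(data.table)

# Sample data recreated from the question (assumed to be called mydf)
mydf <- data.table(
  trans_id   = c(1, 4, 1, 1, 5, 4),
  product_id = c(456, 223, 778, 774, 999, 123)
)

# rowid(trans_id) counts 1, 2, 3, ... within each trans_id and
# serves as the column key on the right-hand side of the formula
wide <- dcast(mydf, trans_id ~ rowid(trans_id), value.var = "product_id")
wide
```

Here rowid() plays the role that the .id column from getanID plays above.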

An equivalent "dplyr" + "tidyr" approach would be something like this:

library(dplyr)
library(tidyr)

mydf %>%
  group_by(trans_id) %>%
  mutate(id = sequence(n())) %>%
  spread(id, product_id)
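In newer versions of "tidyr" (1.0.0 and later), spread() has been superseded by pivot_wider(). A rough equivalent is sketched below, assuming the question's data is recreated as mydf. The names_prefix = "V" argument is my own addition so that the columns come out as V1, V2, V3, as in the desired output.

```r
library(dplyr)
library(tidyr)

# Sample data recreated from the question (assumed to be called mydf)
mydf <- data.frame(
  trans_id   = c(1, 4, 1, 1, 5, 4),
  product_id = c(456, 223, 778, 774, 999, 123)
)

wide <- mydf %>%
  group_by(trans_id) %>%
  mutate(id = row_number()) %>%   # secondary identifier within each trans_id
  ungroup() %>%
  pivot_wider(names_from = id, values_from = product_id, names_prefix = "V")
wide
```

Transactions with fewer products than the largest one get NA in the remaining columns.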
