Change the shape before and after the data

The following data represent a very small portion of a series of tests before and after treatment. Now my data is as follows:

  Subject Var1 Var2 Var3 Var4
1   A-pre   25   27   23    0
2  A-post   25   26   25  120
3   B-pre   30   28   27  132
4  B-post   30   28   26  140

      

and I need to change it like this:

  Subject Var1.pre Var1.post Var2.pre Var2.post Var3.pre Var3.post Var4.pre Var4.post
1       A       25        25       27        26       23        25        0       120
2       B       30        30       28        28       27        26      132       140

      

I read a lot of questions on SO and package documentation for iterating over data in r, like reshape2, etc., but I couldn't find something like this. Any ideas? Here is the code to replicate the first table:

dat<-structure(list(Subject = structure(c(2L, 1L, 4L, 3L), .Label = c("A-post", 
"A-pre", "B-post", "B-pre"), class = "factor"), Var1 = c(25L, 
25L, 30L, 30L), Var2 = c(27L, 26L, 28L, 28L), Var3 = c(23L, 25L, 
27L, 26L), Var4 = c(0L, 120L, 132L, 140L)), .Names = c("Subject", 
"Var1", "Var2", "Var3", "Var4"), row.names = c(NA, -4L), class = "data.frame")

      

+3


source to share


1 answer


You can use dcast

from devel version data.table

ie. v1.9.5 after splitting the "Subject" column in two, using tstrsplit

c split

as '-'. We use dcast

to reshape "long" to "wide". A function dcast

from data.table

can take several columns value.var

, that is, "Var1" - "Var4".

library(data.table)#v1.9.5+
#convert the data.frame to data.table with `setDT(dat)`
#split the 'Subject' column with tstrsplit and create two columns 
setDT(dat)[, c('Subject', 'New') :=tstrsplit(Subject, '-')]
#change the New column class to 'factor' and specify the levels in order
#so that while using dcast we get the 'pre' column before 'post'
dat[, New:= factor(New, levels=c('pre', 'post'))]
#reshape the dataset
dcast(dat, Subject~New, value.var=grep('^Var', names(dat), value=TRUE),sep=".")
#    Subject Var1.pre Var1.post Var2.pre Var2.post Var3.pre Var3.post Var4.pre
#1:       A       25        25       27        26       23        25        0
#2:       B       30        30       28        28       27        26      132
#   Var4.post
#1:       120
#2:       140

      

NOTE. Devel version installation instructions:here




An option using dplyr/tidyr

would be to split the "Subject" column into two by separate

, convert "wide" format to "long" format using gather

, unite

column "Var" (i.e. from Var1 to Var4) and "New" ("VarNew" ) and spread

"long" format to "wide".

library(dplyr)
library(tidyr)
dat %>% 
   separate(Subject, into=c('Subject', 'New')) %>% #split to two columns
   gather(Var, Val, Var1:Var4)%>% #change from wide to long. Similar to melt
   unite(VarNew, Var, New, sep=".") %>% #unite two columns to form a single
   spread(VarNew, Val)#change from 'long' to 'wide'

      

+3


source







All Articles