R - processing data frames
Suppose I have this dataframe:
df <- data.frame(ID = c("id1", "id1", "id1", "id2", "id2", "id3", "id3", "id3"),
                 Code = c("A", "B", "C", "A", "B", "A", "C", "D"),
                 Count = c(34, 65, 21, 3, 8, 12, 15, 16),
                 Value = c(3, 1, 8, 2, 3, 3, 5, 8))
as follows:
> df
   ID Code Count Value
1 id1    A    34     3
2 id1    B    65     1
3 id1    C    21     8
4 id2    A     3     2
5 id2    B     8     3
6 id3    A    12     3
7 id3    C    15     5
8 id3    D    16     8
I would like to receive this result data frame:
result <- data.frame(Code = c("A", "B", "C", "D"),
                     id1_count = c(34, 65, 21, NA), id1_value = c(3, 1, 8, NA),
                     id2_count = c(3, 8, NA, NA), id2_value = c(2, 3, NA, NA),
                     id3_count = c(12, NA, 15, 16), id3_value = c(3, NA, 5, 8))
as follows:
> result
  Code id1_count id1_value id2_count id2_value id3_count id3_value
1    A        34         3         3         2        12         3
2    B        65         1         8         3        NA        NA
3    C        21         8        NA        NA        15         5
4    D        NA        NA        NA        NA        16         8
Is there a one-liner in base R that can do this? I can get the result I want, but not in an idiomatic R way (I end up using loops, etc.). Any help is appreciated. Thanks.
You can try dcast from the devel version of data.table (v1.9.5), which can accept multiple value.var columns. Installation instructions: here
library(data.table)
dcast(setDT(df), Code~ID, value.var=c('Count', 'Value'))
#   Code Count_id1 Count_id2 Count_id3 Value_id1 Value_id2 Value_id3
#1:    A        34         3        12         3         2         3
#2:    B        65         8        NA         1         3        NA
#3:    C        21        NA        15         8        NA         5
#4:    D        NA        NA        16        NA        NA         8
Or using reshape from base R:
reshape(df, idvar='Code', timevar='ID', direction='wide')
#  Code Count.id1 Value.id1 Count.id2 Value.id2 Count.id3 Value.id3
#1    A        34         3         3         2        12         3
#2    B        65         1         8         3        NA        NA
#3    C        21         8        NA        NA        15         5
#8    D        NA        NA        NA        NA        16         8
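If the exact column names from the question (id1_count, id1_value, ...) matter, the reshape output can be renamed afterwards. This is only a sketch, assuming the measure columns are called Count and Value as in the example data. The regular expression is my own illustration and would need adjusting for other column names.

```r
df <- data.frame(ID = c("id1", "id1", "id1", "id2", "id2", "id3", "id3", "id3"),
                 Code = c("A", "B", "C", "A", "B", "A", "C", "D"),
                 Count = c(34, 65, 21, 3, 8, 12, 15, 16),
                 Value = c(3, 1, 8, 2, 3, 3, 5, 8))

wide <- reshape(df, idvar = "Code", timevar = "ID", direction = "wide")

# Turn e.g. "Count.id1" into "id1_count": swap the two parts and
# lower-case the measure name (\\L needs perl = TRUE).
names(wide) <- sub("^(Count|Value)\\.(.+)$", "\\2_\\L\\1", names(wide), perl = TRUE)

# Reset the row names inherited from df (1, 2, 3, 8) and drop the
# bookkeeping attribute that reshape() attaches.
rownames(wide) <- NULL
attr(wide, "reshapeWide") <- NULL

wide
```

The Code column is left alone because its name does not match the pattern.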
You can also try:
library(tidyr)
library(dplyr)
df %>%
  gather(key, value, -(ID:Code)) %>%
  unite(id_key, ID, key) %>%
  spread(id_key, value)
Which gives:
#  Code id1_Count id1_Value id2_Count id2_Value id3_Count id3_Value
#1    A        34         3         3         2        12         3
#2    B        65         1         8         3        NA        NA
#3    C        21         8        NA        NA        15         5
#4    D        NA        NA        NA        NA        16         8
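In newer tidyr releases (1.0.0 and later, which is an assumption about your installed version), gather and spread are superseded by pivot_longer and pivot_wider, and the same reshaping can be sketched as a single call. The names_glue template below is my own choice, made to put the ID first in each column name:

```r
library(tidyr)

df <- data.frame(ID = c("id1", "id1", "id1", "id2", "id2", "id3", "id3", "id3"),
                 Code = c("A", "B", "C", "A", "B", "A", "C", "D"),
                 Count = c(34, 65, 21, 3, 8, 12, 15, 16),
                 Value = c(3, 1, 8, 2, 3, 3, 5, 8))

# One row per Code; one column per ID/measure pair, named like "id1_Count".
wide <- pivot_wider(df,
                    id_cols = Code,
                    names_from = ID,
                    values_from = c(Count, Value),
                    names_glue = "{ID}_{.value}")

wide
```

The result is a tibble rather than a plain data frame, and the columns may come out grouped by measure rather than interleaved by ID, so reorder them if the layout matters.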