Adding a variable to a list of data.frames using magrittr syntax
Say you have several data frames that already exist in the environment:
library(magrittr)
lapply(
paste0("z", 2011:2015),
function(x) assign(
x,
data.frame(x=rnorm(10),y=rnorm(10)),
pos = 1
)
)
# should create z2011 through z2015 in your R env
I would like to do this: extract the x column from each data frame, concatenate the results into one data.frame, and then add a variable identifying which data frame each row came from, all using magrittr syntax.
I understand that this is trivial using other methods (namely ldply(list), rbind.fill(listing), rbind_all(listing), do.call(rbind, ...)). My question is about understanding approaches that use magrittr syntax.
df <-
paste0("z",2011:2015) %>%
lapply(get) %>%
lapply(function(x) extract2(x,"x")) %>%
# what would you do next? Another approach you think is
# more appropriate for magrittr?
I don't know how to add a new variable. For example, I would like to get the following:
do.call(
rbind,
lapply(
paste0("z",2011:2015),
function(x) {
data.frame(x = get(x)$x, year = x)
}
)
)
I've always thought you arrive at a magrittr-idiomatic approach by taking nested calls and turning them inside out. Doing this to your last snippet gives
paste0("z", 2011:2015) %>%
lapply(function(name) data.frame(x = get(name)$x, year = name)) %>%
do.call(rbind, .)
which looks good to me. I'm not a big fan of breaking every possible statement into x %>% foo1 %>% foo2 %>% ..., and in this situation keeping it compact is additionally justified: otherwise, you would have to repeat the paste0 call to recover the variable names (as suggested in the comment).
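To make the "inside out" idea concrete, here is a small sketch that builds the same result both ways and compares them. The list `frames` and the helper `tag_x` are hypothetical stand-ins (fixed values instead of rnorm, a list instead of global variables) so the comparison is reproducible.

```r
library(magrittr)

# Hypothetical stand-ins for z2011 and z2012, kept in a list
frames <- list(
  z2011 = data.frame(x = c(1, 2), y = c(3, 4)),
  z2012 = data.frame(x = c(5, 6), y = c(7, 8))
)

# Hypothetical helper: x column of one frame, tagged with the frame's name
tag_x <- function(name) data.frame(x = frames[[name]]$x, year = name)

# Nested form: read from the innermost call outwards
nested <- do.call(rbind, lapply(names(frames), tag_x))

# Piped form: the same three calls, read top to bottom
piped <- names(frames) %>%
  lapply(tag_x) %>%
  do.call(rbind, .)

identical(nested, piped)
```

Because `.` appears as a direct argument of do.call, magrittr does not also insert the left-hand side as the first argument, which is why the last step works.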
Data
First, I'll make your example a little shorter for better readability.
# creates data.frames z2011, z2012 and z2013, 2 rows each
lapply(
paste0("z", 2011:2013),
function(x) assign(
x,
data.frame(x=rnorm(2),y=rnorm(2)),
pos = 1
)
)
magrittr + base solution
You shouldn't use lapply(get); use mget instead. And inside lapply you should use extract, not extract2, because you want to keep a data.frame.
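One practical difference, sketched below under the assumption of two small fixed-value frames (hypothetical stand-ins for the question's data): mget returns a list named after the variables, while lapply with get returns an unnamed list, so the names would have to be rebuilt separately.

```r
# Hypothetical stand-ins for z2011 and z2012 (fixed values, not rnorm)
z2011 <- data.frame(x = c(1, 2), y = c(3, 4))
z2012 <- data.frame(x = c(5, 6), y = c(7, 8))
nms <- c("z2011", "z2012")

# Look the names up in the environment where the frames were just created
env <- environment()

by_get  <- lapply(nms, get, envir = env)  # unnamed list
by_mget <- mget(nms, envir = env)         # list named after the variables

names(by_get)   # NULL
names(by_mget)  # "z2011" "z2012"
```

The explicit `envir = env` is only there to make the sketch independent of where it is run; at the top level of a session the defaults behave the same way.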
The idiomatic magrittr way to assign a column is then inset or inset2 (same effect here).
So you get:
mget(paste0("z", 2011:2013)) %>%
lapply(extract,"x") %>%
Map(inset,.,"year",value = names(.)) %>%
do.call(rbind,.)
# x year
# z2011.1 -0.62124058 z2011
# z2011.2 -2.21469989 z2011
# z2012.1 -0.01619026 z2012
# z2012.2 0.94383621 z2012
# z2013.1 0.91897737 z2013
# z2013.2 0.78213630 z2013
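For readers unfamiliar with these aliases, the sketch below shows what each one does on a single data frame: extract is `[`, extract2 is `[[`, and inset is `[<-`. The data frame `df` is a hypothetical example with fixed values.

```r
library(magrittr)

# Hypothetical two-row frame with the same columns as in the question
df <- data.frame(x = c(1, 2), y = c(3, 4))

# extract() is `[`: selecting by name keeps a one-column data.frame
df %>% extract("x") %>% class()    # "data.frame"

# extract2() is `[[`: the column comes back as a bare vector
df %>% extract2("x") %>% class()   # "numeric"

# inset() is `[<-`: returns a copy with the new column added
tagged <- df %>% extract("x") %>% inset("year", value = "z2011")
names(tagged)                      # "x" "year"
```

Note that tidyr also exports a function named extract, so if tidyr is attached after magrittr you may need to write magrittr::extract explicitly.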
purrr solution
magrittr is often used with the tidyverse; using only purrr::map_dfr, you can write:
library(purrr)
mget(paste0("z",2011:2013)) %>%
map_dfr(~.["x"],.id="year")
# year x
# 1 z2011 -0.62124058
# 2 z2011 -2.21469989
# 3 z2012 -0.01619026
# 4 z2012 0.94383621
# 5 z2013 0.91897737
# 6 z2013 0.78213630
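As a side note, map_dfr is marked as superseded in purrr 1.0.0 and later, where the suggested replacement is map followed by list_rbind. The following is a sketch of that variant, assuming purrr >= 1.0.0 is installed; `frames` is a hypothetical named list with fixed values standing in for the result of mget.

```r
library(magrittr)
library(purrr)

# Hypothetical stand-in for mget(paste0("z", 2011:2013))
frames <- list(
  z2011 = data.frame(x = c(1, 2), y = c(3, 4)),
  z2012 = data.frame(x = c(5, 6), y = c(7, 8))
)

combined <- frames %>%
  map(~ .x["x"]) %>%              # keep only the x column of each frame
  list_rbind(names_to = "year")   # stack rows, storing list names in "year"

combined
```

Here names_to plays the role that .id plays in map_dfr.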