Adding a variable to a list of data.frames using magrittr syntax

Say you have several data frames that already exist in the environment:

library(magrittr)
lapply(
  paste0("z", 2011:2015),
  function(x) assign(
    x, 
    data.frame(x=rnorm(10),y=rnorm(10)),
    pos = 1
  )
)
# should create z2011 through z2015 in your R env

      

I would like to do the following using magrittr syntax: extract the x column from each data.frame, concatenate the pieces into one data.frame, and then add an additional variable that identifies where each row came from.

I understand that this is trivial using other methods (namely ldply(list), rbind.fill(listing), rbind_all(listing), do.call(rbind,...)). My question is about understanding approaches that use magrittr syntax.

df <- 
   paste0("z",2011:2015) %>%
   lapply(get) %>%
   lapply(function(x) extract2(x,"x")) %>%
   # what would you do next? Another approach you think is
   # more appropriate for magrittr?

      

I don't know how to add a new variable. For example, I would like to get the following:

do.call(
  rbind, 
  lapply(
    paste0("z",2011:2015), 
    function(x) {
      data.frame(x = get(x)$x, year = x)
    }
  )
)

      



2 answers


I have always thought that you get a magrittr-idiomatic approach by taking nested calls and turning them inside out. Doing this to your last snippet gives:

paste0("z", 2011:2015) %>%
  lapply(function(name) data.frame(x = get(name)$x, year = name)) %>% 
  do.call(rbind, .)

      



which looks good to me. I'm not a big fan of breaking every possible statement into x %>% foo1 %>% foo2 %>% ..., and in this situation keeping a single anonymous function is additionally justified: otherwise, you would have to repeat the paste0 call to recover the variable names (as suggested in the comment).
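
As a quick check of the "turn it inside out" idea, here is a minimal sketch comparing the nested form with the piped form. The helper name label_x and the seeded stand-in data are illustrative assumptions, not part of the original post; get() is pointed at the global environment explicitly so the lookup does not depend on where the code runs.

```r
library(magrittr)

# Hypothetical stand-ins for the z2011..z2015 data frames from the question,
# created in the global environment so they can be looked up by name.
set.seed(1)
for (name in paste0("z", 2011:2015)) {
  assign(name, data.frame(x = rnorm(10), y = rnorm(10)), envir = globalenv())
}

# Illustrative helper: build one labelled piece from an object name.
label_x <- function(name) {
  data.frame(x = get(name, envir = globalenv())$x, year = name)
}

# Nested form: read from the inside out.
nested <- do.call(rbind, lapply(paste0("z", 2011:2015), label_x))

# Piped form: the same three calls, read top to bottom.
# The dot tells magrittr where to place the left-hand side in do.call().
piped <- paste0("z", 2011:2015) %>%
  lapply(label_x) %>%
  do.call(rbind, .)

# Both forms should produce the same data.frame.
identical(nested, piped)
```

The pipe only changes the reading order; the calls and their results are the same.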



data

First, I'll make your example a little shorter for better readability.

# creates data.frames z2011, z2012 and z2013, 2 rows each
lapply(
  paste0("z", 2011:2013),
  function(x) assign(
    x, 
    data.frame(x=rnorm(2),y=rnorm(2)),
    pos = 1
  )
)

      

magrittr + base solution

You shouldn't use lapply(get); use mget instead. And you should use extract, not extract2, inside lapply, since you want to keep a data.frame.
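
To make the two points concrete, here is a small sketch of what mget returns and how extract differs from extract2. The seeded stand-in data is an assumption for illustration, and envir is passed explicitly (the answer relies on the default) so the lookup works regardless of where the code is run.

```r
library(magrittr)

# Hypothetical stand-ins for the data frames from the question.
set.seed(1)
for (name in paste0("z", 2011:2013)) {
  assign(name, data.frame(x = rnorm(2), y = rnorm(2)), envir = globalenv())
}

# mget() fetches several objects at once and returns a *named* list,
# so the object names travel along with the data.
dfs <- mget(paste0("z", 2011:2013), envir = globalenv())
names(dfs)
# "z2011" "z2012" "z2013"

# extract2 is an alias for `[[`: the column comes out as a bare vector.
class(extract2(dfs$z2011, "x"))
# "numeric"

# extract is an alias for `[`: the result is still a one-column data.frame.
class(extract(dfs$z2011, "x"))
# "data.frame"
```

Keeping the names on the list is what later makes it possible to label each piece without calling paste0 a second time.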

Then the idiomatic magrittr way to assign a column is to use inset or inset2 (same effect here).



So you get:

mget(paste0("z",2011:2013)) %>%
  lapply(extract,"x") %>%
  Map(inset,.,"year",value = names(.)) %>%
  do.call(rbind,.)

#                   x  year
# z2011.1 -0.62124058 z2011
# z2011.2 -2.21469989 z2011
# z2012.1 -0.01619026 z2012
# z2012.2  0.94383621 z2012
# z2013.1  0.91897737 z2013
# z2013.2  0.78213630 z2013
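
To see the Map(inset, ...) step in isolation, here is a minimal sketch on a tiny hand-made list. The list dfs and its values are illustrative assumptions rather than the data from the question.

```r
library(magrittr)

# A small illustrative list of data frames, named like the real objects.
dfs <- list(
  z2011 = data.frame(x = c(1, 2)),
  z2012 = data.frame(x = c(3, 4))
)

# inset is an alias for `[<-`, so this behaves like df["year"] <- "z2011"
# but returns a modified copy instead of changing dfs$z2011 in place.
one <- inset(dfs$z2011, "year", value = "z2011")
names(one)
# "x" "year"

# Map() walks the list and its names in parallel: one inset() call per pair.
# The single string "year" is recycled across all calls.
labelled <- Map(inset, dfs, "year", value = names(dfs))

# Stack the labelled pieces into one data.frame.
combined <- do.call(rbind, labelled)
combined$year
# "z2011" "z2011" "z2012" "z2012"
```

Because Map keeps the names of its first list argument, rbind produces row names such as z2011.1, matching the output shown in the answer.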

      

purrr solution

magrittr is often used with the tidyverse. Using only purrr::map_dfr, you can write:

library(purrr)
mget(paste0("z",2011:2013)) %>%
  map_dfr(~.["x"],.id="year")

#    year           x
# 1 z2011 -0.62124058
# 2 z2011 -2.21469989
# 3 z2012 -0.01619026
# 4 z2012  0.94383621
# 5 z2013  0.91897737
# 6 z2013  0.78213630

      
