Flattening lists nested in data.frames

Question

I often come across lists nested in data.frame columns, and I don't see a general method for flattening them when that is possible, i.e. when the nested elements would fit coherently in the data.frame with the same row count as the parent. Consider these examples of such nests:

require(dplyr)
data_frame(a=1:3, b = c('a','b','c'), c = list('cats','dogs','birds'))
#> # A tibble: 3 x 3
#>       a     b         c
#>   <int> <chr>    <list>
#> 1     1     a <chr [1]>
#> 2     2     b <chr [1]>
#> 3     3     c <chr [1]>
data_frame(a=1:3, b = c('a','b','c'), c = list(iris[1:3,]))
#> # A tibble: 3 x 3
#>       a     b                    c
#>   <int> <chr>               <list>
#> 1     1     a <data.frame [3 x 5]>
#> 2     2     b <data.frame [3 x 5]>
#> 3     3     c <data.frame [3 x 5]>
data_frame(a=1:3, b = c('a','b','c'), c = list(iris[1,], iris[2,], iris[3,]))
#> # A tibble: 3 x 3
#>       a     b                    c
#>   <int> <chr>               <list>
#> 1     1     a <data.frame [1 x 5]>
#> 2     2     b <data.frame [1 x 5]>
#> 3     3     c <data.frame [1 x 5]>

Is there an elegant, general way to smooth these out? The closest I've found is jsonlite::flatten, which claims to "flatten nested data frames" but can't seem to handle nested lists like the ones in these examples.

r

geotheory · Jun 28 '17 at 12:40 am

1 answer

akrun · Accepted Answer · 2017-06-28T00:43:12+0000

One option is unnest from tidyr, which expands the elements of a list column into regular rows and columns:

library(tidyr)
data_frame(a=1:3, b = c('a','b','c'), c = list('cats','dogs','birds')) %>%
    unnest
# A tibble: 3 x 3
#     a     b     c
#  <int> <chr> <chr>
#1     1     a  cats 
#2     2     b  dogs
#3     3     c birds


data_frame(a=1:3, b = c('a','b','c'), c = list(iris[1:3,])) %>% 
          unnest
# A tibble: 9 x 7
#     a     b Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#  <int> <chr>        <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
#1     1     a          5.1         3.5          1.4         0.2  setosa
#2     1     a          4.9         3.0          1.4         0.2  setosa
#3     1     a          4.7         3.2          1.3         0.2  setosa
#4     2     b          5.1         3.5          1.4         0.2  setosa
#5     2     b          4.9         3.0          1.4         0.2  setosa
#6     2     b          4.7         3.2          1.3         0.2  setosa
#7     3     c          5.1         3.5          1.4         0.2  setosa
#8     3     c          4.9         3.0          1.4         0.2  setosa
#9     3     c          4.7         3.2          1.3         0.2  setosa

data_frame(a=1:3, b = c('a','b','c'), c = list(iris[1,], iris[2,], iris[3,])) %>% 
       unnest
# A tibble: 3 x 7
#      a     b Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#   <int> <chr>        <dbl>       <dbl>        <dbl>       <dbl>  <fctr>
#1     1     a          5.1         3.5          1.4         0.2  setosa
#2     2     b          4.9         3.0          1.4         0.2  setosa
#3     3     c          4.7         3.2          1.3         0.2  setosa
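
For readers on newer package versions, the same three examples can be sketched with tibble() in place of the deprecated data_frame() and with the list column named explicitly, since tidyr 1.0.0 and later warn when unnest() is called without cols. The helper flatten_one_to_one() is a hypothetical name introduced here, not part of tidyr; it is one possible reading of the question's "same row count as the parent" condition, and it only unnests list columns whose elements each contribute exactly one row.

```r
library(tibble)
library(tidyr)

# The three example tibbles from the question.
df1 <- tibble(a = 1:3, b = c("a", "b", "c"), c = list("cats", "dogs", "birds"))
df2 <- tibble(a = 1:3, b = c("a", "b", "c"), c = list(iris[1:3, ]))
df3 <- tibble(a = 1:3, b = c("a", "b", "c"),
              c = list(iris[1, ], iris[2, ], iris[3, ]))

# Name the list column explicitly; a string avoids confusion with base::c().
flat1 <- unnest(df1, cols = "c")
flat2 <- unnest(df2, cols = "c")
flat3 <- unnest(df3, cols = "c")

# Hypothetical helper (an assumption, not a tidyr function): unnest only the
# list columns in which every element has exactly one row (or length one),
# so the result keeps the parent's row count.
flatten_one_to_one <- function(df) {
  is_flat <- vapply(
    df,
    function(col) is.list(col) && all(vapply(col, NROW, integer(1)) == 1L),
    logical(1)
  )
  if (!any(is_flat)) {
    return(df)
  }
  unnest(df, cols = all_of(names(df)[is_flat]))
}

same1 <- flatten_one_to_one(df1)  # column c is unnested
same2 <- flatten_one_to_one(df2)  # column c is left nested (3 rows per element)
same3 <- flatten_one_to_one(df3)  # column c is unnested
```

With this helper, df2 is returned unchanged because each nested data.frame has three rows; calling unnest() on it directly is still the way to get the 9-row expansion shown above.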
