How to recombine a list of data frames that was split with plyr

I used plyr's strip_splits() function with dlply() to get a list of data frames. Now I want to combine that list back into a single data frame and add back the variables that were used to split it. The documentation quoted below makes me think this should be possible, but I cannot find a function that does it.

This is useful when you want to perform some operation on every column in the data frame, except the variables that you have used to split it. These variables will be automatically added back on to the result when combining all results together.

Example:

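library(plyr)

# Split mtcars by vs and am, dropping those two columns from each piece
dfSplit <- dlply(mtcars, c("vs", "am"), strip_splits)
df <- dfSplit[[1]]  # first piece (vs = 0, am = 0)

# Add a score: the row-wise mean of the column-wise z-scores within a piece
score <- function(df) {
  df$score <- apply(apply(df, 2, scale), 1, mean, na.rm = TRUE)
  return(df)
}
dfSplit <- lapply(dfSplit, score)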

      

How do I combine the data frames in the dfSplit list into a single data frame?

Edit: The combined data frame must have the vs and am columns.

3 answers


Using bind_rows() from dplyr:

library(dplyr)
bind_rows(dfSplit)

      

Or using base R:

do.call(rbind, dfSplit)

      

Which gives (as printed by bind_rows()):

#Source: local data frame [32 x 10]
#
#    mpg cyl  disp  hp drat    wt  qsec gear carb       score
#1  18.7   8 360.0 175 3.15 3.440 17.02    3    2 -0.18850120
#2  14.3   8 360.0 245 3.21 3.570 15.84    3    4  0.05315376
#3  16.4   8 275.8 180 3.07 4.070 17.40    3    3 -0.15909455
#4  17.3   8 275.8 180 3.07 3.730 17.60    3    3 -0.14033030
#5  15.2   8 275.8 180 3.07 3.780 18.00    3    3 -0.16788329
#6  10.4   8 472.0 205 2.93 5.250 17.98    3    4  0.42384103
#7  10.4   8 460.0 215 3.00 5.424 17.82    3    4  0.49006288
#8  14.7   8 440.0 230 3.23 5.345 17.42    3    4  0.79264565
#9  15.5   8 318.0 150 2.76 3.520 16.87    3    2 -0.79767163
#10 15.2   8 304.0 150 3.15 3.435 17.30    3    2 -0.53819495
#..  ... ...   ... ...  ...   ...   ...  ...  ...         ...
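
Note that the output above has no vs or am columns, which the question's edit asks for. One possible workaround is sketched below in base R. It assumes the list names encode the group keys as "vs.am" (the .id values such as 0.0 elsewhere on this page suggest they do), and it builds a stand-in list called pieces with split() so that it runs without plyr. The names keys, pieces, group and parts are illustrative only.

```r
# Stand-in for dfSplit: split mtcars by vs and am, dropping both columns.
# split() names each piece "<vs>.<am>", e.g. "0.1".
keys <- c("vs", "am")
pieces <- split(mtcars[setdiff(names(mtcars), keys)], mtcars[keys])

# Combine the pieces; unname() stops rbind() from prefixing the row names.
combined <- do.call(rbind, unname(pieces))

# Repeat each piece's name once per row, then split it back into two keys.
# This assumes the key values themselves contain no ".".
group <- rep(names(pieces), vapply(pieces, nrow, integer(1)))
parts <- do.call(rbind, strsplit(group, ".", fixed = TRUE))
combined$vs <- as.numeric(parts[, 1])
combined$am <- as.numeric(parts[, 2])

head(combined)
```

Parsing names is fragile if a key value can itself contain the separator, so treat this as a fallback rather than a general solution.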

      


You can also use rbindlist() from the data.table package:

library(data.table)
rbindlist(dfSplit)
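
A plain rbindlist() call also leaves out vs and am. As a rough sketch, rbindlist()'s idcol argument can keep each piece's list name, and tstrsplit() can split that name into two columns. This assumes the list names have the form "vs.am"; the stand-in list pieces and the column name group are illustrative choices, not part of the original answer.

```r
library(data.table)

# Stand-in for dfSplit: split mtcars by vs and am, dropping both columns.
# split() names each piece "<vs>.<am>", e.g. "0.1".
keys <- c("vs", "am")
pieces <- split(mtcars[setdiff(names(mtcars), keys)], mtcars[keys])

# idcol stores each piece's list name in a new column, here called "group".
combined <- rbindlist(pieces, idcol = "group")

# Split "group" into its two parts and convert them back to numbers,
# then drop the helper column.
combined[, c("vs", "am") := tstrsplit(group, ".", fixed = TRUE, type.convert = TRUE)]
combined[, group := NULL]

head(combined)
```

Note that rbindlist() returns a data.table and does not keep the row names (the car names in mtcars).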

      


I have since found plyr's ldply() function, which gives:

  .id  mpg cyl  disp  hp drat    wt  qsec gear carb       score
1 0.0 18.7   8 360.0 175 3.15 3.440 17.02    3    2 -0.18850120
2 0.0 14.3   8 360.0 245 3.21 3.570 15.84    3    4  0.05315376
3 0.0 16.4   8 275.8 180 3.07 4.070 17.40    3    3 -0.15909455
4 0.0 17.3   8 275.8 180 3.07 3.730 17.60    3    3 -0.14033030
5 0.0 15.2   8 275.8 180 3.07 3.780 18.00    3    3 -0.16788329

      

However, the documentation leads me to think that there must be a function that returns a data frame with vs and am columns rather than .id.
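
A likely explanation, offered as a sketch rather than a definitive answer: dlply() attaches the split labels to the list it returns as a "split_labels" attribute, and lapply() does not carry that attribute over, so ldply() has only the list names left to work with. Applying score() through ldply() directly on the dlply() result keeps the labels available. The names pieces and combined are illustrative.

```r
library(plyr)

# Same scoring function as in the question.
score <- function(df) {
  df$score <- apply(apply(df, 2, scale), 1, mean, na.rm = TRUE)
  df
}

# dlply() records the split labels as an attribute of the list it returns.
pieces <- dlply(mtcars, c("vs", "am"), strip_splits)

# Apply score() through ldply() instead of lapply(), so the labels are still
# attached when the pieces are combined into one data frame.
combined <- ldply(pieces, score)

head(combined)
```

If the intermediate list is not needed, ddply(mtcars, c("vs", "am"), function(d) score(strip_splits(d))) should give a similar result in one step, since ddply() adds the splitting variables back when it combines the pieces.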
