Merging more than two data frames while assigning an identifier in R
Take this very simple reproducible example. I want to know which package can be used to automatically assign a factor (preferably the data frame's name) when combining two or more data.frames.
In the example below I assigned the factor manually to show the desired result, but I want to automate this because I have more than 100 tables to merge. Note that the column headers are the same in every data frame; only the name of the data frame itself changes.
A <- 1:5
B <- 5:1
df1 <- data.frame(A,B)
A <- 2:6
B <- 6:2
df2 <- data.frame(A,B)
df1$ID <- rep("df1", 5)
df2$ID <- rep("df2", 5)
big_df <- rbind(df1,df2)
Assuming your data.frame names follow a specific pattern, for example "df" followed by digits, and the data frames are not stored in a list but simply exist in your global environment, you can use the following:
library(data.table)
bigdf <- rbindlist(Filter(is.data.frame, mget(ls(pattern = "^df\\d+"))), idcol = "ID")
Without data.table, you can do it like this:
lst <- Filter(is.data.frame, mget(ls(pattern = "^df\\d+")))
bigdf <- do.call(rbind, Map(function(df, id) transform(df, ID=id), lst, names(lst)))
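The base R approach can be tried on the example data from the question. This is only a sketch under the same assumption as above (the data frames are named "df" followed by digits and live in the current environment). The intermediate name tagged and the final two clean-up steps are illustrative additions, not part of the original answer: rbind on a named list yields row names such as "df1.1", and the question asked for a factor.

```r
# Example data from the question
df1 <- data.frame(A = 1:5, B = 5:1)
df2 <- data.frame(A = 2:6, B = 6:2)

# Collect every data frame whose name is "df" followed by digits
lst <- Filter(is.data.frame, mget(ls(pattern = "^df\\d+$")))

# Tag each data frame with its own name, then stack them
tagged <- Map(function(df, id) transform(df, ID = id), lst, names(lst))
bigdf <- do.call(rbind, tagged)

# rbind on a named list produces row names such as "df1.1"; reset them
rownames(bigdf) <- NULL

# Store the identifier as a factor, as asked in the question
bigdf$ID <- factor(bigdf$ID)
```

Note that ls() returns names in alphabetical order, so with many tables "df10" will come before "df2" in both the row order and the factor levels.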
Consider the following:
library(dplyr)
cof_df <- bind_rows(df1, df2, .id="ID")
cof_df
   ID A B
1   1 1 5
2   1 2 4
3   1 3 3
4   1 4 2
5   1 5 1
6   2 2 6
7   2 3 5
8   2 4 4
9   2 5 3
10  2 6 2
The numeric IDs can then be recoded to the data frame names:
cof_df$ID <- factor(cof_df$ID,
                    levels = c(1,2),
                    labels = paste0("df", unique(cof_df$ID)))
A similar result can be obtained by naming the arguments in bind_rows, as in:
cof_df <- bind_rows(df1=df1, df2=df2, .id="ID")
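With more than 100 tables, typing every name by hand is impractical. One possible way around this, sketched here under the assumption that the data frames are named "df" followed by digits and live in the current environment, is to collect them into a named list and pass that list to bind_rows, which takes the ID values from the list names. The variable name dfs is only illustrative.

```r
library(dplyr)

# Example data from the question
df1 <- data.frame(A = 1:5, B = 5:1)
df2 <- data.frame(A = 2:6, B = 6:2)

# Collect all data frames named df<digits> into a named list
dfs <- Filter(is.data.frame, mget(ls(pattern = "^df\\d+$")))

# bind_rows takes the values of the .id column from the list names
cof_df <- bind_rows(dfs, .id = "ID")

# The ID column is character; convert it if a factor is required
cof_df$ID <- factor(cof_df$ID)
```

Because the names come straight from the list, no separate recoding step is needed.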