Join all columns by reference in data.table
Good day to all,
I would like to join two data.tables together by reference without writing out all the variables I want to add. Here's a simple example to illustrate what I need:
> set.seed(20170711)
> (a <- data.table(v_key=seq(1, 5), key="v_key"))
v_key
1: 1
2: 2
3: 3
4: 4
5: 5
> a_backup <- copy(a)
> (b <- data.table(v_key=seq(1, 5), v1=runif(5), v2=runif(5), v3=runif(5), key="v_key"))
v_key v1 v2 v3
1: 1 0.141804303 0.1311052 0.354798849
2: 2 0.425955903 0.3635612 0.950234261
3: 3 0.001070379 0.4615936 0.359660693
4: 4 0.453054854 0.5768500 0.008470552
5: 5 0.951767837 0.1649903 0.565894298
I want to copy all the columns of b into a by reference, without specifying the column names.
I could do the following, but that would make a copy of the object for no reason, decreasing the performance of my program and increasing the required RAM:
> (a <- a[b])
v_key v1 v2 v3
1: 1 0.141804303 0.1311052 0.354798849
2: 2 0.425955903 0.3635612 0.950234261
3: 3 0.001070379 0.4615936 0.359660693
4: 4 0.453054854 0.5768500 0.008470552
5: 5 0.951767837 0.1649903 0.565894298
Another option (without a useless copy) is to specify the name of each column of b, which gives the following:
> a <- copy(a_backup)
> a[b, `:=`(
+ v1=v1,
+ v2=v2,
+ v3=v3
+ )][]
v_key v1 v2 v3
1: 1 0.141804303 0.1311052 0.354798849
2: 2 0.425955903 0.3635612 0.950234261
3: 3 0.001070379 0.4615936 0.359660693
4: 4 0.453054854 0.5768500 0.008470552
5: 5 0.951767837 0.1649903 0.565894298
In a nutshell, I would like to have the efficiency of my second example (no useless copy) without specifying all the column names in b.
I think I could find a way to do this using a combination of the functions colnames() and get(), but I'm wondering if there is a cleaner way to do it, since syntax is important to me.
Thanks everyone!
Lrd
As you mentioned, a combination of colnames() and mget() can do this.
Consider this:
# retrieve the column names from b - without the key ('v_key')
thecols = setdiff(colnames(b), key(b))
# assign them to a
a[b, (thecols) := mget(thecols)]
It's not that bad, is it?
Also, I don't think any other syntax is currently implemented in data.table. But I would be glad to be wrong :)
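A variant of the same idea, as a minimal sketch: it assumes the `i.` prefix, which data.table uses inside a join to refer to the columns of the table passed as `i` (here `b`), so the lookup stays unambiguous even if `a` already had columns with the same names. The name `cols` and the `address()` check are only there for illustration.

```r
library(data.table)

a <- data.table(v_key = 1:5, key = "v_key")
b <- data.table(v_key = 1:5, v1 = runif(5), v2 = runif(5), v3 = runif(5), key = "v_key")

# remember where a lives in memory, to check afterwards that it was not copied
addr_before <- address(a)

# every column of b except the join key
cols <- setdiff(names(b), key(b))

# update join: add b's columns to a by reference, reading them as i.v1, i.v2, ...
a[b, on = "v_key", (cols) := mget(paste0("i.", cols))]

# compare the address before and after the update
print(identical(address(a), addr_before))
print(a)
```

The only difference from the snippet above is that the columns are fetched with their `i.` prefix rather than by bare name.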
Looking at the question here, I always like Reduce for this kind of situation:
# provide list of DTs to be merged
arbitrary.dts <- list(...)
a <- Reduce(function(x, y) merge(x, y, all=TRUE, by="v_key"),
            arbitrary.dts, accumulate=FALSE)
Just one idea (I always like to start with basic functionality). I'm sure there is something slicker going the data.table route.
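To make the Reduce idea concrete, here is a small sketch with made-up tables (`a`, `b`, `d` and the list `dts` are hypothetical stand-ins for the arbitrary list). Note that merge() returns a new data.table each time, so this approach does not update `a` by reference.

```r
library(data.table)

a <- data.table(v_key = 1:5)
b <- data.table(v_key = 1:5, v1 = runif(5))
d <- data.table(v_key = 1:5, v2 = runif(5))

# list of tables to merge, all sharing the v_key column
dts <- list(a, b, d)

# fold the list pairwise with a full outer join on v_key
merged <- Reduce(function(x, y) merge(x, y, by = "v_key", all = TRUE), dts)

print(merged)
# a itself is left untouched: the result is a separate object
print(names(a))
```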