Odd behavior of shift() function in data.table v1.9.5 (R)
I am using the current development version of data.table (v1.9.5), mainly because it includes a great built-in function, shift().
I noticed that when I group several expressions within a single data.table call, one of which is a call to shift(), I get some odd behavior:
library(data.table)
foo = data.table(x = c(1, 5, 6, 2, 9, 8))
foo[, y := {
  delta = c(NA, diff(x))
  lag = shift(x, n = 1L, fill = NA)
  list(delta/lag)}]
The above attempt at adding y throws the following error:
Error in delta/lag : non-numeric argument to binary operator
So I checked what I get by just creating delta and lag, without trying to combine them at all:
foo[, c('delta', 'lag') :=
list(c(NA, diff(x)),
shift(x, n = 1L, fill = NA))]
foo
x delta lag
1: 1 NA NA, 1, 5, 6, 2, 9
2: 5 4 NA, 1, 5, 6, 2, 9
3: 6 1 NA, 1, 5, 6, 2, 9
4: 2 -4 NA, 1, 5, 6, 2, 9
5: 9 7 NA, 1, 5, 6, 2, 9
6: 8 -1 NA, 1, 5, 6, 2, 9
If I separate the calls, I can get exactly what I want:
foo[, delta := c(NA, diff(x))]
foo[, lag := shift(x, n = 1L, fill = NA)]
foo
x delta lag
1: 1 NA NA
2: 5 4 1
3: 6 1 5
4: 2 -4 6
5: 9 7 2
6: 8 -1 9
Is this a bug or am I missing something?
EDIT: As Pascal points out, the error in my original example arises because shift() returns a list.
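To make the mechanism concrete, here is a minimal base-R sketch. It does not call shift() at all: lag_list is a hypothetical stand-in for the length-1 list that shift() returned at the time, so it only illustrates why the division fails and how extracting the element with [[1L]] would have worked around it.

```r
# Same values as in the example above
delta <- c(NA, diff(c(1, 5, 6, 2, 9, 8)))

# Assumption: stand-in for the length-1 list shift() returned in this build
lag_list <- list(c(NA, 1, 5, 6, 2, 9))

# Dividing a numeric vector by a list is an error; capture the message
msg <- tryCatch(delta / lag_list, error = function(e) conditionMessage(e))
print(msg)

# Pulling out the single list element gives an ordinary numeric vector
ratio <- delta / lag_list[[1L]]
print(ratio)
```

This also explains the printed table above: assigning a length-1 list to a column recycles its single element, so every row shows the whole shifted vector.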
As of a recent commit in v1.9.5, shift() returns a vector when the input is a vector and length(n) == 1. That is, when the result would be a list of length 1, we return a vector for convenience. This allows us to write:
DT[, col := shift(val, type = "lead")] # or "lag"
and
DT[, col := valA + shift(valB, type="lead")] # or "lag"
In both cases a vector is returned, and the RHS of :=, when atomic, is wrapped in list() internally for convenience, which gives the expected behavior.
This closes #1127.
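As a quick check, here is a sketch of the original example under the new behavior. It assumes a data.table build that includes this commit (or any later release); the data and column names are taken from the question.

```r
library(data.table)

foo <- data.table(x = c(1, 5, 6, 2, 9, 8))

# Vector input with length(n) == 1: a plain vector comes back
single_is_list <- is.list(shift(foo$x, n = 1L))

# With length(n) > 1 the result is still a list, one element per shift
multi_is_list <- is.list(shift(foo$x, n = 1:2))

# The grouped expression from the question now works in one call;
# the atomic result is wrapped in list() internally by :=
foo[, y := {
  delta <- c(NA, diff(x))
  lag <- shift(x, n = 1L, fill = NA)
  delta / lag
}]

print(foo)
```

Note that passing several values of n still yields a list, so in that case the elements need to be assigned to separate columns or extracted before doing arithmetic on them.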