Odd behavior of shift() function in data.table v1.9.5 (R)

I am using the current development version of data.table (v1.9.5), mainly because it has a great built-in function, shift().

I noticed that when I group several expressions within a data.table call, one of which is a call to shift(), I get some funny behavior from it:

library(data.table)

foo = data.table(x = c(1, 5, 6, 2, 9, 8))

foo[, y := {
        delta = c(NA, diff(x));
        lag = shift(x, n = 1L, fill = NA);
        list(delta/lag)}]

      

The above attempt at adding y throws the following error:

Error in delta/lag : non-numeric argument to binary operator

      

So I checked what I get by just creating delta and lag, without making them interact at all:

foo[, c('delta', 'lag') := 
      list(c(NA, diff(x)),
           shift(x, n = 1L, fill = NA))]
foo
   x delta               lag
1: 1   NA  NA, 1, 5, 6, 2, 9
2: 5    4  NA, 1, 5, 6, 2, 9
3: 6    1  NA, 1, 5, 6, 2, 9
4: 2   -4  NA, 1, 5, 6, 2, 9
5: 9    7  NA, 1, 5, 6, 2, 9
6: 8   -1  NA, 1, 5, 6, 2, 9

      

If I separate the calls, I can get exactly what I want:

foo[, delta := c(NA, diff(x))]
foo[, lag := shift(x, n = 1L, fill = NA)]

foo
   x delta lag
1: 1   NA   NA
2: 5    4    1
3: 6    1    5
4: 2   -4    6
5: 9    7    2
6: 8   -1    9

      

Is this a bug or am I missing something?

EDIT: As Pascal points out, the error in my original example arises because shift() returns a list.
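On a build where shift() still returns a list for this call, one way to make the original grouped expression work is to pull out the first list element before dividing. This is only a minimal sketch: the is.list() guard is an assumption added here so the snippet behaves the same whether shift() gives back a list or a plain vector.

```r
library(data.table)

foo = data.table(x = c(1, 5, 6, 2, 9, 8))

foo[, y := {
  delta = c(NA, diff(x))
  lag = shift(x, n = 1L, fill = NA)
  # Assumption: on some builds shift() returns a list of length 1 here,
  # so unwrap it before doing arithmetic.
  if (is.list(lag)) lag = lag[[1L]]
  delta / lag
}]

foo
```

With the list unwrapped, delta / lag operates on two numeric vectors, which avoids the "non-numeric argument to binary operator" error.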


1 answer


With a recent commit in v1.9.5, shift() returns a vector for vector input when length(n) == 1. That is, when the result would be a list of length 1, we return a vector for convenience. This allows us to write:

DT[, col := shift(val, type = "lead")] # or "lag"

      

and



DT[, col := valA + shift(valB, type="lead")] # or "lag"

      

In both cases, a vector is returned, and the RHS of :=, when atomic, is wrapped in list() internally for convenience, which gives the expected behavior.

This closes #1127.
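A quick way to see the distinction is to inspect what shift() returns for a single n versus several. The sketch below assumes a data.table version that includes the change described above; the column names lag1 and lag2 are made up for illustration.

```r
library(data.table)

DT = data.table(x = c(1, 5, 6, 2, 9, 8))

# Single shift amount on a vector input: check whether a list comes back
one = shift(DT$x, n = 1L)
is.list(one)

# Several shift amounts: one result per value of n
many = shift(DT$x, n = 1:2)
is.list(many)
length(many)

# The multi-n form pairs up with multiple columns on the LHS of :=
DT[, c("lag1", "lag2") := shift(x, n = 1:2)]
DT
```

When several columns are wanted at once, passing a vector of n values and naming one column per value on the left-hand side keeps everything in a single call.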
