Querying a data.table by key in R

I followed the data.table introduction. There, a key is set on the x column of the data.table and then queried. I tried to set the key on the v column instead, and it doesn't work as expected. Any ideas on what I am doing wrong?

> set.seed(34)
> DT = data.table(x=c("b","b","b","a","a"),v=rnorm(5))
> DT
   x          v
1: b -0.1388900
2: b  1.1998129
3: b -0.7477224
4: a -0.5752482
5: a -0.2635815
> setkey(DT,v)
> DT[1.1998129,]
   x          v
1: b -0.7477224  

EXPECTED:
   x          v
1: b  1.1998129

      



1 answer


When the first argument to [.data.table is a number, it does not do a join; it just looks up the row number. After setkey, your data.table looks like this:

DT
#   x          v
#1: b -0.7477224
#2: a -0.5752482
#3: a -0.2635815
#4: b -0.1388900
#5: b  1.1998129

      

And since as.integer(1.1998129) is 1, you get the first row.
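A minimal sketch of that behaviour, rebuilding the table from the question. The names by_number and by_position are only illustrative, and the comparison is made on the v column rather than against any printed value:

```r
library(data.table)

set.seed(34)
DT <- data.table(x = c("b", "b", "b", "a", "a"), v = rnorm(5))
setkey(DT, v)                 # sorts the rows by v

as.integer(1.1998129)         # 1: the fractional part is dropped

# A bare number in i is used as a row position, not as a key value,
# so both calls below pick the first row of the sorted table.
by_number   <- DT[1.1998129]
by_position <- DT[1L]
by_number$v == by_position$v
```

Because the table is sorted by v after setkey, that first row holds the smallest v, which is why the question got -0.7477224 back.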

Now, if you intended to do a join, you must use the syntax DT[J(...)] or DT[.(...)], and this will work as expected, provided that you use the correct number. (As a convenience, you do not need J when working with character columns, because DT["a"] has no other sensible default meaning.)

DT[J(v[5])]
#   x        v
#1: b 1.199813
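For comparison, here is a small sketch that puts the numeric and the character case side by side. It rebuilds the table from the question; target is just an illustrative variable holding the exact stored value, not something taken from the original post:

```r
library(data.table)

set.seed(34)
DT <- data.table(x = c("b", "b", "b", "a", "a"), v = rnorm(5))

# Numeric key: wrap the value in .() or J() so that i is treated as a join
setkey(DT, v)
target <- DT$v[5]             # exact stored value of the largest v
DT[.(target)]
DT[J(target)]                 # same lookup, spelled with J()

# Character key: a bare string is already treated as a key lookup
setkey(DT, x)
DT["a"]                       # the rows where x == "a"
```

Taking the value from the table itself (DT$v[5]) sidesteps the problem of retyping a rounded number by hand.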

      

Note that DT[J(1.1998129)] won't work, because:

DT$v[5] == 1.1998129
#[1] FALSE

      

You can print more digits, and then this will work:

options(digits = 22)
DT$v[5]
#[1] 1.199812896606383683107

DT$v[5] == 1.199812896606383683107
#[1] TRUE

DT[J(1.199812896606383683107)]
#   x                v
#1: b 1.199812896606383683107

      

But there is an additional subtlety worth noting: R and data.table use different precisions when deciding whether two floating point numbers are equal:

DT$v[5] == 1.19981289660638
#[1] FALSE
DT[J(1.19981289660638)]
#   x                       v
#1: b 1.199812896606379908349

      

In short, be careful when joining on floating point numbers.
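If the goal is only to find the row whose v is close to a number copied from printed output, one option is to skip the keyed join and filter with an explicit tolerance. This is a sketch, not something from the answer above: typed stands in for a hand-typed value such as 1.1998129, and tol is a hypothetical tolerance that you would pick to suit your data. How much slack a keyed join itself allows depends on data.table's numeric rounding setting (see ?setNumericRounding), so an explicit tolerance makes the intent visible.

```r
library(data.table)

set.seed(34)
DT <- data.table(x = c("b", "b", "b", "a", "a"), v = rnorm(5))

# Stand-in for a value retyped from rounded printed output (7 decimals)
typed <- round(max(DT$v), 7)
tol   <- 1e-6                 # hypothetical tolerance, adjust to your data

typed == max(DT$v)            # exact comparison against the stored value

# Vector scan with an explicit tolerance instead of an exact keyed join
hit <- DT[abs(v - typed) < tol]
hit
```

This scans the whole column rather than using the key, which is fine for small tables; for large ones, joining on an integer or character key is usually the more robust choice.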
