data.table: replacing 0 with NA

I want to replace 0 with NA in each column of a data.table.

library(data.table)
dt1 <- data.table(V1=0:2, V2=2:0)
dt1

   V1 V2
1:  0  2
2:  1  1
3:  2  0

dt1==0
       V1    V2
[1,]  TRUE FALSE
[2,] FALSE FALSE
[3,] FALSE  TRUE

      

I tried this

dt1[dt1==0] 
Error in `[.data.table`(dt1, dt1 == 0) : 
  i is invalid type (matrix). Perhaps in future a 2 column matrix could return a list of elements of DT (in the spirit of A[B] in FAQ 2.14). Please let datatable-help know if you'd like this, or add your comments to FR #1611.

      

And also tried this

dt1[dt1==0, .SD :=NA] 

      

Any help would be much appreciated. Thanks.

Edited

Partial sessionInfo()

R version 3.2.1 (2015-06-18)
Platform: i686-pc-linux-gnu (32-bit)
Running under: Ubuntu 14.04.2 LTS

data.table_1.9.4

      



2 answers


You can try set() for multiple columns. It will be faster because it avoids the overhead of [.data.table.

for(j in seq_along(dt1)){
         set(dt1, i=which(dt1[[j]]==0), j=j, value=NA)
}
dt1
#   V1 V2
#1: NA  2
#2:  1  1
#3:  2 NA

      

Or another option would be to loop over the columns with lapply and change the 0 values to NA with replace. Note that this returns a new data.table rather than modifying dt1.

dt1[, lapply(.SD, function(x) replace(x, which(x==0), NA))]
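If the result of the lapply approach should be written back into the same table, one option is to combine it with := so the columns are updated by reference. This is a minimal sketch, not part of the original answer; dt2 and cols are names introduced here for illustration only.

```r
library(data.table)

# Hypothetical copy of the example data, so dt1 above is left untouched
dt2 <- data.table(V1 = 0:2, V2 = 2:0)

# Columns to update; here all of them
cols <- names(dt2)

# Assign the replaced columns back by reference with :=
dt2[, (cols) := lapply(.SD, function(x) replace(x, which(x == 0), NA)),
    .SDcols = cols]

dt2
```

Because replace() keeps the integer type of each column, the assignment does not change column types.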

      

Or we can use some arithmetic operations to convert the value 0 to NA.



 dt1[, lapply(.SD, function(x) (NA^!x) *x)]

      

The method (NA^!x)*x works by converting !x, that is, the logical TRUE/FALSE vector for each column (where TRUE corresponds to the value 0), to NA and 1 via NA^!x: NA^TRUE is NA, while NA^FALSE is 1. We then multiply by x to replace each 1 with its corresponding x value, while NA stays NA.
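The intermediate steps of the arithmetic trick can be inspected on a plain vector. This is a small illustrative sketch in base R; x, mask and result are names chosen here, not from the original answer.

```r
x <- c(0, 1, 2)

!x                 # TRUE FALSE FALSE: TRUE marks the zeros
mask <- NA^!x      # NA 1 1: NA^TRUE is NA, NA^FALSE is 1
result <- mask * x # NA 1 2: zeros become NA, other values are kept
result
```

One side effect worth knowing: the result is always of type double, even if x is an integer column, because ^ returns a double.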

Or, with syntax similar to base R:

  is.na(dt1) <- dt1==0

      

But this method may not be as efficient for a large data.table, because dt1==0 creates a logical matrix and, as @Roland mentions in the comments, the dataset will be copied. For large datasets I would use either lapply or the more efficient set.



dt1[dt1==0] <- NA worked for me.

dt1[dt1==0] <- NA
dt1
##   V1 V2
##1: NA  2
##2:  1  1
##3:  2 NA

      



As Roland pointed out, this makes a copy of the data.table object and will be slower.
