Replace <NA> with NA

I have a data frame containing <NA> records. It looks like these values are not considered NA, since is.na returns FALSE. I would like to convert these values to NA but could not find a way.



3 answers


The two classes in which this is likely to be a problem are character and factor. The following should loop over the data frame and convert the "NA" strings to true <NA> values, but only for columns of these two classes:

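# Turn the string "NA" into a true missing value (character and factor columns only).
make.true.NA <- function(x) {
  if (is.character(x) || is.factor(x)) is.na(x) <- x == "NA"
  x
}
# Assigning into df[] keeps the data frame structure.
df[] <- lapply(df, make.true.NA)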

      



(Untested in the absence of sample data.) Assigning with the form df_name[] <- ... tries to preserve the structure of the original data frame, which would otherwise lose its class attribute. I see that ujjwal thinks your NA spelling is wrapped in "<>", so you can try this more general function:

make.true.NA <- function(x) {
  if (is.character(x) || is.factor(x)) is.na(x) <- x %in% c("NA", "<NA>")
  x
}
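Since the answer above was posted without sample data, here is a minimal sketch of how it might be exercised. The data frame df and its column names (id, name, group) are made up for illustration and are not from the question.

```r
# Hypothetical sample data: a numeric, a character and a factor column.
df <- data.frame(
  id    = 1:4,
  name  = c("a", "NA", "<NA>", "d"),
  group = factor(c("x", "<NA>", "y", "NA")),
  stringsAsFactors = FALSE
)

# The generalized helper from the answer above.
make.true.NA <- function(x) {
  if (is.character(x) || is.factor(x)) is.na(x) <- x %in% c("NA", "<NA>")
  x
}

# Assigning into df[] keeps df a data frame instead of a plain list.
df[] <- lapply(df, make.true.NA)

# Count the true NAs per column.
na_counts <- colSums(is.na(df))
print(na_counts)

# The factor still carries "NA" and "<NA>" as unused levels;
# droplevels() removes them if that matters downstream.
df <- droplevels(df)
print(levels(df$group))
```

The numeric column is left alone because the helper only touches character and factor vectors.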

      



Use dfr[dfr=="<NA>"] = NA, where dfr is your data frame.

For example:



> dfr<-data.frame(A=c(1,2,"<NA>",3),B=c("a","b","c","d"))

> dfr
     A  B
1    1  a
2    2  b
3 <NA>  c
4    3  d

> is.na(dfr)
         A     B
[1,] FALSE FALSE
[2,] FALSE FALSE
[3,] FALSE FALSE
[4,] FALSE FALSE

> dfr[dfr=="<NA>"] = NA                 # key step

> is.na(dfr)
         A     B
[1,] FALSE FALSE
[2,] FALSE FALSE
[3,]  TRUE FALSE
[4,] FALSE FALSE
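As a script rather than a console session, the same idea might look like the sketch below. It uses a variant of the toy data above with a second "<NA>" placed in column B (an assumption added for illustration) to show that the comparison covers every column. The final as.numeric() step is an optional follow-up, not part of the original answer.

```r
# stringsAsFactors = FALSE is set explicitly so the result does not
# depend on the R version's default.
dfr <- data.frame(A = c(1, 2, "<NA>", 3),
                  B = c("a", "<NA>", "c", "d"),
                  stringsAsFactors = FALSE)

# Replace the literal string "<NA>" with a true NA in every column.
dfr[dfr == "<NA>"] <- NA

# Count the true NAs per column.
print(colSums(is.na(dfr)))

# A was built from mixed input, so it is still a character column here.
print(class(dfr$A))

# Optional follow-up: convert A to numeric now that the placeholder is gone.
dfr$A <- as.numeric(dfr$A)
```

Doing the replacement before the numeric conversion avoids the coercion warning that as.numeric() would otherwise raise on the "<NA>" string.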

      



You can do it with naniar, using replace_with_na and related functions.


dfr <- data.frame(A = c(1, 2, "<NA>", 3), B = c("a", "b", "c", "d"))

library(naniar)
# dev version - devtools::install_github('njtierney/naniar')
is.na(dfr)
#>          A     B
#> [1,] FALSE FALSE
#> [2,] FALSE FALSE
#> [3,] FALSE FALSE
#> [4,] FALSE FALSE

dfr %>% replace_with_na(replace = list(A = "<NA>")) %>% is.na()
#>          A     B
#> [1,] FALSE FALSE
#> [2,] FALSE FALSE
#> [3,]  TRUE FALSE
#> [4,] FALSE FALSE

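# You can also apply the replacement across all variables.
# Note: the integer columns below look like factor codes, which suggests this
# output came from an R version where data.frame() made strings into factors.
dfr %>% replace_with_na_all(~.x == "<NA>")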
#> # A tibble: 4 x 2
#>       A     B
#>   <int> <int>
#> 1     2     1
#> 2     3     2
#> 3    NA     3
#> 4     4     4

      

See the naniar documentation for more on using replace_with_na.
