R: the text "NA" is treated as a missing value (NA)

I have a data frame in R that includes country codes. The ISO code for Namibia is "NA". R treats this text "NA" as a missing value (NA).

For example, the code below returns the row for Namibia.

test <- subset(country.info,is.na(country.info$iso.code))

      

I originally thought it might be a factor issue, so I made sure the iso.code column is character. But that did not help.

How can this be solved?



2 answers


This is probably related to how you read in the data. Just because the column is character does not mean that your "NA" is not NA, for example:

z <- c("NA",NA,"US")
class(z)
#[1] "character"
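To see which elements R actually treats as missing, is.na() is a quick check. A minimal sketch, redefining the same vector z so that it runs on its own:

```r
# "NA" in quotes is ordinary text; the bare NA is a real missing value
z <- c("NA", NA, "US")

# is.na() flags only the real missing value
is.na(z)
#[1] FALSE  TRUE FALSE

# position of the literal text "NA" (which() ignores the NA comparison result)
which(z == "NA")
#[1] 1
```

If is.na() returns TRUE for the Namibia row in your data, the text was already converted to a missing value when the file was read.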

      

You can confirm this by showing us the dput() output of (part of) your data.

When you read in the data, try changing na.strings = "NA" (as in read.csv) to something else and see if it works.

For example, with na.strings = "":

read.table(text="code country
NA  Namibia
DE  Germany
FR  France", stringsAsFactors=FALSE, header=TRUE, na.strings="")
#   code country
# 1   NA Namibia
# 2   DE Germany
# 3   FR  France

      

Make sure that using "" does not change anything else. Alternatively, you can use a string that definitely does not appear in your file, such as "z_z_z". You can replace text=... with your file name.
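As a rough sketch of that check (the CSV text, the column names and the row with an empty code are made up for illustration), you can verify that the text "NA" survives while genuinely empty fields still become missing:

```r
# Hypothetical sample data: the last row has an empty code field
csv_text <- "code,country
NA,Namibia
DE,Germany
,Unknown"

# With na.strings = "", only empty fields are read as missing
country.info <- read.csv(text = csv_text, stringsAsFactors = FALSE,
                         na.strings = "")

# "NA" is kept as text; only the empty field is flagged as missing
is.na(country.info$code)
#[1] FALSE FALSE  TRUE

# Namibia can now be selected by its code
namibia <- subset(country.info, code == "NA")
namibia$country
#[1] "Namibia"
```

With a real file, text = csv_text would be replaced by the file name.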



If Thomas's solution doesn't work, you can always use the countrycode package to change the country code to something less problematic. For example, in your case, convert from two-letter ISO codes (iso2c) to three-letter ISO codes (iso3c).

library(countrycode)
country.info$iso.code <- countrycode(country.info$iso.code, "iso2c", "iso3c",
                                     warn = TRUE)

      



(If the iso2c codes cause problems, convert from the country names instead, using "country.name" as the origin, and hope that the Republic of the Congo and the Democratic Republic of the Congo do not get mixed up.)
