R: the text "NA" is treated as NA (missing value)
I have a data frame in R that includes country codes. The ISO code for Namibia is "NA". R treats this text "NA" as a missing value (NA).
For example, the code below returns the row for Namibia.
test <- subset(country.info,is.na(country.info$iso.code))
I originally thought it might be a factor issue, so I made sure the iso.code column is of type character. But that did not help.
How can this be solved?
This is probably related to how you read in the data. Just because the column is character does not mean that your "NA" is the string "NA" rather than a real NA. A character vector can hold both, for example:
z <- c("NA",NA,"US")
class(z)
#[1] "character"
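To see the difference between the two directly, is.na() can be applied to the same vector. This is only a minimal sketch; the names missing_flags and is_namibia are made up for illustration:

```r
# A character vector holding the string "NA", a real missing value, and "US"
z <- c("NA", NA, "US")

# Only the second element is actually missing
missing_flags <- is.na(z)
print(missing_flags)
#[1] FALSE  TRUE FALSE

# The string "NA" is still found by an ordinary comparison
is_namibia <- !is.na(z) & z == "NA"
print(is_namibia)
#[1]  TRUE FALSE FALSE
```

If is.na() returns TRUE for the Namibia row in your data, the code was already converted to a missing value when the file was read.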
You can confirm this by posting the dput() output of (part of) your data.
When you read the data in, try changing the na.strings argument (for example in read.csv) from its default of "NA" to something else and see if that works.
For example, with na.strings = "":
read.table(text="code country
NA Namibia
DE Germany
FR France", stringsAsFactors=FALSE, header=TRUE, na.strings="")
#   code country
# 1   NA Namibia
# 2   DE Germany
# 3   FR  France
Make sure that using "" does not change anything else, since empty fields will now be read as NA. Alternatively, you can use a string that definitely doesn't appear in your file, such as "z_z_z". To read from a file, replace the text=... argument with your filename.
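As a rough check of the above, the same data can be read twice, once with the default settings and once with na.strings = "", and the results compared. This is a sketch under assumptions: csv_text is hypothetical sample data standing in for your file, and the last step assumes the data frame has a country column that identifies Namibia.

```r
# Hypothetical sample data standing in for the real file
csv_text <- "code,country
NA,Namibia
DE,Germany
FR,France"

# Default na.strings = "NA": Namibia's code is read as a missing value
default_read <- read.csv(text = csv_text, stringsAsFactors = FALSE)
is.na(default_read$code[1])

# With na.strings = "", the text "NA" is kept as an ordinary string
fixed_read <- read.csv(text = csv_text, stringsAsFactors = FALSE,
                       na.strings = "")
is.na(fixed_read$code[1])

# If re-reading the file is not an option, the code can be restored
# afterwards (assumes a country column that identifies the row)
repaired <- default_read
repaired$code[is.na(repaired$code) & repaired$country == "Namibia"] <- "NA"
```

The first is.na() call should give TRUE and the second FALSE. The repair step only makes sense if Namibia is the only row whose code is genuinely supposed to be "NA".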
If Thomas's solution doesn't work, you can always use the countrycode package to convert the country codes to something less problematic. In your case, for example, from ISO 2-character codes to ISO 3-character codes.
library(countrycode)
country.info$iso.code <- countrycode(country.info$iso.code, "iso2c", "iso3c", warn = TRUE)
(If iso2c causes problems, you could convert from the country names instead, using "country.name" as the origin, and hope that the Republic of the Congo and the Democratic Republic of the Congo don't get mixed up.)