Replace specific rows in a data frame
In the following data frame
col1 <- c("g1","g2","g3",NA,"g4",NA)
col2 <- c(NA,"a1","a2",NA,"a3","a4")
df1 <- data.frame(col1, col2)
I would like to replace the NA values in col1 with the corresponding values from col2. Is it correct to start by finding the rows that contain NA with
row <- which(is.na(col1))
and then extract the corresponding values from col2 with
extract <- df1$col2[row]
After that, I don't know how to replace the NAs in col1 with the extracted values. Please help!
You don't need to which
. It is.na(df1$col1)
would just be enough to give an index logical
. The only problem with dataset is that both columns were a factor
class based on how you created data.frame
. It would be better to use it stringsAsFactors=FALSE
in an argument data.frame(..)
as a column character
. Otherwise, if levels
in is col2
not present in the col1
replacement, this will give the messagewarning
# Warning message:
#In `[<-.factor`(`*tmp*`, is.na(df1$col1), value = c(1L, 2L, 3L, :
#invalid factor level, NA generated
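As a minimal sketch of how that warning comes about, the snippet below builds the same data with factor columns and captures the warning instead of printing it. The explicit stringsAsFactors = TRUE and the names df_f and msg are assumptions added here for illustration so that the behaviour can be reproduced on current R versions too.

```r
col1 <- c("g1", "g2", "g3", NA, "g4", NA)
col2 <- c(NA, "a1", "a2", NA, "a3", "a4")

# Assumption: force factor columns explicitly, as older R did by default
df_f <- data.frame(col1, col2, stringsAsFactors = TRUE)

indx <- is.na(df_f$col1)
msg <- NULL

# Run the replacement, record the warning text, and let the assignment finish
withCallingHandlers(
  {
    df_f$col1[indx] <- df_f$col2[indx]
  },
  warning = function(w) {
    msg <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

msg            # text of the captured warning
df_f$col1[6]   # still NA, because "a4" is not a level of col1
```

The value "a4" is dropped because it is not one of the existing levels of col1, which is why the replacement leaves row 6 as NA.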
Here, I convert the columns to class character before proceeding with the replacement, to avoid the above warning.
df1[] <- lapply(df1, as.character)
indx <- is.na(df1$col1)
df1$col1[indx] <- df1$col2[indx]
df1
#  col1 col2
#1   g1 <NA>
#2   g2   a1
#3   g3   a2
#4 <NA> <NA>
#5   g4   a3
#6   a4   a4
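The stringsAsFactors=FALSE alternative mentioned earlier could look roughly like this. It is only a sketch: the name df2 is made up here so that it does not overwrite df1, and the columns are created as character from the start, so no conversion step is needed.

```r
col1 <- c("g1", "g2", "g3", NA, "g4", NA)
col2 <- c(NA, "a1", "a2", NA, "a3", "a4")

# Keep the columns as character instead of factor
df2 <- data.frame(col1, col2, stringsAsFactors = FALSE)

# Logical index of the rows where col1 is NA
indx <- is.na(df2$col1)

# Fill those rows from col2; row 4 stays NA because col2 is NA there too
df2$col1[indx] <- df2$col2[indx]
df2
```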