Replace specific rows in a data frame

In the following data frame

col1 <- c("g1", "g2", "g3", NA, "g4", NA)
col2 <- c(NA, "a1", "a2", NA, "a3", "a4")
df1 <- data.frame(col1, col2)

      

I would like to replace the NA entries in col1 with the corresponding entries from col2. Is it correct to start by finding the rows that contain NA with

row <- which(is.na(col1))

      

and then extract the characters from col2 with

extract <- df1$col2[row]

      

After that, I don't know how to replace the NA values in col1 with the extracted characters. Please help!



1 answer


You don't need which. is.na(df1$col1) alone is enough to give a logical index. The only problem with the dataset is that both columns are of class factor because of how the data.frame was created (in R versions before 4.0.0, data.frame() converts strings to factors by default). It would be better to pass stringsAsFactors=FALSE to data.frame(..) so that the columns are of class character. Otherwise, if the levels in col2 are not present in col1, the replacement gives this warning message:

# Warning message:
#In `[<-.factor`(`*tmp*`, is.na(df1$col1), value = c(1L, 2L, 3L,  :
#invalid factor level, NA generated
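
To see where that warning comes from, here is a minimal sketch that forces factor columns explicitly, so it behaves the same on any R version. The names df_fac, res and msg are made up for illustration and are not part of the original question.

```r
col1 <- c("g1", "g2", "g3", NA, "g4", NA)
col2 <- c(NA, "a1", "a2", NA, "a3", "a4")

# Assumption: force factors explicitly to mimic the pre-4.0.0 default
df_fac <- data.frame(col1, col2, stringsAsFactors = TRUE)

sapply(df_fac, class)   # both columns are "factor"
levels(df_fac$col1)     # "g1" "g2" "g3" "g4"; "a4" is not a level of col1

indx <- is.na(df_fac$col1)
res <- df_fac$col1

# Catch the warning raised when a value outside the levels of col1 is assigned
msg <- tryCatch({
  res[indx] <- df_fac$col2[indx]
  "no warning"
}, warning = function(w) conditionMessage(w))
msg
```

The assignment fails for "a4" because a factor only accepts values that are already among its levels, which is why converting to character first avoids the problem.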

      



Here, I convert the columns to character class before doing the replacement, to avoid the above warning.

df1[] <- lapply(df1, as.character)
indx <- is.na(df1$col1)
df1$col1[indx] <- df1$col2[indx]
df1
#  col1 col2
#1   g1 <NA>
#2   g2   a1
#3   g3   a2
#4 <NA> <NA>
#5   g4   a3
#6   a4   a4
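
As a sketch of the stringsAsFactors=FALSE suggestion above, the data frame can also be built with character columns from the start, so no conversion step is needed. The names df2 and row_idx are made up for illustration.

```r
col1 <- c("g1", "g2", "g3", NA, "g4", NA)
col2 <- c(NA, "a1", "a2", NA, "a3", "a4")

# Keep the columns as character when the data frame is created
df2 <- data.frame(col1, col2, stringsAsFactors = FALSE)

indx <- is.na(df2$col1)            # logical index of the NA rows in col1
df2$col1[indx] <- df2$col2[indx]   # fill them from the matching rows of col2
df2

# The which() index from the question picks out the same values
row_idx <- which(is.na(col1))
identical(df2$col2[row_idx], df2$col2[indx])
```

Row 4 stays NA because col2 is also NA there, so there is nothing to fill it with.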

      
