Combining data from two columns into one column in R

I have two columns in a dataset after merging two separate datasets. I would like to combine these columns into one column, BNR.x.

For the cases listed below, my preferred results are:
1. Nothing. BNR.x has data, which is fine.
2. Nothing. The data in both columns is the same, which is fine.
3. Data from BNR.y is copied to BNR.x.
4. Nothing. Same as 2.
5. The data in the two columns differs. Ideally I would get an extra column containing a flag (such as 1 or FALSE) as a warning on that row.
6. No data available. Preferably I would get a warning here as well, telling me that I have no data for this item.

+----+-------+-------+
| ID | BNR.x | BNR.y |
+----+-------+-------+
|  1 | 123   | NA    |
|  2 | 234   | 234   |
|  3 | NA    | 345   |
|  4 | 456   | 456   |
|  5 | 678   | 677   |
|  6 | NA    | NA    |
+----+-------+-------+

      

Is there a way or package that will do this for me?



3 answers


If your data is in a data frame named d, you can do:



## Copy BNR.y if BNR.x is missing
d$BNR.x[is.na(d$BNR.x)] <- d$BNR.y[is.na(d$BNR.x)]
## List the indices of BNR.x that are still missing
which(is.na(d$BNR.x))
## List the indices where BNR.x is different from BNR.y
which(d$BNR.x != d$BNR.y)
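
To try these lines on the table from the question, here is a minimal, self-contained sketch. It assumes the merged data can be rebuilt by hand as a data frame d with numeric columns; the helper names missing_x, no_data and mismatch are only illustrative and are not part of the original answer.

```r
# Rebuild the example table from the question (columns assumed numeric)
d <- data.frame(
  ID    = 1:6,
  BNR.x = c(123, 234, NA, 456, 678, NA),
  BNR.y = c(NA, 234, 345, 456, 677, NA)
)

# Copy BNR.y into BNR.x where BNR.x is missing
missing_x <- is.na(d$BNR.x)
d$BNR.x[missing_x] <- d$BNR.y[missing_x]

# Rows that still have no data in BNR.x
no_data <- which(is.na(d$BNR.x))

# Rows where both columns have data but disagree;
# which() drops the NA results of the comparison
mismatch <- which(d$BNR.x != d$BNR.y)

print(no_data)   # 6
print(mismatch)  # 5
```

Because the same logical index is used on both sides of the assignment, each missing BNR.x value is filled from the BNR.y value in the same row.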

      



Here's a suggestion, where dat is the name of the data frame:

idx <- is.na(dat$BNR.x) # create logical index for NAs in BNR.x

dat$BNR.x[idx] <- dat$BNR.y[idx] # replace NAs with values from BNR.y

# Add a logical column:
dat <- transform(dat, warn = is.na(BNR.x) | (BNR.x != BNR.y & !is.na(BNR.y)))

      



Result:

  ID BNR.x BNR.y  warn
1  1   123    NA FALSE
2  2   234   234 FALSE
3  3   345   345 FALSE
4  4   456   456 FALSE
5  5   678   677  TRUE
6  6    NA    NA  TRUE
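
The question treats a mismatch (case 5) and missing data (case 6) as two different warnings, while a single logical column cannot tell them apart. One possible variation, sketched here under the assumption that a text label is acceptable, is a column with three values. The column name status and its labels are made up for illustration.

```r
# Example table from the question (columns assumed numeric)
dat <- data.frame(
  ID    = 1:6,
  BNR.x = c(123, 234, NA, 456, 678, NA),
  BNR.y = c(NA, 234, 345, 456, 677, NA)
)

# Fill missing BNR.x values from BNR.y, row by row
idx <- is.na(dat$BNR.x)
dat$BNR.x[idx] <- dat$BNR.y[idx]

# Start from "ok" and overwrite the two problem cases
dat$status <- "ok"
dat$status[which(dat$BNR.x != dat$BNR.y)] <- "mismatch"  # both present, different
dat$status[is.na(dat$BNR.x)] <- "no data"                # nothing in either column

print(dat)
```

Rows 1 to 4 end up as "ok", row 5 as "mismatch" and row 6 as "no data".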

      



From:

df
V1  V2  V3
1  1 123  NA
...

idx <- is.na(df$V2)                  # rows where V2 is missing
df$V2[idx] <- df$V3[idx]             # fill them from V3 in the same rows
df$warn <- 0
df$warn[is.na(df$V2)] <- 1           # still missing after the fill
df$warn[which(df$V2 != df$V3)] <- 1  # both present but different

      

OK, the versions above using transform are better, but I have to start somewhere :)

P.S. Am I wrong, or is it the case that

d$BNR.x[is.na(d$BNR.x)] <- d$BNR.y

      

won't work, because it would line up misaligned BNR.y values with the NAs in BNR.x?
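
A quick check on the question's data suggests the concern is justified. The sketch below compares the unindexed right-hand side with the indexed one; the vectors bad and good are hypothetical names used only for this comparison.

```r
# Example table from the question (columns assumed numeric)
d <- data.frame(
  ID    = 1:6,
  BNR.x = c(123, 234, NA, 456, 678, NA),
  BNR.y = c(NA, 234, 345, 456, 677, NA)
)

# Unindexed right-hand side: only two slots are NA (rows 3 and 6),
# so R uses the FIRST two BNR.y values (NA and 234) and warns
# about the length mismatch
bad <- d$BNR.x
suppressWarnings(bad[is.na(bad)] <- d$BNR.y)

# Indexed right-hand side: each NA is filled from its own row
good <- d$BNR.x
na_rows <- is.na(good)
good[na_rows] <- d$BNR.y[na_rows]

print(bad)   # 123 234  NA 456 678 234
print(good)  # 123 234 345 456 678  NA
```

With the unindexed version, row 3 stays NA and row 6 receives 234, a value that belongs to row 2.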
