How to check if the NA value is zero; replace if so, leave as NA if not
Edited because it did not include the full set of factors in the sample dataset that caused initial decisions to break.
I am trying to clear a dataset by determining whether to replace NA with 0 or keep it as NA.
Below is an example dataset. "Dom.Supply" must be the sum of the other factors. For example, the NA that appears on lines 3: 5 for "Feed", "Waste", "Processing" and "Other.Uses" can be replaced by 0 as the sum of factors with values (for example, "Food" and "Seed" ) is equal to "Dom.Supply". However, in lines 1 and 2, the BUT should remain, since the sum of "Food" and "Seed" does not equal "Dom.Supply".
Region Country Group Item Year Production Imports Stock.Var Exports Dom.Supply Feed Seed Waste Processing Other.Uses Food
NAm.Oceania Australia Cereals Rye 1961 11 0 0 2 9 NA 1 NA NA NA 7
NAm.Oceania Australia Cereals Rye 1962 10 0 0 3 7 NA 1 NA NA NA 5
NAm.Oceania Australia Cereals Rye 1963 10 0 0 1 9 NA 2 NA NA NA 7
NAm.Oceania Australia Cereals Rye 1964 14 0 -5 0 9 NA 2 NA NA NA 7
NAm.Oceania Australia Cereals Rye 1965 11 0 5 0 16 NA 2 NA NA NA 14
I had to use the function replace
as follows (as an example, but not for the operation I want to use), but this is a simplified replacement for NA with 0s, not the NA = 0 test.
data$AF2 <- 1-((replace(data$Feed, is.na(data$Feed), 0) + (replace(data$Seed,
is.na(data$Seed), 0)) / data$Dom.Supply))
Thank!
source to share
I would use this:
DF <- read.table(text = "Dom.Supply Feed Seed Waste Processing Other.Uses Food
9 NA 1 NA NA NA 7
7 NA 1 NA NA NA 5
9 NA 2 NA NA NA 7
9 NA 2 NA NA NA 7
16 NA 2 NA NA NA 14", header = TRUE)
ix <- rowSums(DF[, -1], na.rm = TRUE) == DF[, 1]
DF[ix,] <- lapply(DF[ix,], function(x) {
x[is.na(x)] <- 0
x
})
# Dom.Supply Feed Seed Waste Processing Other.Uses Food
#1 9 NA 1 NA NA NA 7
#2 7 NA 1 NA NA NA 5
#3 9 0 2 0 0 0 7
#4 9 0 2 0 0 0 7
#5 16 0 2 0 0 0 14
source to share
CathG's answer deserves to be a real answer, not just a comment (posting here as a community wiki):
DF <- read.table(text = "Dom.Supply Feed Seed Waste Processing Other.Uses Food
9 NA 1 NA NA NA 7
7 NA 1 NA NA NA 5
9 NA 2 NA NA NA 7
9 NA 2 NA NA NA 7
16 NA 2 NA NA NA 14", header = TRUE)
ix <- rowSums(DF[, -1], na.rm = TRUE) == DF[, 1]
DF[ix, ][is.na(DF[ix, ])] <- 0
DF
# Dom.Supply Feed Seed Waste Processing Other.Uses Food
#1 9 NA 1 NA NA NA 7
#2 7 NA 1 NA NA NA 5
#3 9 0 2 0 0 0 7
#4 9 0 2 0 0 0 7
#5 16 0 2 0 0 0 14
EDIT
If you have NAs
a column Dom.Supply
, ix
will have NA
for the corresponding row. You can change NA
to FALSE
(how you want to leave NAs
other columns like NAs
):
ix <- rowSums(DF[, -1], na.rm = TRUE) == DF[, 1]
ix[is.na(ix)] <- FALSE
DF[ix, ][is.na(DF[ix, ])] <- 0
Example, with the same original DF
:
DF[2, 1] <- NA
ix <- rowSums(DF[, -1], na.rm = TRUE) == DF[, 1]
ix[is.na(ix)] <- FALSE
DF[ix, ][is.na(DF[ix, ])] <- 0
DF
# Dom.Supply Feed Seed Waste Processing Other.Uses Food
#1 9 NA 1 NA NA NA 7
#2 NA NA 1 NA NA NA 5
#3 9 0 2 0 0 0 7
#4 9 0 2 0 0 0 7
#5 16 0 2 0 0 0 14
source to share