Counting matching values in columns with NA values in R

I am working in R and I have a matrix with the values "A", "B" and NA, and I would like to count the number of "A", "B" or NA values in each column.

sum(mydata[, i] == "A") and sum(mydata[, i] == "B") work fine for columns without NA. For columns containing NA, I can count the NAs with sum(is.na(mydata[, i])), but in those columns sum(mydata[, i] == "A") returns NA instead of a number. How can I count the number of "A" values in columns containing NA values?

Thank you for your help!

Example:

> mydata
    V1  V2  V3  V4 
V2 "A" "A" "A" "A"
V3 "A" "A" "A" "A"
V4 "B" "B" NA  NA 
V5 "A" "A" "A" "A"
V6 "B" "A" "A" "A"
V7 "B" "A" "A" "A"
V8 "A" "A" "A" "A"
> sum(mydata[,2]=="A")
[1] 6
> sum(mydata[,3]=="A")
[1] NA
> sum(is.na(mydata[,3]))
[1] 1

      

+3




6 answers


The function sum (like many other math functions in R) takes an argument na.rm. If you set na.rm=TRUE, R removes all NA values before performing the calculation.

Try:



sum(mydata[,3]=="A", na.rm=TRUE)
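The reason the plain call returns NA is that comparing NA with "A" yields NA rather than FALSE, and sum() propagates it. A minimal sketch, where the vector x is a hand-written stand-in for mydata[, 3] from the question (an assumption made so the snippet runs on its own):

```r
# Hypothetical stand-in for mydata[, 3] from the question
x <- c("A", "A", NA, "A", "A", "A", "A")

cmp <- x == "A"                          # the NA element compares to NA, not FALSE
plain <- sum(cmp)                        # NA, because one term of the sum is NA
fixed <- sum(cmp, na.rm = TRUE)          # the NA is dropped before summing

print(cmp)
print(plain)
print(fixed)
```

The same argument should also work column-wise, for example colSums(mydata == "A", na.rm = TRUE), if counts for every column are wanted in one call.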

      

+5




Not sure if this is what you want. I am new to R too, so check that it works. The call below gives the number of non-NA values in each column; the difference between the number of rows and that count tells you the number of NA elements.



colSums(!is.na(mydata))
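Applied to the matrix from the question, this could look roughly like the sketch below. The matrix is rebuilt by hand here so the snippet is self-contained, and the variable names non_na, n_na and direct are made up for illustration.

```r
# The example matrix from the question, rebuilt row by row
mydata <- matrix(c("A", "A", "A", "A",
                   "A", "A", "A", "A",
                   "B", "B", NA,  NA,
                   "A", "A", "A", "A",
                   "B", "A", "A", "A",
                   "B", "A", "A", "A",
                   "A", "A", "A", "A"), ncol = 4, byrow = TRUE)

non_na <- colSums(!is.na(mydata))   # non-missing values per column
n_na   <- nrow(mydata) - non_na     # missing values per column, by difference
direct <- colSums(is.na(mydata))    # the same NA counts, computed directly

print(non_na)
print(n_na)
print(direct)
```

Counting the NAs directly with colSums(is.na(mydata)) avoids the subtraction, and both routes should give the same result.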

      

+3




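To expand on the answer from @Andrie,

# Rebuild the 7 x 4 example matrix from the question
mydata <- matrix(c(rep("A", 8), rep("B", 2), rep(NA, 2), rep("A", 4),
  rep(c("B", "A", "A", "A"), 2), rep("A", 4)), ncol = 4, byrow = TRUE)

# Count "A", "B" and NA values in one vector
myFun <- function(x) {
  data.frame(n.A = sum(x == "A", na.rm = TRUE), n.B = sum(x == "B",
    na.rm = TRUE), n.NA = sum(is.na(x)))
}

# Apply to every column; the result is a list with one data frame per column
apply(mydata, 2, myFun)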

      

0




Another possibility is to convert the column to a factor and then use the summary function. Example:

vec <- c("A", "B", "A", NA)

summary(as.factor(vec))

0




A quick way to do this is to compute summary statistics for a variable:

summary(mydata$my_variable)
table(mydata$my_variable)

This will give you the number of missing values.

Hope it helps

0




You can use table to count all your values at once.
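Note that table() drops NA values by default, so the useNA argument is needed if NA should be counted alongside "A" and "B". A minimal sketch, assuming the example matrix from the question; the name counts and the fixed factor levels are choices made here for illustration, not part of the original answer:

```r
# The example matrix from the question
mydata <- matrix(c(rep("A", 8), rep("B", 2), rep(NA, 2), rep("A", 4),
  rep(c("B", "A", "A", "A"), 2), rep("A", 4)), ncol = 4, byrow = TRUE)

# Single column: NA is reported as its own category when present
print(table(mydata[, 3], useNA = "ifany"))

# All columns at once: fixed levels plus useNA = "always" give every column
# the same three entries (A, B, NA), so apply() can return a 3 x 4 matrix
counts <- apply(mydata, 2, function(x) {
  table(factor(x, levels = c("A", "B")), useNA = "always")
})
print(counts)
```

Fixing the factor levels keeps the rows aligned even for columns that contain no "B" or no NA at all.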

-1








