Column Means for Non-Zero Data
You can change 0 to NA and then use colMeans, as it has an na.rm=TRUE option. In a two-step process, we convert the "0" entries to NA and then compute colMeans while excluding the NA elements.
is.na(data) <- data==0
colMeans(data, na.rm=TRUE)
# col1 col2
# 2 6
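The original data frame is not shown, so here is a minimal, self-contained sketch of the two-step approach. The data values and the names data_na and res_two_step are hypothetical, chosen only so that the result matches the output above. It works on a copy, since the is.na<- assignment overwrites the zeros in place.

```r
# Hypothetical example data (the original data frame is not shown in the answer)
data <- data.frame(col1 = c(1, 0, 3, 2), col2 = c(0, 6, 0, 6))

# Work on a copy so the zeros in `data` are preserved
data_na <- data
is.na(data_na) <- data_na == 0   # step 1: mark every zero as NA

# step 2: column means, skipping the NA entries
res_two_step <- colMeans(data_na, na.rm = TRUE)
print(res_two_step)
```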
If you need it in one step, we can change the logical matrix (data==0) to NA and 1 by computing NA^(data==0): entries corresponding to 0 become NA and entries corresponding to non-zero elements become 1. Multiplying by the original data then turns each 1 into the element at that position, while each NA remains NA. We can apply colMeans to this output as above.
colMeans(NA^(data==0)*data, na.rm=TRUE)
# col1 col2
# 2 6
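To make the NA^ trick easier to follow, the sketch below splits the one-liner into its intermediate pieces. It relies on NA^1 being NA and NA^0 being 1 in R. The data values and the names mask, masked and res_one_step are hypothetical and serve only as illustration.

```r
# Hypothetical example data (the original data frame is not shown in the answer)
data <- data.frame(col1 = c(1, 0, 3, 2), col2 = c(0, 6, 0, 6))

# TRUE counts as 1 and FALSE as 0, so NA^TRUE is NA and NA^FALSE is 1
mask <- NA^(data == 0)

# 1 * value keeps the value, NA * value stays NA
masked <- mask * data

res_one_step <- colMeans(masked, na.rm = TRUE)
print(mask)
print(res_one_step)
```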
Another option is to use sapply/vapply. If the dataset is really large, converting to a matrix may not be a good idea, as it can cause memory problems. By looping through the columns with sapply, or the more specific vapply (which will be a little faster), we get the mean of the non-zero elements.
vapply(data, function(x) mean(x[x!=0]), numeric(1))
# col1 col2
# 2 6
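One edge case worth keeping in mind: if a column contains only zeros, the subset x[x != 0] is empty and mean() returns NaN. The sketch below illustrates this with hypothetical data. The column col3 and the helper name mean_nonzero are assumptions added for the example and are not part of the original answer.

```r
# Hypothetical data; col3 is all zeros to show the edge case
data <- data.frame(col1 = c(1, 0, 3, 2),
                   col2 = c(0, 6, 0, 6),
                   col3 = c(0, 0, 0, 0))

# Hypothetical helper: mean of the non-zero elements of one column
mean_nonzero <- function(x) mean(x[x != 0])

# vapply checks that every result is a single numeric value
res_vapply <- vapply(data, mean_nonzero, numeric(1))
print(res_vapply)
```

If NaN is not wanted for such columns, it can be replaced afterwards, for example with res_vapply[is.nan(res_vapply)] <- NA.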
Or we can use summarise_each and specify the function inside funs, after subsetting the non-zero elements.
library(dplyr)
summarise_each(data, funs(mean(.[.!=0])))
# col1 col2
#1 2 6
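In newer dplyr versions, summarise_each and funs are deprecated. A sketch of the same idea using summarise with across, assuming dplyr 1.0.0 or later is installed, could look like this. The data values and the name res_across are hypothetical.

```r
library(dplyr)

# Hypothetical example data (the original data frame is not shown in the answer)
data <- data.frame(col1 = c(1, 0, 3, 2), col2 = c(0, 6, 0, 6))

# Apply the same "mean of non-zero elements" function to every column
res_across <- summarise(data, across(everything(), ~ mean(.x[.x != 0])))
print(res_across)
```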