Aggregate NA in R

I am having problems handling NAs when calculating aggregated means. See the following code:

tab=data.frame(a=c(1:3,1:3), b=c(1,2,NA,3,NA,NA))
tab
  a  b
1 1  1
2 2  2
3 3 NA
4 1  3
5 2 NA
6 3 NA

attach(tab)
aggregate(b, by=list(a), data=tab, FUN=mean, na.rm=TRUE)
  Group.1   x
1       1   2
2       2   2
3       3 NaN

      

I want NA instead of NaN if the vector is all NA, i.e. I want the result to be:

  Group.1   x
1       1   2
2       2   2
3       3  NA

      

I tried to use a custom function:

adjmean=function(x) {if(all(is.na(x))) NA else mean(x,na.rm=TRUE)}

      

However, I am getting the following error:

aggregate(b, by=list(a), data=tab, FUN=adjmean)

Error in FUN(X[[1L]], ...) : 
  unused argument (data = list(a = c(1, 2, 3, 1, 2, 3), b = c(1, 2, NA, 3, NA, NA)))

      

Long story short, if the column is all NAs, I want NA to be the output instead of NaN. If it has only a few NAs, the mean should be computed with the NAs ignored.

Any help would be appreciated.

Thanks.



2 answers


This is very close to what you had, but it replaces mean(x, na.rm=TRUE) with a custom function that either calculates the mean of the non-NA values or returns NA itself:

R> with(tab, 
        aggregate(b, by=list(a), FUN=function(x) 
             if (any(is.finite(z<-na.omit(x)))) mean(z) else NA))
  Group.1  x
1       1  2
2       2  2
3       3 NA
R> 

      



It's really a one-liner, but I split it across lines to fit the SO display.

You already had a similar idea, but I changed the function a bit more so that it returns appropriate values in all cases.
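
For context on where the NaN comes from: with na.rm=TRUE, an all-NA group is reduced to an empty vector, and the mean of an empty numeric vector is NaN. The sketch below illustrates this in base R. The name safe_mean is hypothetical and only gives the one-liner above a name; NA_real_ is used instead of NA so the result stays numeric.

```r
# Rebuild the example data from the question
tab <- data.frame(a = c(1:3, 1:3), b = c(1, 2, NA, 3, NA, NA))

# Removing the NAs from an all-NA vector leaves nothing to average
mean(c(NA, NA), na.rm = TRUE)   # NaN
mean(numeric(0))                # NaN

# Hypothetical helper with the same logic as the one-liner above:
# drop NA/NaN, then average only if at least one finite value remains.
safe_mean <- function(x) {
  z <- x[!is.na(x)]
  if (any(is.finite(z))) mean(z) else NA_real_
}

# with() avoids having to attach(tab)
res <- with(tab, aggregate(b, by = list(a), FUN = safe_mean))
res
```

Group 3 contains only NA values, so safe_mean returns NA for it rather than NaN.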



There is nothing wrong with your function. What is wrong is that you are passing an argument, data, that does not exist in the default method for aggregate:

adjmean = function(x) {if(all(is.na(x))) NA else mean(x,na.rm=TRUE)}
attach(tab)  ## Just because you did it. I don't recommend this.

## Your error
aggregate(b, by=list(a), data=tab, FUN=adjmean)
# Error in FUN(X[[i]], ...) : 
#   unused argument (data = list(a = c(1, 2, 3, 1, 2, 3), b = c(1, 2, NA, 3, NA, NA)))

## Dropping the "data" argument
aggregate(b, list(a), FUN = adjmean)
#   Group.1  x
# 1       1  2
# 2       2  2
# 3       3 NA
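
The mechanism behind the error can be sketched as follows. Because the default method has no data argument, the unrecognised argument is forwarded to FUN through `...`. mean() tolerates this because its signature includes `...`, which is why the original call with FUN=mean ran without complaint, whereas adjmean(x) accepts nothing but x. This is an illustrative sketch only; with() is used here in place of attach().

```r
tab <- data.frame(a = c(1:3, 1:3), b = c(1, 2, NA, 3, NA, NA))
adjmean <- function(x) if (all(is.na(x))) NA else mean(x, na.rm = TRUE)

# mean() has `...` in its signature, so a stray `data` is silently ignored
m <- mean(c(1, 3), data = tab)
m

# adjmean() takes only `x`, so the forwarded `data` triggers the error;
# tryCatch() captures the message instead of stopping the script
err <- tryCatch(
  with(tab, aggregate(b, by = list(a), data = tab, FUN = adjmean)),
  error = function(e) conditionMessage(e)
)
err

# Without `data`, the same call succeeds
ok <- with(tab, aggregate(b, by = list(a), FUN = adjmean))
ok
```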

      




If you want to use the data argument, you must use the formula method for aggregate. However, this method treats NA differently: its default na.action is na.omit, which drops every row containing an NA before aggregating. You therefore need the additional argument na.action.

Example:

detach(tab) ## I don't like having things attached
aggregate(b ~ a, data = tab, adjmean)
#   a b
# 1 1 2
# 2 2 2
aggregate(b ~ a, data = tab, adjmean, na.action = na.pass)
#   a  b
# 1 1  2
# 2 2  2
# 3 3 NA
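
To see why group 3 disappears from the first result, it can help to look at what the default na.action leaves behind. The sketch below applies na.omit to the data frame directly, which removes rows before any grouping happens, and then repeats the na.pass call with the function written inline.

```r
tab <- data.frame(a = c(1:3, 1:3), b = c(1, 2, NA, 3, NA, NA))

# Rows that survive na.omit: both rows with a == 3 are dropped,
# so no group 3 is left to aggregate
kept <- na.omit(tab)
kept

# With na.pass all six rows reach FUN, so FUN must handle the NAs itself
res <- aggregate(b ~ a, data = tab,
                 FUN = function(x) if (all(is.na(x))) NA else mean(x, na.rm = TRUE),
                 na.action = na.pass)
res
```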

      
