Removing NA when multiplying columns

This is a very simple question, but I hope someone can help me avoid unnecessary lines of code. I have a simple data frame:

Df.1 <- data.frame(A = c(5,4,7,6,8,4), B = c(1,5,2,4,9,1), C = c(2,3,NA,5,NA,9))

      

What I want to do is create an extra column which is the multiplication of A, B and C, which will then be bound to the original dataframe.

So, I would normally use:

attach(Df.1)
D<-A*B*C

      

But obviously, where there is an NA in column C, I get NA in variable D. I don't want to exclude all NA rows, just ignore the NA values in that column (so the value in D would be the product of A and B or, where C is available, A * B * C).

I know I could just replace the NAs with 1s so the computation stays the same, or use if statements, but I was wondering what the easiest way of doing this is?

Any ideas?



2 answers


You can use prod, which has an na.rm argument. To do this row by row, use apply:



apply(Df.1,1,prod,na.rm=TRUE)
[1]  10  60  14 120  72  36
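Note that apply over the whole data frame multiplies every column, so it would also pick up any column added later (including D itself if the line is run twice). A minimal sketch that restricts the product to named columns; the `cols` vector is an illustrative name, not part of the original answer:

```r
Df.1 <- data.frame(A = c(5, 4, 7, 6, 8, 4),
                   B = c(1, 5, 2, 4, 9, 1),
                   C = c(2, 3, NA, 5, NA, 9))

# Columns to multiply (hypothetical helper variable for this sketch)
cols <- c("A", "B", "C")

# Row-wise product over just those columns, dropping NA values
D <- apply(Df.1[, cols], 1, prod, na.rm = TRUE)
D
```

One thing to keep in mind: with na.rm = TRUE, a row in which every selected value is NA gives 1, because the product of an empty set is 1.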

      



As @James said, prod and apply will work, but you don't need to waste memory storing the result in a separate variable, or even bind it:

Df.1$D = apply(Df.1, 1, prod, na.rm=T)

      



Assigning a new variable in the dataframe will work.

> Df.1 <- data.frame(A = c(5,4,7,6,8,4),B = (c(1,5,2,4,9,1)),C=(c(2,3,NA,5,NA,9)))
> Df.1
  A B  C
1 5 1  2
2 4 5  3
3 7 2 NA
4 6 4  5
5 8 9 NA
6 4 1  9
> Df.1$D = apply(Df.1, 1, prod, na.rm=T)
> Df.1$D
[1]  10  60  14 120  72  36
> Df.1
  A B  C   D
1 5 1  2  10
2 4 5  3  60
3 7 2 NA  14
4 6 4  5 120
5 8 9 NA  72
6 4 1  9  36
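For comparison, the question mentions replacing NA with 1 as one option. A rough sketch of that idea, done on the fly with ifelse so that column C itself is left untouched; this is an alternative to the apply approach above, not what either answer proposes:

```r
Df.1 <- data.frame(A = c(5, 4, 7, 6, 8, 4),
                   B = c(1, 5, 2, 4, 9, 1),
                   C = c(2, 3, NA, 5, NA, 9))

# Treat a missing C as 1 (the multiplicative identity) inside the expression;
# with() avoids attach() and evaluates A, B and C within the data frame
Df.1$D <- with(Df.1, A * B * ifelse(is.na(C), 1, C))
Df.1$D
```

This is vectorised over whole columns, so it avoids the data-frame-to-matrix conversion that apply performs, but it only handles NA in C; NA values in A or B would still propagate.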

      
