How can I remove columns that contain NA or have variance equal to 0?
I want to scale my data before doing PCA, but I found that some columns contain NA and some columns have a variance of 0. I want to delete those columns. This is an example of my data:
df <- data.frame(v1 = 1:10, v2 = rep(0, 10), v3 = sample(c(1:3, NA), 10, replace = TRUE), v4 = 1:10)
I want to delete columns v2 and v3 at the same time. How can I implement this?
I know how to delete the columns containing NA first and then delete the columns whose variance is 0:
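# sd() returns NA for any column that contains NA, so drop those first
colsd <- apply(df, 2, sd)
df2 <- df[!is.na(colsd)]
# then drop the columns whose standard deviation is 0
colsd2 <- apply(df2, 2, sd)
df3 <- df2[colsd2 != 0]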
But this looks redundant. Is there a more efficient way to do it, perhaps in just one line? Thanks for any answer.
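For reference, here is a minimal sketch of what a single-pass version might look like, assuming every column is numeric. The names `keep` and `df_clean` are made up for illustration. Each column is tested once for both conditions, and because `&&` short-circuits, `var()` is never called on a column that contains NA.

```r
# Example data from the question (v3 is random, so it may or may not contain NA)
df <- data.frame(v1 = 1:10, v2 = rep(0, 10),
                 v3 = sample(c(1:3, NA), 10, replace = TRUE), v4 = 1:10)

# keep (hypothetical name): TRUE for columns with no NA and non-zero variance
keep <- vapply(df, function(x) !anyNA(x) && var(x) != 0, logical(1))

# drop = FALSE keeps the result a data frame even if only one column survives
df_clean <- df[, keep, drop = FALSE]
```

The two steps can be folded into one line by putting the `vapply()` call directly inside the brackets. For floating-point columns that are only nearly constant, comparing the variance against a small tolerance instead of exactly 0 may be safer.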