Remove columns with standard deviation of zero

I want to remove all columns with a standard deviation of zero from a data.frame.

This does not work:

  df <- df[, ! apply(df , 2 , function(x) sd(x)==0 ) ]

      

I get an error:

undefined columns selected

UPDATE

I chose Filter as my preferred answer, as it also handles NAs, which is very helpful.

For example, in

df <- data.frame(v1=c(0,0,NA,0,0), v2=1:5)

      

column "v1" is removed with Filter, while the apply-based methods generate errors.
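
The difference seems to come from how each approach treats an NA result. A minimal sketch, reusing the example data frame above (the names keep, apply_result and filter_result are only illustrative): sd() returns NA for v1, so the logical index built with apply contains an NA, which column indexing rejects, whereas Filter keeps only the elements whose predicate is TRUE.

```r
df <- data.frame(v1 = c(0, 0, NA, 0, 0), v2 = 1:5)

# sd() propagates the NA, so the comparison for v1 is NA rather than TRUE/FALSE
keep <- !apply(df, 2, function(x) sd(x) == 0)

# Indexing columns with an NA raises an error; capture its message for inspection
apply_result <- tryCatch(df[, keep], error = function(e) conditionMessage(e))

# Filter() keeps only elements whose predicate is TRUE, so an NA means "drop"
filter_result <- Filter(function(x) sd(x) != 0, df)
```

Here apply_result holds the error message instead of a data frame, and filter_result contains only v2.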

I also learned a lot from all the other solutions.

UPDATE2:

The errors specific to the apply approach can be fixed by adding na.rm = TRUE to the sd call, as follows:

df[, ! apply(df , 2 , function(x) sd(x, na.rm = TRUE)==0 ) ]

      



3 answers


Use Filter:



Filter(function(x) sd(x) != 0, df)

      



In addition to @grrgrrbla's and @akrun's answers using Filter, here's the correct way to do what you originally had in mind:

df <- df[, !sapply(df, function(x) { sd(x) == 0} )]

      

Or



df <- df[, sapply(df, function(x) { sd(x) != 0} )]

      

I used sapply() to get a logical vector that is TRUE when the data frame column has a standard deviation of 0 and FALSE otherwise (or the reverse in the second version). Then I subset the columns of the original data frame with this vector.
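
To make the intermediate step visible, here is a small sketch using the sample data posted in the last answer. The name is_constant is just illustrative, and drop = FALSE is my own addition (not part of the answer above); it keeps the result a data frame even if only one column survives.

```r
df <- data.frame(V1 = c(1, 4, 2, 5), V2 = c(2, 2, 2, 2),
                 V3 = c(3, 4, 3, 3), V4 = c(1, 2, 3, 3))

# Named logical vector: TRUE where the column has a standard deviation of 0
is_constant <- sapply(df, function(x) sd(x) == 0)

# Negate the vector to select the non-constant columns;
# drop = FALSE (an assumption added here) avoids collapsing to a plain vector
result <- df[, !is_constant, drop = FALSE]
```

Only V2 is flagged as constant, so result keeps V1, V3 and V4.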



You can just use Filter without an anonymous function, since an sd of 0 is coerced to FALSE and everything else to TRUE, so Filter returns only the columns that are TRUE, i.e. where sd != 0.

 Filter(sd, df)
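
A short sketch of that coercion, using the sample data at the end of this answer (sds and as_flags are illustrative names): the numeric standard deviations are turned into logicals, with 0 becoming FALSE and any non-zero value TRUE.

```r
df <- data.frame(V1 = c(1, 4, 2, 5), V2 = c(2, 2, 2, 2),
                 V3 = c(3, 4, 3, 3), V4 = c(1, 2, 3, 3))

# Per-column standard deviations; V2 is constant, so its sd is 0
sds <- sapply(df, sd)

# The coercion applied to the predicate's result: 0 -> FALSE, non-zero -> TRUE
as_flags <- as.logical(sds)

# Passing sd directly as the predicate therefore drops the constant column
result <- Filter(sd, df)
```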

      

Or, if there are columns of mixed classes, length(unique(...)) might be more general.

 df[vapply(df, function(x) length(unique(na.omit(x)))>1, logical(1L))]
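
As a rough illustration of the mixed-class case, here is a sketch with a made-up data frame (mixed and its column names are hypothetical, not from the question). Counting distinct non-NA values works the same way for character and numeric columns.

```r
# Hypothetical data with numeric and character columns
mixed <- data.frame(
  id    = 1:4,
  group = c("a", "a", "a", "a"),   # constant character column
  label = c("x", "y", NA, "x"),    # varies, with one NA
  value = c(2, 2, NA, 2),          # constant apart from the NA
  stringsAsFactors = FALSE
)

# TRUE for columns with more than one distinct non-NA value
keep <- vapply(mixed, function(x) length(unique(na.omit(x))) > 1, logical(1L))

result <- mixed[keep]
```

With this data, group and value are dropped, and id and label are kept.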

      


Or we can use tidyverse

library(tidyverse)
library(magrittr)
df %>% 
   map_lgl(~sd(.) !=0) %>%
   extract(df, .)  

      

data

 df <- structure(list(V1 = c(1, 4, 2, 5), V2 = c(2, 2, 2, 2), V3 = c(3, 
  4, 3, 3), V4 = c(1, 2, 3, 3)), .Names = c("V1", "V2", "V3", "V4"
  ), row.names = c(NA, -4L), class = "data.frame")

      
